Journal of Jilin University (Earth Science Edition) ›› 2019, Vol. 49 ›› Issue (6): 1805-1814. doi: 10.13278/j.cnki.jjuese.20180346

• Geo-exploration and Information Technology •

Data Preprocessing Method of IoT Observation System in Coastal Wetland

Huang Gaixian, Tian Bo, Zhou Yunxuan, Yuan Qing

  1. State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai 200241, China
  • Received: 2018-12-27 Published: 2019-11-30
  • Corresponding author: Tian Bo (1972-), male, associate researcher, mainly engaged in remote sensing of coastal wetlands. E-mail: btian@sklec.ecnu.edu.cn
  • About the author: Huang Gaixian (1994-), male, master's student, mainly engaged in research on the coastal wetland ecological IoT. E-mail: 51163904010@stu.ecnu.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2016YFC0502704) and Scientific Research Project of Shanghai Science and Technology Commission (17DZ1201902, 18DZ1204802)

Abstract: Effective data preprocessing is essential for a continuous online coastal wetland ecological Internet of Things (IoT) observation system. Because of the limitations of sensor technology and environmental interference, such a system produces abnormal observations that hinder use of the data. Taking the ecological observation data from the Chongming Dongtan wetland in Shanghai, a wetland of international importance, as the study object, we divided the outliers into three types: abnormal values, abnormal fluctuations, and abnormal events. Drawing on the relationships among multiple environmental indicators, we built a preprocessing method for coastal wetland ecological observations based on a regression-residual probability-distribution outlier detection algorithm, a look-up table, and a multi-indicator time series model. Compared with traditional methods, this method maintains outlier detection accuracy while better distinguishing abnormal events from sensor faults, which reduces false positives. In an analysis of more than 50 000 records across nine indicators, thresholds of 10^-8 to 10^-20 detected abnormal values and abnormal fluctuations in 0.18%-8.12% of the data, along with two abnormal events. The preprocessing results indicate that the sensors' observation principle and the observation season affect sensor stability, and that human activity is the main cause of abnormal events in the observation area.

Key words: coastal wetlands, ecological internet of things, data preprocessing, multi-indicator time series model
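
For illustration only, the Python sketch below shows one way a residual-probability outlier test of the kind named in the abstract might work. The neighbourhood-median predictor, the robust normal fit to the residuals, the window size, and the function name `flag_outliers` are all assumptions made for this example and are not taken from the paper. Only the order of magnitude of the probability threshold (10^-8) comes from the abstract.

```python
import math
import statistics


def flag_outliers(values, window=5, threshold=1e-8):
    """Flag points whose residual is improbable under a normal fit.

    Hypothetical sketch: the predictor, the robust scale estimate, and the
    default window are assumptions, not the paper's settings.
    Requires at least two values.
    """
    n = len(values)
    half = window // 2
    residuals = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        # Predict each point from the median of its neighbours (point excluded).
        neighbours = values[lo:i] + values[i + 1:hi]
        residuals.append(values[i] - statistics.median(neighbours))

    # Robust centre and spread, so the outliers do not inflate the scale.
    centre = statistics.median(residuals)
    mad = statistics.median([abs(r - centre) for r in residuals])
    sigma = 1.4826 * mad  # MAD scaled to match a normal standard deviation

    if sigma == 0:
        # Degenerate case: no spread, so any deviation counts as abnormal.
        return [r != centre for r in residuals]

    flags = []
    for r in residuals:
        z = abs(r - centre) / sigma
        # Two-sided tail probability of the residual under the normal fit.
        p = math.erfc(z / math.sqrt(2))
        flags.append(p < threshold)
    return flags


if __name__ == "__main__":
    # Synthetic series with one injected spike (not real observation data).
    series = [10.0 + 0.1 * math.sin(i) for i in range(50)]
    series[25] = 25.0
    flagged = [i for i, f in enumerate(flag_outliers(series)) if f]
    print(flagged)
```

A test of this kind addresses only single-indicator value anomalies. Separating sensor faults from genuine abnormal events, as the abstract describes, would additionally require comparing several indicators over the same time span.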

CLC number: P951