Journal of Jilin University (Engineering and Technology Edition) ›› 2022, Vol. 52 ›› Issue (7): 1719-1732. doi: 10.13229/j.cnki.jdxbgxb20210138

• Agricultural Engineering · Bionic Engineering •

Tea plantation remote sensing extraction based on random forest feature selection

Bin WANG1,2, Bing-hui HE1, Na LIN3, Wei WANG3, Tian-yang LI1

  1. College of Resources and Environment, Southwest University, Chongqing 400715, China
  2. Chongqing Geomatics and Remote Sensing Center, Chongqing 401147, China
  3. School of Civil Engineering, Chongqing Jiaotong University, Chongqing 400074, China
  • Received: 2021-02-22 Online: 2022-07-01 Published: 2022-08-08
  • Contact: Bing-hui HE E-mail: 79603730@qq.com; hebinghui@swu.edu.cn
  • About the author: Bin WANG (1979-), male, professor-level senior engineer, PhD. Research interests: 3S technology and its applications in agriculture and ecology. E-mail: 79603730@qq.com
  • Funding: National Natural Science Foundation of China (41771312); Science and Technology Project of the Chongqing Municipal Education Commission (KJQN201800747)

Abstract:

Extracting tea plantations from satellite imagery is challenging because they are spatially scattered, irregular in shape, and spectrally similar to the surrounding vegetation. To address this problem, a technical route for tea plantation extraction based on random forest feature selection and Landsat-8 OLI imagery is proposed. Anji County in Zhejiang Province was selected as the study area, and spring, autumn and winter Landsat-8 OLI images were used as the main data source. Random forest was used to evaluate and rank the importance of the image features and to select among them. Single-season initial feature sets, single-season optimal feature sets and multi-seasonal optimal feature sets were designed, and nine groups of tea plantation extraction experiments were carried out. The results show that the multi-seasonal optimal feature set, which combines the advantages of feature selection and multi-seasonal information, performs best, with a producer's accuracy (PA) of 87.5%, an overall accuracy (OA) of 92.4% and a Kappa coefficient of 0.897. The proposed technical route improves classification accuracy while reducing feature dimensionality, and it effectively extracts tea plantations that are spatially scattered and irregular in shape.

Key words: agricultural engineering, tea plantations, random forest, feature selection, remote sensing, extraction

CLC number: F307.1

Fig. 1 Geographic location of Anji County

Fig. 2 Overall technical workflow

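Table 1 Feature set design

| Feature set | Sub-feature set | Features included | Number of features |
|---|---|---|---|
| Single-season initial | (1) Spectral feature set | Spectral bands (bands 2-7) | 6 |
| Single-season initial | (2) GLCM texture feature set | 6 spectral bands × 8 GLCM texture features + 6 spectral bands | 54 |
| Single-season initial | (3) Variogram texture feature set | 6 spectral bands × 4 variogram texture features + 6 spectral bands | 30 |
| Single-season initial | (4) Pseudo-cross variogram texture feature set | 15 band pairs × 4 pseudo-cross variogram texture features + 6 spectral bands | 66 |
| Single-season initial | (5) Vegetation index feature set | 3 vegetation index features + 6 spectral features | 9 |
| Single-season optimal | (1) GLCM texture optimal feature set | Top 1/4 of the GLCM texture feature set | 13 |
| Single-season optimal | (2) Variogram texture optimal feature set | Top 1/4 of the variogram texture feature set | 7 |
| Single-season optimal | (3) Pseudo-cross variogram texture optimal feature set | Top 1/4 of the pseudo-cross variogram texture feature set | 16 |
| Multi-seasonal optimal | (1) Multi-seasonal spectral feature set | Spectral features of the 3 seasons | 18 |
| Multi-seasonal optimal | (2) Multi-seasonal comprehensive feature set | Comprehensive features of the 3 seasons (9×9 window) | 423 |
| Multi-seasonal optimal | (3) Multi-seasonal optimal feature set | Top 1/5 of the multi-seasonal comprehensive feature set | 85 |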
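The optimal feature sets in Table 1 are built by ranking features by random forest importance and keeping only the top-ranked ones. The ranking-and-truncation step can be sketched as below. This is a minimal illustration, assuming the importance scores have already been produced by a random forest; the function name `select_top_features`, the feature names and the scores are all hypothetical and do not come from the paper.

```python
def select_top_features(importances, n_keep):
    """Return the names of the n_keep features with the highest importance.

    importances: dict mapping feature name -> importance score
                 (assumed to come from a trained random forest).
    n_keep:      number of features to retain.
    """
    if not 0 < n_keep <= len(importances):
        raise ValueError("n_keep must be between 1 and the number of features")
    # Sort by descending importance; ties are broken by name so the result is deterministic.
    ranked = sorted(importances.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n_keep]]


# Hypothetical importance scores for eight features (illustrative values only).
example_importances = {
    "band5": 0.21,
    "band4_glcm_mean": 0.18,
    "ndvi": 0.16,
    "band6": 0.12,
    "band3_variogram": 0.09,
    "band2": 0.08,
    "band7_glcm_entropy": 0.07,
    "band2_glcm_contrast": 0.05,
}

# Keep the top quarter of the eight example features.
selected = select_top_features(example_importances, n_keep=len(example_importances) // 4)
print(selected)
```

The number of retained features is passed in explicitly because the page does not state how fractional counts such as 54/4 were rounded.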

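Fig. 3 Statistics of the initial features and the selected features for the three seasons

Fig. 4 Texture comparison of tea plantations with forest land and nurseries

Fig. 5 Reflectance spectral curves of tea plantations

Fig. 6 Interpretation keys for tea plantations on OLI imagery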

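Table 2 Texture measures

| Texture category | Measure | Formula | Description |
|---|---|---|---|
| GLCM | Mean | $\mathrm{Mean}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)\times i$ | Regularity of the texture |
| GLCM | Variance | $\mathrm{Variance}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)\times(i-\mathrm{Mean})^2$ | Deviation of pixel gray values from the mean |
| GLCM | Contrast | $\mathrm{Contrast}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)\times(i-j)^2$ | Texture clarity |
| GLCM | Homogeneity | $\mathrm{Homogeneity}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)\times\dfrac{1}{1+(i-j)^2}$ | Texture uniformity |
| GLCM | Dissimilarity | $\mathrm{Dissimilarity}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)\times\lvert i-j\rvert$ | Texture contrast |
| GLCM | Angular second moment | $\mathrm{ASM}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)^2$ | Uniformity of the gray-level distribution |
| GLCM | Entropy | $\mathrm{Entropy}=-\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k} p(i,j)\times\ln p(i,j)$ | Statistic of texture complexity |
| GLCM | Correlation | $\mathrm{Correlation}=\sum_{i=0}^{\mathrm{quant}_k}\sum_{j=0}^{\mathrm{quant}_k}\dfrac{(i-\mathrm{Mean})\times(j-\mathrm{Mean})\times p(i,j)^2}{\mathrm{Variance}}$ | Statistic of the linear relationship and directionality of texture gray levels |
| Geostatistics | Variogram | $\gamma_k(h)=\dfrac{1}{2n(h)}\sum_{i=1}^{n(h)}\{dn_k(x_i)-dn_k(x_i+h)\}^2$ | Statistic of local variance and correlation in the image |
| Geostatistics | Pseudo-cross variogram | $\gamma_{jk}(h)=\dfrac{1}{2n(h)}\sum_{i=1}^{n(h)}\{dn_j(x_i)-dn_k(x_i+h)\}^2$ | Joint variogram between two bands |

Note (GLCM): $i$ and $j$ are the row and column coordinates of the pixel in the image, $p(i,j)$ is the gray-level joint probability matrix, and $\mathrm{quant}_k$ is the order of the gray-level co-occurrence matrix.
Note (geostatistics): $dn$ is the gray value of pixel $x$, $n(h)$ is the number of pixel pairs separated by distance $h$, and $j$, $k$ are band indices.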
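The texture measures in Table 2 can be illustrated with a short sketch. It assumes the GLCM has already been normalised so that its entries sum to 1, and it evaluates the variogram along a single one-dimensional transect rather than inside the moving image window used in the study. The function names `glcm_measures` and `pseudo_cross_variogram` are hypothetical. Homogeneity uses the usual inverse-difference form with (i - j)^2, and correlation is left out of the sketch.

```python
import math


def glcm_measures(p):
    """Texture measures from a normalised GLCM p (square list of lists summing to 1)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(v * i for i, j, v in cells)
    return {
        "mean": mean,
        "variance": sum(v * (i - mean) ** 2 for i, j, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "asm": sum(v ** 2 for i, j, v in cells),
        # Cells with zero probability contribute nothing to the entropy.
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
    }


def pseudo_cross_variogram(dn_j, dn_k, h):
    """gamma_jk(h) along a 1-D transect of gray values.

    Passing the same band for dn_j and dn_k gives the ordinary variogram gamma_k(h).
    """
    if len(dn_j) != len(dn_k) or not 0 < h < len(dn_j):
        raise ValueError("bands must have equal length and 0 < h < length")
    # Pair each pixel in band j with the pixel h steps further along in band k.
    pairs = [(dn_j[i], dn_k[i + h]) for i in range(len(dn_j) - h)]
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


# Toy 2x2 GLCM in which only identical gray levels co-occur.
measures = glcm_measures([[0.5, 0.0], [0.0, 0.5]])
print(measures["contrast"], measures["asm"])

# Toy transect: variogram of one band at lag 1.
band = [1, 3, 5, 7]
print(pseudo_cross_variogram(band, band, 1))
```

For the toy GLCM all co-occurrences lie on the diagonal, so contrast and dissimilarity are zero and homogeneity is 1, which matches the reading of these measures as indicators of local texture variation.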

Fig. 7 Experiment on setting the number of trees in the random forest

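Table 3 Number of training and validation sample pixels

| Land cover type | Training samples (pixels) | Validation samples (pixels) |
|---|---|---|
| Tea plantation | 6230 | 2259 |
| Water | 573 | 942 |
| Forest | 10747 | 3095 |
| Built-up land | 1032 | 683 |
| Bare soil | 689 | 348 |
| Cropland | 1281 | 658 |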

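Table 4 Classification accuracy of each feature set

| Season | Feature set | Initial PA/% | Initial OM/% | Initial OA/% | Initial Kappa | Optimal PA/% | Optimal OM/% | Optimal OA/% | Optimal Kappa |
|---|---|---|---|---|---|---|---|---|---|
| Winter | Spectral feature set | 71.3 | 28.7 | 89.5 | 0.857 | - | - | - | - |
| Winter | GLCM texture feature set | 71.4 | 28.6 | 87.6 | 0.830 | 75.7 | 24.3 | 89.6 | 0.858 |
| Winter | Variogram texture feature set | 69.2 | 30.8 | 85.9 | 0.807 | 70.7 | 29.3 | 86.7 | 0.819 |
| Winter | Pseudo-cross variogram texture feature set | 65.2 | 34.8 | 87.0 | 0.823 | 73.5 | 26.5 | 89.3 | 0.855 |
| Winter | Vegetation index feature set | 72.1 | 27.9 | 89.7 | 0.860 | - | - | - | - |
| Autumn | Spectral feature set | 67.6 | 32.4 | 86.2 | 0.812 | - | - | - | - |
| Autumn | GLCM texture feature set | 77.2 | 22.8 | 88.2 | 0.839 | 75.9 | 24.1 | 84.6 | 0.791 |
| Autumn | Variogram texture feature set | 79.2 | 20.8 | 86.9 | 0.822 | 72.0 | 28.0 | 80.2 | 0.732 |
| Autumn | Pseudo-cross variogram texture feature set | 79.5 | 20.5 | 87.8 | 0.834 | 78.4 | 21.6 | 87.4 | 0.830 |
| Autumn | Vegetation index feature set | 72.8 | 27.2 | 87.1 | 0.825 | - | - | - | - |
| Spring | Spectral feature set | 61.5 | 38.5 | 83.1 | 0.769 | - | - | - | - |
| Spring | GLCM texture feature set | 74.1 | 25.9 | 85.8 | 0.805 | 66.6 | 33.4 | 81.5 | 0.745 |
| Spring | Variogram texture feature set | 77.4 | 22.6 | 85.1 | 0.795 | 64.8 | 35.2 | 78.2 | 0.700 |
| Spring | Pseudo-cross variogram texture feature set | 77.4 | 22.6 | 88.3 | 0.841 | 62.7 | 37.3 | 81.5 | 0.748 |
| Spring | Vegetation index feature set | 63.2 | 36.8 | 82.5 | 0.761 | - | - | - | - |

"Initial" and "Optimal" refer to the single-season initial and single-season optimal feature sets.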

Fig. 8 Relative importance of each feature in the texture feature sets (9×9 pixel window)

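Table 5 Classification accuracy of the multi-seasonal optimal feature sets

| Category | PA/% | OM/% | OA/% | Kappa |
|---|---|---|---|---|
| Multi-seasonal spectral feature set | 78.4 | 21.6 | 91.9 | 0.890 |
| Multi-seasonal comprehensive feature set | 86.1 | 13.9 | 99.3 | 0.868 |
| Multi-seasonal optimal feature set | 87.5 | 12.5 | 92.4 | 0.897 |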
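The accuracy measures reported in the tables (PA, OM, OA and Kappa) are normally derived from a confusion matrix of validation pixels. The sketch below shows that calculation under two assumptions: that OM denotes the omission error of the tea plantation class (PA and OM sum to 100% in every row), and that Kappa is the standard Cohen's kappa. The function name `accuracy_metrics` and the 2×2 matrix are hypothetical and are not data from the paper.

```python
def accuracy_metrics(confusion, target):
    """Compute PA, OM, OA and Kappa from a square confusion matrix.

    confusion[r][c]: number of validation pixels of reference class r
                     that were classified as class c.
    target:          index of the class of interest (e.g. tea plantation).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    row_totals = [sum(confusion[i]) for i in range(n)]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    # Producer's accuracy: correctly mapped share of the reference pixels of the target class.
    producer = confusion[target][target] / row_totals[target]

    return {
        "PA": 100 * producer,
        "OM": 100 * (1 - producer),
        "OA": 100 * overall,
        "Kappa": kappa,
    }


# Hypothetical two-class example: class 0 = tea plantation, class 1 = everything else.
example = [[80, 20],
           [10, 90]]
metrics = accuracy_metrics(example, target=0)
print(metrics)
```

In this toy matrix 80 of the 100 reference tea pixels are mapped correctly, so PA is 80% and OM is 20%, while OA counts the correct pixels of both classes.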

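Fig. 9 Tea plantation extraction results

Fig. 10 Importance of each feature in the multi-seasonal optimal feature set

Fig. 11 Number of misclassified tea plantation pixels in different feature sets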

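Table 6 Accuracy comparison between the proposed method and other methods

| Method | PA/% | OM/% | OA/% |
|---|---|---|---|
| Random forest | 87.5 | 12.5 | 92.4 |
| U-Net | 86.1 | 13.9 | 90.1 |
| Maximum likelihood | 76.5 | 23.5 | 75.8 |
| Support vector machine | 83.6 | 16.4 | 89.3 |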