Journal of Jilin University (Earth Science Edition), 2025, Vol. 55, Issue 5: 1629-1643. doi: 10.13278/j.cnki.jjuese.20230139

• Geological Engineering and Environmental Engineering •

Construction and Comparison of Models for Predicting Selenium-Rich Soil Based on Machine Learning: A Case Study of the Youshan Area, Xinfeng County, Jiangxi Province

Yang Lan1, Wang Yun1, Zou Yongjun2, Hu Baoqun1, Li Mangen1, Zhang An1, Zhu Manhuai1

  1. Key Laboratory of Digital Land and Resources of Jiangxi Province, East China University of Technology, Nanchang 330013, China

  2. Geological Environment Monitoring Institute of Jiangxi Geological Exploration Institute, Nanchang 330001, China

  • Online: 2025-09-26 Published: 2025-11-15
  • Supported by:
    the Project of China Geological Survey (DD20160321), the Key Research and Development Plan Project of Jiangxi Province (20203BBG72W011), the Doctoral Startup Fund of East China University of Technology (DHBK2019051), the Open Research Fund of the Key Laboratory of Digital Land and Resources of Jiangxi Province, East China University of Technology (DLLJ202205), and the Jiangxi Postgraduate Innovation Special Fund Project (YC2022-S600)

Abstract:

To delineate selenium-rich soil quickly, efficiently, and accurately from data in which the selenium content is unknown, the best model for predicting selenium-rich soil needs to be built. A data set of 502 records was selected from 1 277 surface-soil geochemical records surveyed at a scale of 1∶50 000. With w(Zn), w(K2O), w(P), w(Mo), w(Mn), w(Cr), pH, and D (Devonian) as independent variables and whether the soil is Se-rich as the dependent variable, SPSS Modeler 18 was used to build a binary Logistic regression model, a multilayer perceptron neural network model, a random forest model, and support vector machine models (with linear, polynomial, radial basis function, and Sigmoid kernels). Measured data from 35 soil samples were used for verification. The results show that the prediction accuracy and the overall verification accuracy of the binary Logistic regression model, the multilayer perceptron neural network model, the random forest model, and the support vector machine models (linear, polynomial, radial basis function, Sigmoid) were, respectively, 88.8% and 94.3%, 91.0% and 97.1%, 96.6% and 97.1%, 87.9% and 97.1%, 86.1% and 94.3%, 86.9% and 94.3%, and 80.3% and 91.4%. The corresponding area under the curve (AUC) values were 0.948, 0.950, 0.993, 0.937, 0.945, 0.928, and 0.873; the random forest model had the best accuracy and stability. In addition, this study identified clean selenium-rich soil and green selenium-rich mountain rice, indicating that the method is feasible for predicting selenium-rich soil and can be further extended to fields such as geological prospecting and environmental monitoring.

Key words: selenium-rich soil, machine learning, binary Logistic regression model, multilayer perceptron neural network model, random forest model, support vector machine model
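The abstract compares the seven models by overall accuracy and AUC. As a minimal sketch of how these two metrics can be computed from a binary classifier's output, the Python below uses only the standard library. The function names, the 0.5 threshold, and the labels and scores are hypothetical illustrations, not data or code from the study, which used SPSS Modeler 18.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Share of samples whose thresholded score matches the label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(y_true, y_score))
    return correct / len(y_true)


def auc(y_true, y_score):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = Se-rich, 0 = not Se-rich) and model scores,
# made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(f"accuracy = {accuracy(y_true, y_score):.3f}")
print(f"AUC      = {auc(y_true, y_score):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank the samples, so the two metrics can order models differently. With only 35 verification samples, each misclassified sample shifts accuracy by about 2.9 percentage points; verification accuracies of 94.3%, 97.1%, and 91.4% would correspond to 33, 34, and 32 correct predictions out of 35.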

CLC number:

  • P59