Journal of Jilin University (Science Edition) ›› 2025, Vol. 63 ›› Issue (3): 815-821.


Construction of Multisource Agricultural Disease Image Dataset Based on Hierarchical Annotation and Adaptive Preprocessing

HU Ting1, SUN Xiaohai2, SONG Hailong2, LIAO Changyi2, WANG Fude2,3

  1. School of Journalism and Communication, Changchun University of Technology, Changchun 130012, China; 2. Jilin Haicheng Technology Co., Ltd., Changchun 130119, China; 3. College of Information Technology, Jilin Agricultural University, Changchun 130118, China
  • Received: 2024-01-12 Online: 2025-05-26 Published: 2025-05-26
  • Corresponding author: WANG Fude E-mail: 562324919@qq.com

Abstract: Aiming at the problems of limited diversity and poor image quality in agricultural disease image datasets, we propose a method for constructing a multisource agricultural disease image dataset based on hierarchical annotation and adaptive preprocessing. First, images are collected from different regions, crop types, and growth stages using devices such as smartphones, professional cameras, and drones to ensure data diversity. Second, a hierarchical annotation system is constructed that covers three levels (disease type, severity, and location); images are annotated with tools such as LabelImg and LabelMe and then reviewed by experts. Finally, adaptive preprocessing, including automatic cropping, normalization, denoising, and enhancement, is applied, with parameters adjusted according to image features to improve quality. A convolutional neural network (CNN) model based on the ResNet-50 architecture is used for validation. The results show that hierarchical annotation and adaptive preprocessing significantly improve dataset quality and model performance: the model achieves an accuracy, recall, and F1 score of 92.5%, 91.8%, and 92.1%, respectively, outperforming models trained on other datasets.

Key words: agricultural disease image, dataset construction, hierarchical annotation, adaptive preprocessing, multisource data

CLC number: 

  • TP391
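
The three-level annotation scheme named in the abstract (disease type, severity, and location) could be represented roughly as follows. This is only an illustrative sketch: the class names, the three-grade severity scale, and the example values are assumptions made here, not the authors' actual label schema or the export format of LabelImg or LabelMe.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Hypothetical three-grade scale; the paper's actual grades are not stated in the abstract.
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass(frozen=True)
class DiseaseLabel:
    """One annotation carrying the three levels named in the abstract."""
    disease_type: str   # level 1: disease type (example value: "rice_blast")
    severity: Severity  # level 2: severity
    location: str       # level 3: affected plant part (example value: "leaf")

    def path(self) -> str:
        """Flatten the hierarchy into a single slash-separated class string."""
        return f"{self.disease_type}/{self.severity.name.lower()}/{self.location}"


# Example record with made-up values, for illustration only.
label = DiseaseLabel("rice_blast", Severity.MODERATE, "leaf")
print(label.path())
```

Keeping the three levels as separate fields means a classifier can be trained on any one level, or on the flattened path when a single combined class is needed.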