Journal of Jilin University Science Edition ›› 2025, Vol. 63 ›› Issue (3): 815-821.

Construction of Multisource Agricultural Disease Image Dataset Based on Hierarchical Annotation and Adaptive Preprocessing

HU Ting1, SUN Xiaohai2, SONG Hailong2, LIAO Changyi2, WANG Fude2,3   

  1. School of Journalism and Communication, Changchun University of Technology, Changchun 130012, China; 2. Jilin Haicheng Technology Co., Ltd., Changchun 130119, China; 3. College of Information Technology, Jilin Agricultural University, Changchun 130118, China
  • Received: 2024-01-12 Online: 2025-05-26 Published: 2025-05-26

Abstract: To address the limited diversity and poor image quality of agricultural disease image datasets, we propose a method for constructing a multisource agricultural disease image dataset based on hierarchical annotation and adaptive preprocessing. First, images were collected across different regions, crop types, and growth stages using devices such as smartphones, professional cameras, and drones to ensure data diversity. Second, we built a hierarchical annotation system covering three levels (disease type, severity, and location), annotated the images with tools such as LabelImg and LabelMe, and had the annotations reviewed by experts. Finally, we applied adaptive preprocessing, including automatic cropping, normalization, denoising, and enhancement, with parameters adjusted according to image features to improve quality. For validation, we trained a convolutional neural network (CNN) model based on the ResNet-50 architecture. The results show that hierarchical annotation and adaptive preprocessing significantly improve dataset quality and model performance: the model achieves an accuracy, recall, and F1 score of 92.5%, 91.8%, and 92.1%, respectively, which are better than the results obtained by training on other datasets.
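
As an illustration of the three-level annotation scheme described in the abstract (disease type, severity, location), the following is a minimal Python sketch of what one annotation record and a basic consistency check might look like. The abstract does not specify the actual schema, so every name here (`Severity`, `BoundingBox`, `DiseaseAnnotation`, `validate`), the three-grade severity scale, and the use of a bounding box for location are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Hypothetical severity grades; the abstract does not give the real scale.
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass(frozen=True)
class BoundingBox:
    # Assumed location format: pixel corners of the diseased region.
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def is_valid(self, width: int, height: int) -> bool:
        # The box must have positive area and lie inside the image.
        return (0 <= self.x_min < self.x_max <= width
                and 0 <= self.y_min < self.y_max <= height)


@dataclass(frozen=True)
class DiseaseAnnotation:
    image_path: str
    disease_type: str       # level 1: which disease
    severity: Severity      # level 2: how severe
    location: BoundingBox   # level 3: where in the image
    reviewed_by_expert: bool = False


def validate(ann, image_width, image_height, known_diseases):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if ann.disease_type not in known_diseases:
        problems.append(f"unknown disease type: {ann.disease_type}")
    if not ann.location.is_valid(image_width, image_height):
        problems.append("bounding box is empty or outside the image")
    if not ann.reviewed_by_expert:
        problems.append("pending expert review")
    return problems
```

A record such as `DiseaseAnnotation("img_0001.jpg", "leaf_blight", Severity.MODERATE, BoundingBox(10, 20, 110, 220), reviewed_by_expert=True)` would pass `validate` for a 640×480 image when `"leaf_blight"` is in the set of known diseases. Keeping the three levels as separate typed fields lets each one be checked independently before the data reaches model training.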

Key words: agricultural disease image, dataset construction, hierarchical annotation, adaptive preprocessing, multisource data

CLC Number: TP391