Fusion multi-scale Transformer skin lesion segmentation algorithm

doi:10.13229/j.cnki.jdxbgxb.20220692

Abstract

Existing skin lesion segmentation methods extract multi-scale features poorly, which causes loss of detail and incorrect segmentation of lesion regions. To address this, the paper proposes a skin lesion segmentation algorithm based on an encoder-decoder network that fuses multi-scale Transformer features. First, a hierarchical encoder is built from Transformer blocks; it analyses the lesion region in terms of global feature variation at multiple scales. Then, a fusion decoder is built from a multi-scale fusion module, a channel attention module and a concat layer. The multi-scale fusion module fuses shallow and deep network information from the hierarchical encoder to strengthen the dependency between spatial and semantic information, and the channel attention module identifies channels that carry rich feature information, which improves segmentation accuracy. Finally, an expansion module restores the image size to meet practical requirements. The algorithm was evaluated on three public datasets: ISBI2016, ISBI2017 and ISIC2018. Pixel accuracy was 96.70%, 94.50% and 95.39%, respectively, and mean intersection over union was 91.69%, 85.74% and 89.29%, respectively; overall, the proposed algorithm outperformed the existing algorithms it was compared with. The experiments show that the multi-scale Transformer encoder-decoder network segments skin lesion images effectively and offers a new approach to the diagnosis of skin diseases.
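The decoder idea in the abstract (fuse shallow and deep encoder features, then weight channels by how informative they are) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the abstract does not say how the multi-scale fusion module or the channel attention module are implemented, so the sketch uses nearest-neighbour upsampling plus channel concatenation for fusion, and a parameter-free sigmoid gate on each channel's global average in place of a learned attention module. The function names `upsample_nearest`, `fuse_scales` and `channel_gate` are hypothetical.

```python
import math


def upsample_nearest(channel, factor):
    """Repeat every pixel `factor` times along both axes (nearest-neighbour)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_scales(stages):
    """Bring every stage to the resolution of the first (shallowest) stage
    and concatenate all channels. Each stage is a list of 2D channels.
    Assumes each stage's height divides the first stage's height."""
    target_h = len(stages[0][0])
    fused = []
    for stage in stages:
        factor = target_h // len(stage[0])
        for channel in stage:
            fused.append(upsample_nearest(channel, factor))
    return fused


def channel_gate(channels):
    """Toy stand-in for channel attention (assumption, not the paper's module):
    weight each channel by the sigmoid of its global average.
    A real module would learn this mapping."""
    gated, weights = [], []
    for ch in channels:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        w = 1.0 / (1.0 + math.exp(-mean))
        weights.append(w)
        gated.append([[w * v for v in row] for row in ch])
    return gated, weights


# Hypothetical inputs: one shallow 4x4 channel and one deep 2x2 channel.
shallow = [[[1.0] * 4 for _ in range(4)]]
deep = [[[0.0, 2.0], [2.0, 0.0]]]
fused = fuse_scales([shallow, deep])
gated, weights = channel_gate(fused)
```

In this sketch the deep 2x2 map is enlarged to 4x4 before concatenation, so spatial detail from the shallow stage and coarser information from the deep stage end up in one tensor that the gate can then reweight channel by channel.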

Key words: computer application technology, skin lesions, image segmentation, transformer, multi-scale fusion module, channel attention module

CLC Number: TP183

Li-ming LIANG, Long-song ZHOU, Jiang YIN, Xiao-qi SHENG. Fusion multi-scale Transformer skin lesion segmentation algorithm[J]. Journal of Jilin University (Engineering and Technology Edition), 2024, 54(4): 1086-1098.

Model	ISBI 2016 Acc/%	ISBI 2016 mIoU/%	ISBI 2016 Kappa/%	ISBI 2017 Acc/%	ISBI 2017 mIoU/%	ISBI 2017 Kappa/%	Para/M	Time/s
U-Net	93.32	89.11	87.34	91.57	78.34	74.87	13.40	376
Attention U-Net	95.33	89.19	88.46	92.45	80.32	77.49	34.89	568
U²-Net	95.50	89.64	89.96	93.20	82.23	80.51	44.05	424
MAU-Net	95.83	90.21	89.82	93.58	83.61	81.70	42.02	492
SETR	96.41	91.22	90.72	94.39	85.53	84.07	306.96	952
SegFormer	95.73	89.66	88.96	93.68	83.71	81.82	84.59	728
MsF-SegFormer	96.70	91.69	91.29	94.50	85.74	84.30	83.16	752

Model	PH2 Acc/%	PH2 Dice/%	PH2 mIoU/%	PH2 Kappa/%	ISIC2018 Acc/%	ISIC2018 Dice/%	ISIC2018 mIoU/%	ISIC2018 Kappa/%	Para/M	Time/s
U-Net	93.00	92.17	87.30	86.33	93.15	91.62	86.44	85.25	13.40	352
Attention U-Net	93.15	92.30	87.54	86.61	93.28	91.75	86.65	85.49	34.89	536
U²-Net	93.72	92.90	86.83	85.80	94.53	93.11	87.26	86.21	44.05	384
MAU-Net	94.18	93.40	87.70	86.80	94.62	93.21	87.45	86.43	42.02	456
SETR	94.59	94.43	88.64	87.62	95.13	93.85	88.54	87.70	306.96	809
SegFormer	94.21	93.29	88.25	87.28	95.08	93.84	88.51	87.67	84.59	680
MsF-SegFormer	94.72	94.09	88.91	88.20	95.39	94.28	89.29	88.56	83.16	637
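The tables report Acc, Dice, mIoU and Kappa without defining them on this page. The sketch below shows how these four metrics are commonly computed for a binary lesion/background mask from confusion counts. It is an illustration under stated assumptions, not the authors' evaluation code: in particular, mIoU is assumed to be the average of lesion IoU and background IoU, and the helper names `confusion_counts` and `segmentation_metrics` are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat binary masks (1 = lesion, 0 = background)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    return tp, fp, fn, tn


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred, truth):
    """Pixel accuracy, Dice, two-class mean IoU and Cohen's kappa as fractions."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    n = tp + fp + fn + tn
    acc = _ratio(tp + tn, n)
    dice = _ratio(2 * tp, 2 * tp + fp + fn)
    iou_lesion = _ratio(tp, tp + fp + fn)
    iou_background = _ratio(tn, tn + fp + fn)
    miou = (iou_lesion + iou_background) / 2  # assumed two-class average
    # Agreement expected by chance, from the marginal totals of both masks.
    p_e = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _ratio(acc - p_e, 1 - p_e)
    return {"acc": acc, "dice": dice, "miou": miou, "kappa": kappa}


# Hypothetical 8-pixel example masks.
metrics = segmentation_metrics([1, 1, 0, 0, 1, 0, 0, 0],
                               [1, 1, 1, 0, 0, 0, 0, 0])
```

Multiplying each value by 100 gives percentages in the same form as the table columns. Kappa corrects accuracy for chance agreement, which is why it sits below Acc in every row.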

Model	Encoder	Decoder MFM	Decoder CL	Decoder CAM	mIoU/%	Dice/%	Acc/%
MsF-1	√	-	√	-	89.95	94.96	95.60
MsF-2	√	√	-	-	90.69	95.08	95.89
MsF-3	√	√	√	-	91.16	95.33	96.36
MsF-4	√	√	-	√	91.53	95.53	96.55
MsF-5	√	√	√	√	91.69	95.61	96.70

Model	Acc/%	Dice/%	mIoU/%	Kappa/%	Para/M	Time/s
MsF-Layer-3	96.01	94.78	90.11	89.95	43.72	364
MsF-Layer-4	96.70	95.61	91.69	91.29	83.16	391
MsF-Layer-5	96.72	95.52	91.21	91.36	221.51	432
