Journal of Jilin University (Engineering and Technology Edition), 2024, Vol. 54, Issue (4): 1086-1098. doi: 10.13229/j.cnki.jdxbgxb.20220692

Fusion multi-scale Transformer skin lesion segmentation algorithm

Li-ming LIANG1, Long-song ZHOU1, Jiang YIN1, Xiao-qi SHENG2

  1. School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China
  2. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  • Received: 2022-06-02  Online: 2024-04-01  Published: 2024-05-17

Abstract:

Existing skin lesion segmentation methods often lack multi-scale feature extraction, which leads to missing detail and incorrectly segmented lesion regions. To address this problem, this paper proposes a skin lesion segmentation algorithm based on a fusion multi-scale Transformer encoder-decoder network. First, a hierarchical encoder is built from Transformer blocks; it analyzes the lesion region in terms of global feature variation at multiple scales. Then, a fusion decoder is constructed from a multi-scale fusion module, a channel attention module and a concat layer. The multi-scale fusion module fuses shallow and deep network information from the hierarchical encoder to strengthen the dependency between spatial and semantic information, and the channel attention module identifies channels that carry rich feature information, which improves segmentation accuracy. Finally, an expansion module restores the image size to meet practical requirements. The proposed algorithm was tested on three public datasets: ISBI2016, ISBI2017 and ISIC2018. The pixel accuracies were 96.70%, 94.50% and 95.39%, respectively, and the mean intersection over union values were 91.69%, 85.74% and 89.29%, respectively; overall, the proposed algorithm outperformed the existing algorithms it was compared with. Simulation experiments show that the multi-scale Transformer encoder-decoder network can effectively segment skin lesion images, offering a new avenue for the diagnosis of skin diseases.
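
The two metrics quoted in the abstract, pixel accuracy and mean intersection over union (mIoU), can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions and a binary labelling (0 = background, 1 = lesion); the function names and the toy masks are hypothetical and are not taken from the paper, whose evaluation code may differ in detail.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask, num_classes: int = 2) -> List[List[int]]:
    """Return counts[t][p]: number of pixels with true class t and predicted class p."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            counts[t][p] += 1
    return counts


def pixel_accuracy(counts: List[List[int]]) -> float:
    """Fraction of pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[k][k] for k in range(len(counts)))
    return correct / total


def mean_iou(counts: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(counts)
    ious = []
    for k in range(n):
        tp = counts[k][k]
        fp = sum(counts[t][k] for t in range(n)) - tp  # predicted k, truly another class
        fn = sum(counts[k]) - tp                       # truly k, predicted another class
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy 2x4 masks (hypothetical data): one lesion pixel is missed by the prediction.
truth = [[0, 0, 1, 1],
         [0, 1, 1, 1]]
pred = [[0, 0, 1, 1],
        [0, 0, 1, 1]]

counts = confusion_counts(pred, truth)
print(f"{pixel_accuracy(counts):.3f} {mean_iou(counts):.3f}")  # prints "0.875 0.775"
```

In this toy case 7 of 8 pixels are correct, and the per-class IoU values are 3/4 for background and 4/5 for lesion, so a single missed pixel lowers mIoU more than it lowers pixel accuracy.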

Key words: computer application technology, skin lesions, image segmentation, transformer, multi-scale fusion module, channel attention module

CLC Number: 

  • TP183

Fig.1 Fusion multi-scale Transformer encoder-decoder network

Fig.2 Self-attention layer

Fig.3 Transformer block structure

Fig.4 Multi-scale fusion module

Fig.5 Channel attention module

Fig.6 Expansion module

Fig.7 Image preprocessing operation

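Table 1 Comparison results of different networks on ISBI2016 and ISBI2017 datasets

| Model | ISBI2016 Acc/% | ISBI2016 mIoU/% | ISBI2016 Kappa/% | ISBI2017 Acc/% | ISBI2017 mIoU/% | ISBI2017 Kappa/% | Para/M | Time/s |
|---|---|---|---|---|---|---|---|---|
| U-Net | 93.32 | 89.11 | 87.34 | 91.57 | 78.34 | 74.87 | 13.40 | 376 |
| Attention U-Net | 95.33 | 89.19 | 88.46 | 92.45 | 80.32 | 77.49 | 34.89 | 568 |
| U2-Net | 95.50 | 89.64 | 89.96 | 93.20 | 82.23 | 80.51 | 44.05 | 424 |
| MAU-Net | 95.83 | 90.21 | 89.82 | 93.58 | 83.61 | 81.70 | 42.02 | 492 |
| SETR | 96.41 | 91.22 | 90.72 | 94.39 | 85.53 | 84.07 | 306.96 | 952 |
| SegFormer | 95.73 | 89.66 | 88.96 | 93.68 | 83.71 | 81.82 | 84.59 | 728 |
| MsF-SegFormer | 96.70 | 91.69 | 91.29 | 94.50 | 85.74 | 84.30 | 83.16 | 752 |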

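Table 2 Comparison results of different networks on PH2 and ISIC2018 datasets

| Model | PH2 Acc/% | PH2 Dice/% | PH2 mIoU/% | PH2 Kappa/% | ISIC2018 Acc/% | ISIC2018 Dice/% | ISIC2018 mIoU/% | ISIC2018 Kappa/% | Para/M | Time/s |
|---|---|---|---|---|---|---|---|---|---|---|
| U-Net | 93.00 | 92.17 | 87.30 | 86.33 | 93.15 | 91.62 | 86.44 | 85.25 | 13.40 | 352 |
| Attention U-Net | 93.15 | 92.30 | 87.54 | 86.61 | 93.28 | 91.75 | 86.65 | 85.49 | 34.89 | 536 |
| U2-Net | 93.72 | 92.90 | 86.83 | 85.80 | 94.53 | 93.11 | 87.26 | 86.21 | 44.05 | 384 |
| MAU-Net | 94.18 | 93.40 | 87.70 | 86.80 | 94.62 | 93.21 | 87.45 | 86.43 | 42.02 | 456 |
| SETR | 94.59 | 94.43 | 88.64 | 87.62 | 95.13 | 93.85 | 88.54 | 87.70 | 306.96 | 809 |
| SegFormer | 94.21 | 93.29 | 88.25 | 87.28 | 95.08 | 93.84 | 88.51 | 87.67 | 84.59 | 680 |
| MsF-SegFormer | 94.72 | 94.09 | 88.91 | 88.20 | 95.39 | 94.28 | 89.29 | 88.56 | 83.16 | 637 |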
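
The Dice and Kappa columns can be illustrated with a short sketch for binary masks. It assumes that Dice is computed for the lesion class (label 1) and that Kappa denotes Cohen's kappa between prediction and ground truth; neither assumption is confirmed by this page, and the function names and toy masks are hypothetical.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]


def binary_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary masks where 1 marks the lesion."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice_coefficient(pred: Mask, truth: Mask) -> float:
    """Dice = 2*TP / (2*TP + FP + FN) for the lesion class."""
    tp, fp, fn, _ = binary_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # both masks empty: treat as perfect overlap


def cohen_kappa(pred: Mask, truth: Mask) -> float:
    """Kappa = (p_o - p_e) / (1 - p_e): observed agreement corrected for chance."""
    tp, fp, fn, tn = binary_counts(pred, truth)
    total = tp + fp + fn + tn
    p_o = (tp + tn) / total
    # Chance agreement from the marginal frequencies of each class.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (total * total)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0


# Toy 2x4 masks (hypothetical data): one lesion pixel is missed by the prediction.
truth = [[0, 0, 1, 1],
         [0, 1, 1, 1]]
pred = [[0, 0, 1, 1],
        [0, 0, 1, 1]]

print(f"{dice_coefficient(pred, truth):.3f} {cohen_kappa(pred, truth):.3f}")  # prints "0.889 0.750"
```

Here the observed agreement is 7/8 and the chance agreement is 0.5, so Kappa is 0.75, lower than the raw accuracy because it discounts agreement expected by chance.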

Fig.8 Pixel accuracy variation curve during evaluation

Fig.9 Mean intersection over union variation curve during evaluation

Fig.10 Different network segmentation results on ISBI2016 and ISBI2017 datasets

Fig.11 Different network segmentation results on ISIC2018 and PH2 datasets

Fig.12 Segmentation results of different networks

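Table 3 Ablation experiment

Configuration columns in the original table: Encoder; Decoder (MFM, CL, CAM). The per-model marks in these columns are not preserved here.

| Model | mIoU/% | Dice/% | Acc/% |
|---|---|---|---|
| MsF-1 | 89.95 | 94.96 | 95.60 |
| MsF-2 | 90.69 | 95.08 | 95.89 |
| MsF-3 | 91.16 | 95.33 | 96.36 |
| MsF-4 | 91.53 | 95.53 | 96.55 |
| MsF-5 | 91.69 | 95.61 | 96.70 |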

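Table 4 Layered experiment

| Model | Acc/% | Dice/% | mIoU/% | Kappa/% | Para/M | Time/s |
|---|---|---|---|---|---|---|
| MsF-Layer-3 | 96.01 | 94.78 | 90.11 | 89.95 | 43.72 | 364 |
| MsF-Layer-4 | 96.70 | 95.61 | 91.69 | 91.29 | 83.16 | 391 |
| MsF-Layer-5 | 96.72 | 95.52 | 91.21 | 91.36 | 221.51 | 432 |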