Journal of Jilin University (Engineering and Technology Edition) ›› 2023, Vol. 53 ›› Issue (4): 1146-1154. doi: 10.13229/j.cnki.jdxbgxb.20210829
• Computer Science and Technology •
Yu JIANG (姜宇)1,2, Jia-zheng PAN (潘家铮)1, He-huai CHEN (陈何淮)1, Ling-zhi FU (符凌智)1, Hong QI (齐红)1,2
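Abstract:
Current text detection research mainly targets natural-scene datasets, and existing methods struggle to detect text embedded in Traditional Chinese images. To address this problem, this paper proposes a segmentation-based text detection model for Traditional Chinese newspaper images. The model uses ResNet50 and FPN as its feature extraction network, generates the binary map used to predict text boxes by shrinking the segmentation instances and then applying an expansion algorithm, and introduces surrounding padding together with cyclic detection and region coverage to improve detection. On a self-built Traditional Chinese newspaper dataset, all three evaluation metrics are around 0.9, an improvement of 5% to 7% over DBNet, one of the better-performing current text detectors. These results indicate that the model has an advantage on the task of text detection in Traditional Chinese newspaper images.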
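The abstract does not state how far each segmentation instance is shrunk. As a rough illustration only, the sketch below assumes the offset rule commonly used by DBNet-style detectors, D = A(1 - r^2) / L, where A is the polygon area, L its perimeter, and r a shrink ratio. The function names and the default ratio of 0.4 are assumptions made for this example and are not taken from the paper.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def shrink_offset(points, ratio=0.4):
    """Inward offset D = A * (1 - r^2) / L (assumed rule, see lead-in)."""
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    return area * (1.0 - ratio ** 2) / perimeter


# Hypothetical 100 x 20 text box given as corner points.
box = [(0, 0), (100, 0), (100, 20), (0, 20)]
offset = shrink_offset(box)
```

Under this assumption, the offset would be passed to a polygon-offsetting routine to obtain the shrunk label region, and the same distance could be used in reverse to expand a predicted region back to a full text box.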