Journal of Jilin University (Engineering and Technology Edition), 2023, Vol. 53, Issue (4): 1146-1154. doi: 10.13229/j.cnki.jdxbgxb.20210829

Segmentation-based detector for traditional Chinese newspaper

Yu JIANG1,2, Jia-zheng PAN1, He-huai CHEN1, Ling-zhi FU1, Hong QI1,2

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Received: 2021-08-27  Online: 2023-04-01  Published: 2023-04-20
  • Contact: Hong QI  E-mail: jiangyu2011@jlu.edu.cn; qihong@jlu.edu.cn

Abstract:

Most research on text detection has been conducted on natural scene datasets, and little has addressed specific scenes such as traditional Chinese newspapers, where existing models do not perform well enough. To address this problem, this paper proposes a segmentation-based text detector for traditional Chinese newspapers. The model uses ResNet50 and FPN as the feature extraction network and employs a segmentation instance scaling and extension algorithm to generate the binary map from which text boxes are predicted. Two further methods, surrounding filling and loop detection plus region coverage, are proposed to improve detection. In addition, a traditional Chinese newspaper dataset is built to support the research. On this dataset the model's recall, precision, and F-score are all around 0.9, an improvement of 5% to 7% over DBNet, which indicates that the model is effective and accurate for text detection on traditional Chinese newspapers.

Key words: computer application, specific scene, traditional Chinese newspaper, text embedded in images, text detection

CLC Number: TP391
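
The abstract mentions a binary map from which text boxes are predicted but gives no implementation details. As a generic illustration only, and not the paper's segmentation instance scaling and extension algorithm, the following sketch thresholds a probability map and returns one axis-aligned box per connected region. The function name `boxes_from_probability_map`, the fixed threshold of 0.5, the 4-connectivity, and the toy input are all assumptions made for this example.

```python
from collections import deque


def boxes_from_probability_map(prob, threshold=0.5):
    """Threshold a 2D probability map (list of rows) and return one
    axis-aligned box (x_min, y_min, x_max, y_max) per 4-connected region.

    Hypothetical helper: the threshold value and the connectivity rule
    are assumptions, not taken from the paper.
    """
    height = len(prob)
    width = len(prob[0]) if height else 0
    # Binarization: a pixel is foreground when its score reaches the threshold.
    binary = [[prob[y][x] >= threshold for x in range(width)] for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected foreground region.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Toy probability map with two separate high-score regions (made-up values).
toy_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.2, 0.0],
    [0.0, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.9],
]
print(boxes_from_probability_map(toy_map))
```

A real detector would typically work on the network's output tensor and produce polygons or rotated boxes; the pure-Python version here only keeps the idea visible without extra dependencies.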

Fig.1 Front architecture of model

Fig.2 Image of two binarization formulas

Fig.3 Average value of indicators under different thresholds

Fig.4 Example of test results

Fig.5 Segmentation examples with different sizes

Fig.6 Process of generating text boxes

Fig.7 Image after covering part of text area

Fig.8 Process of loop detection and region coverage

Table 1 Performance with/without surrounding filling method

Model                          Recall   Precision   F-score
Without surrounding filling    0.864    0.855       0.859
With surrounding filling       0.899    0.912       0.906

Table 2 Performance with/without segmentation instance scaling and extension algorithm

Model                  Recall   Precision   F-score
Without this method    0.870    0.902       0.886
With this method       0.899    0.912       0.906

Table 3 Evaluation results on traditional newspaper text detection dataset

Model                Recall   Precision   F-score
MSER+SWT             0.558    0.719       0.628
CRAFT                0.581    0.343       0.431
FCENet               0.654    0.656       0.655
DBNet+MobileNetv3    0.900    0.730       0.805
DBNet+Resnet18       0.873    0.737       0.799
DBNet+Resnet50       0.840    0.836       0.838
Ours (this paper)    0.899    0.912       0.906
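
The page does not define the F-score column. Assuming it is the standard harmonic mean of recall and precision, the short check below recomputes it from the recall and precision values reported in Table 3. The function name `f_score` and the dictionary layout are illustrative choices, not part of the paper.

```python
def f_score(recall, precision):
    """Harmonic mean of recall and precision (standard F1 definition,
    assumed here because the source does not state the formula)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, reported F-score) copied from Table 3.
table3 = {
    "MSER+SWT": (0.558, 0.719, 0.628),
    "CRAFT": (0.581, 0.343, 0.431),
    "FCENet": (0.654, 0.656, 0.655),
    "DBNet+MobileNetv3": (0.900, 0.730, 0.805),
    "DBNet+Resnet18": (0.873, 0.737, 0.799),
    "DBNet+Resnet50": (0.840, 0.836, 0.838),
    "Ours (this paper)": (0.899, 0.912, 0.906),
}

for model, (recall, precision, reported) in table3.items():
    computed = f_score(recall, precision)
    print(f"{model}: computed {computed:.3f}, reported {reported:.3f}")
```

Under this assumption the recomputed values stay within about 0.001 of the reported ones, which is consistent with the inputs having been rounded to three decimals.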
