Journal of Jilin University (Engineering and Technology Edition) ›› 2013, Vol. 43 ›› Issue (03): 740-746. doi: 10.7964/jdxbgxb201303030

• Paper •

Handwritten Uyghur character recognition based on radical dictionary and time division direction feature

XU Ya-mei, LU Zhao-yang, LI Jing

  1. State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China
  • Received: 2012-05-29 Online: 2013-05-01 Published: 2013-05-01
  • Corresponding author: LU Zhao-yang (b. 1963), male, professor, doctoral supervisor. Research interests: intelligent signal processing, pattern recognition. E-mail: zhylu@xidian.edu.cn
  • About the author: XU Ya-mei (b. 1978), female, Ph.D. candidate. Research interests: image processing and pattern recognition, handwritten character recognition. E-mail: ymxu@mail.xidian.edu.cn
  • Funding:

    National Natural Science Foundation of China (60872141); Fundamental Research Funds for the Central Universities (K50510010007); Huawei Science and Technology Fund (HITC2011023).

Abstract: A recognition algorithm based on a radical dictionary and a time-division directional feature is proposed for the 128 variant forms of online handwritten Uyghur characters. First, guided by an analysis of connected strokes, each character is decomposed into three types of radicals (main, affix, and dot) and a radical dictionary for handwritten Uyghur characters is built, which effectively handles the free stroke order and stroke connection of online handwriting. Second, to reduce the interference caused by topological deformation of handwritten characters, a new online feature, the time-division directional feature, is extracted from each radical: direction chain codes (Freeman codes) are extracted in the fuzzy domain and accumulated over successive time divisions, and short-term variations are detected and adjusted to correct jitter. Finally, different classifiers are designed to match the different radical types, the weight of each radical is estimated from the distribution of its matching measures, and the classifier outputs are fused with a weighted naive Bayes algorithm to give the character recognition result. Experimental results show that the algorithm effectively recognizes the 128 classes of unconstrained handwritten Uyghur characters, with an average recognition rate of 93.15% on a handwritten Uyghur character database of 13 056 samples.

Key words: computer application, handwriting recognition, Uyghur language, radical dictionary, directional feature, naive Bayes
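
The two computational steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `time_division_direction_feature` and `weighted_naive_bayes_fuse`, the parameters `n_segments` and `n_dirs`, and the linear fuzzy membership between adjacent directions are all assumptions made for illustration. Jitter correction and the estimation of radical weights from matching-measure distributions are described in the paper but omitted here; the weights are simply passed in.

```python
import math


def time_division_direction_feature(points, n_segments=4, n_dirs=8):
    """Directional histogram per time segment for one pen trajectory.

    points: list of (x, y) samples in writing order.
    Each step direction is shared between its two nearest chain-code
    directions (an assumed linear fuzzy membership), then accumulated
    in the time segment the step falls into.
    """
    steps = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(points, points[1:])]
    steps = [s for s in steps if s != (0, 0)]  # drop zero-length steps
    feature = [[0.0] * n_dirs for _ in range(n_segments)]
    if not steps:
        return [0.0] * (n_segments * n_dirs)
    bin_width = 2 * math.pi / n_dirs
    for i, (dx, dy) in enumerate(steps):
        seg = min(i * n_segments // len(steps), n_segments - 1)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        pos = angle / bin_width
        lo = int(pos) % n_dirs
        hi = (lo + 1) % n_dirs
        w_hi = pos - int(pos)  # membership of the next direction
        feature[seg][lo] += 1.0 - w_hi
        feature[seg][hi] += w_hi
    flat = [v for row in feature for v in row]
    total = sum(flat)
    return [v / total for v in flat]  # normalise to unit sum


def weighted_naive_bayes_fuse(part_scores, weights, priors=None, eps=1e-12):
    """Pick the class maximising log P(c) + sum_k w_k * log p_k(c).

    part_scores: one dict {class_label: likelihood} per radical classifier.
    weights: one non-negative weight per radical (assumed given).
    """
    best_label, best_score = None, -math.inf
    for c in part_scores[0]:
        score = math.log(priors[c]) if priors else 0.0
        for scores, w in zip(part_scores, weights):
            score += w * math.log(max(scores.get(c, 0.0), eps))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


if __name__ == "__main__":
    stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    print(time_division_direction_feature(stroke, n_segments=2))
    main_part = {"A": 0.6, "B": 0.4}   # hypothetical classifier outputs
    dot_part = {"A": 0.2, "B": 0.8}
    print(weighted_naive_bayes_fuse([main_part, dot_part], [1.0, 0.5]))
```

Splitting the histogram by time rather than by spatial grid cell keeps the feature tied to the writing order within a radical, which is the property the abstract credits with tolerance to topological deformation. The per-radical weights let a more reliable radical classifier dominate the fused decision.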

CLC number:

  • TP391.4