Discrimination of Handwritten and Printed Texts Based on Frame Features and Viterbi Decoder

Abstract

To effectively distinguish handwritten from printed text, a discrimination method is proposed based on hidden-layer frame features of a convolutional neural network (CNN). Frame features are extracted from a hidden layer of the CNN and modeled with a Gaussian mixture model combined with a hidden Markov model, and the Viterbi decoding algorithm is then used to determine the category of each frame feature. The frame-level recognition results are post-processed in combination with information from the text-line image to determine the final handwritten and printed regions. On text-line images from signature documents, the discrimination accuracy of the proposed method for handwritten and printed text increases by 10.8% and 27.57%, respectively, relative to the baseline. The effectiveness of the proposed method is also verified on text lines from natural scenes, tables, and noisy documents.
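The frame-labeling step can be illustrated with a minimal Viterbi decoder. This is only a sketch under stated assumptions, not the authors' implementation: it assumes exactly two HMM states (one per class), and the frame likelihoods, start probabilities, and transition probabilities are made-up numbers. In the method described above, the per-frame likelihoods would instead come from Gaussian mixture models fitted to the CNN hidden-layer frame features.

```python
import math

STATES = ("handwritten", "printed")  # assumed two-state label set


def viterbi(frame_loglik, log_start, log_trans):
    """Return the most likely state index for every frame.

    frame_loglik[t][s]: log-likelihood of frame t under state s
    log_start[s]:       log probability of starting in state s
    log_trans[i][j]:    log probability of moving from state i to state j
    """
    n_states = len(log_start)
    # Best log score of any path that ends in each state at the current frame.
    score = [log_start[s] + frame_loglik[0][s] for s in range(n_states)]
    backptr = []
    for obs in frame_loglik[1:]:
        prev, new_score = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log_trans[i][j])
            prev.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + obs[j])
        backptr.append(prev)
        score = new_score
    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for prev in reversed(backptr):
        state = prev[state]
        path.append(state)
    path.reverse()
    return path


log = math.log
# Hypothetical per-frame likelihoods: columns are (handwritten, printed).
# The third frame favors "printed" when judged on its own.
frame_loglik = [
    [log(0.8), log(0.2)],
    [log(0.8), log(0.2)],
    [log(0.4), log(0.6)],
    [log(0.8), log(0.2)],
    [log(0.2), log(0.8)],
    [log(0.2), log(0.8)],
]
log_start = [log(0.5), log(0.5)]
# "Sticky" transitions: a frame usually keeps the class of its neighbor.
log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]

# Labels from frame-by-frame argmax, ignoring context.
framewise = [STATES[max(range(2), key=lambda s: f[s])] for f in frame_loglik]
# Labels from Viterbi decoding over the whole sequence.
labels = [STATES[s] for s in viterbi(frame_loglik, log_start, log_trans)]
print(labels)
# -> ['handwritten', 'handwritten', 'handwritten', 'handwritten', 'printed', 'printed']
```

With these toy numbers, labeling each frame independently would mark the third frame as printed, whereas the decoded path keeps it inside the surrounding handwritten run. This kind of sequence-level smoothing is the reason to decode the frames jointly rather than classify them one at a time.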

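CLC number: TP391.4

DOI: 10.3788/lop56.061003

Section: Image Processing

Funding: Young Scientists Fund of the National Natural Science Foundation of China (61602006); Key Natural Science Research Project of Anhui Provincial Universities (KJ2017A934); Key Natural Science Research Project of Anhui Provincial Universities (KJ2013A217)

Received: 2018-08-21

Revised: 2018-09-28

Published online: 2018-10-10

Author affiliations

Lin Qin: School of Computer Science, Hefei Normal University, Hefei, Anhui 230601
Xia Junfeng: School of Computer Science, Anhui University, Hefei, Anhui 230039
Tu Zhengzheng: School of Computer Science, Anhui University, Hefei, Anhui 230039
Guo Yutang: School of Computer Science, Hefei Normal University, Hefei, Anhui 230601

Corresponding author: Lin Qin (linqin@hfnu.edu.cn)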

Cite this paper

Lin Qin, Xia Junfeng, Tu Zhengzheng, Guo Yutang. Discrimination of Handwritten and Printed Texts Based on Frame Features and Viterbi Decoder[J]. Laser & Optoelectronics Progress, 2019, 56(6): 061003.
