Text Detection on Images using Region-based Convolutional Neural Network


  • Hamsa D. Majeed Department of Information Technology, University of Human Development, Sulaymaniyah, Iraq




Text Detection, Region-based Convolutional Neural Network, Text Images


This paper applies a new text detection algorithm that accurately locates text against complex backgrounds in natural images. The approach is based primarily on the anchor mechanism of the region-based convolutional neural network: it takes into account the distinctive features of text regions compared with the targets of other object detection tasks, and it recasts text region detection as an object detection task. Text proposals are thus detected directly on the network's convolutional feature map, and the network simultaneously predicts the text/non-text score of each proposal and its coordinates in the image. We then propose a text line construction algorithm to increase the accuracy and consistency of the text detection model. We found that our text detection operates accurately, even when detecting text in multiple languages. We also found that it meets the 2012 and 2014 International Conference on Document Analysis and Recognition thresholds of 0.86 and 0.78 F-measure, respectively, which indicates the consistency of our model. The approach was implemented in Python 3.8.3 on Windows.
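The abstract does not spell out the text line construction step, so the following is only a minimal sketch of one plausible way to group scored text proposals into lines, not the paper's algorithm. It assumes each proposal is a tuple `(x1, y1, x2, y2, score)`, and the function names and the thresholds `score_thr`, `max_gap` and `min_overlap` are hypothetical values chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def vertical_overlap(a: Box, b: Box) -> float:
    """Overlap of the two y-ranges divided by the smaller box height."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    smaller = min(a[3] - a[1], b[3] - b[1])
    return max(inter, 0.0) / smaller if smaller > 0 else 0.0


def build_text_lines(
    proposals: Sequence[Tuple[float, float, float, float, float]],
    score_thr: float = 0.7,    # hypothetical text/non-text cut-off
    max_gap: float = 20.0,     # hypothetical max horizontal gap in pixels
    min_overlap: float = 0.7,  # hypothetical min vertical overlap ratio
) -> List[Box]:
    """Greedily merge high-scoring proposals into text line boxes."""
    # Keep proposals classified as text and process them left to right.
    kept = sorted(
        (tuple(p[:4]) for p in proposals if p[4] >= score_thr),
        key=lambda b: b[0],
    )
    lines: List[Box] = []
    for box in kept:
        for i, line in enumerate(lines):
            gap = box[0] - line[2]  # distance from the line's right edge
            if gap <= max_gap and vertical_overlap(line, box) >= min_overlap:
                # Extend the line to the bounding box of line and proposal.
                lines[i] = (
                    min(line[0], box[0]),
                    min(line[1], box[1]),
                    max(line[2], box[2]),
                    max(line[3], box[3]),
                )
                break
        else:
            # No existing line is close enough: start a new one.
            lines.append(box)
    return lines
```

Called on three adjacent proposals on one row, one distant proposal and one low-scoring proposal, the sketch would return two line boxes and discard the low-scoring one. A greedy left-to-right merge is used here only because it is short and easy to follow; it makes no claim about matching the grouping rule used in the paper.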

