Isaac Scientific Publishing

Frontiers in Signal Processing

Research on Tibetan Handwritten Numerals Recognition Based on TextCaps Model with Few Training Sample

PP. 107 - 113 Pub. Date: October 1, 2020

DOI: 10.22606/fsp.2020.44005

Author(s)

  • Hongli Wei
    College of Electrical Information Engineering, Southwest Minzu University, Chengdu, Sichuan, China
  • Xiang Qiang*
    College of Electrical Information Engineering, Southwest Minzu University, Chengdu, Sichuan, China

Abstract

In recent years, machine learning has been widely used in image recognition, but conventional models such as linear classifiers, K-nearest neighbors, non-linear classifiers, and Support Vector Machines (SVM) all require a large number of training samples to achieve good results. Previous research on Tibetan handwritten numeral recognition has likewise required large training sets. Therefore, this paper studies training and recognition on a Tibetan handwritten numerals dataset with few training samples, using the TextCaps model proposed by Vinoj Jayasundara et al. With 200 labeled training samples per class, random noise is added to the instantiation parameters of the original images, and new image datasets are generated through the Capsule Network (CapsNet) and its decoder network, thereby expanding the data. Finally, the character images are classified by the new model.
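The data-expansion step described above can be sketched in a few lines: a capsule's instantiation-parameter (pose) vector is perturbed with small random noise and passed back through the decoder to synthesize new training images. The sketch below is a minimal illustration of that idea only; the function names, the 16-dimensional pose, and the `toy_decoder` stand-in (a random linear map with a sigmoid output, replacing the trained CapsNet decoder used in the paper) are all assumptions for demonstration.

```python
import numpy as np

def perturb_instantiation_params(pose, noise_scale=0.05, n_variants=4, seed=0):
    """Return n_variants noisy copies of one capsule pose vector.

    Hypothetical helper: adds uniform noise in [-noise_scale, noise_scale]
    to each instantiation parameter, as in the paper's augmentation idea.
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-noise_scale, noise_scale, size=(n_variants, pose.shape[0]))
    return pose[None, :] + noise

def toy_decoder(poses):
    """Stand-in for the trained CapsNet decoder: maps pose vectors to
    flattened 28x28 'images' via a fixed random linear layer + sigmoid.
    Illustrative only -- the real decoder is a learned network.
    """
    rng = np.random.default_rng(42)
    W = rng.standard_normal((poses.shape[-1], 28 * 28)) * 0.1
    return 1.0 / (1.0 + np.exp(-(poses @ W)))  # sigmoid keeps pixels in (0, 1)

pose = np.zeros(16)                           # pose of the winning class capsule
variants = perturb_instantiation_params(pose) # (4, 16) perturbed poses
new_images = toy_decoder(variants)            # (4, 784) synthetic samples
```

With a real trained decoder, each perturbed pose reconstructs a plausible variant of the original character, so 200 labeled samples per class can be expanded into a much larger training set.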

Keywords

machine learning, image recognition, CapsNet, TextCaps

References

[1] Y. Lee, "Handwritten digit recognition using k nearest-neighbor, radial-basis function, and back-propagation neural networks", Neural Computation 3 (1991) 440–449.

[2] V. Jayasundara, S. Jayasekara, H. Jayasekara, J. Rajasegaran, S. Seneviratne, and R. Rodrigo, "TextCaps: Handwritten Character Recognition with Very Small Datasets", in 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE (2019) 254–262.

[3] G.E. Hinton, A. Krizhevsky, and S.D. Wang, "Transforming auto-encoders", in ICANN, Berlin, Heidelberg (2011) 44–51.

[4] S. Sabour, N. Frosst, and G.E. Hinton, "Dynamic routing between capsules", in NIPS, Long Beach, CA (2017) 3856–3866.

[5] L.N. Smith, "Cyclical learning rates for training neural networks", in 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE (2017) 464–472.

[6] X. Wuji, S. Chajia, Z. Ji, G. Cairang, and H. Cairang, "Tibetan handwritten numeral recognition based on convolutional neural network", Modern Electronics Technique, Vol. 42, No. 5, 10.16652/j.issn.1004-373x, 2019.

[7] M.D. Zeiler, D. Krishnan, G.W. Taylor, and R. Fergus, "Deconvolutional networks", in CVPR, San Francisco, CA (2010) 2528–2535.

[8] X. Glorot, A. Bordes, and Y. Bengio, "Deep Sparse Rectifier Neural Networks", Journal of Machine Learning Research 15 (2011) 315–323.

[9] A. Polesel, A. Ramponi, and V.J. Mathews, "Image enhancement via adaptive unsharp masking", IEEE Transactions on Image Processing 9 (2000) 505–510.

[10] Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli, "Image quality assessment: From error visibility to structural similarity", IEEE Transactions on Image Processing 13 (2004) 600–612.