Sunday, October 17, 2021

A compact deep learning model for Khmer handwritten text recognition

Bayram Annanurov, Norliza Mohd Noor 

Department of Computer Science, Paragon International University, Cambodia 

Department of Engineering, Razak Faculty of Technology and Informatics, Universiti Teknologi Malaysia, Malaysia

Abstract (of the paper)

The motivation of this study is to develop a compact offline recognition model for Khmer handwritten text that can be applied successfully under limited access to high-performance computational hardware. Such a task aims to ease the ad-hoc digitization of vast handwritten archives in many spheres. Data collected for previous experiments were used in this work. The one-against-all classification was completed with state-of-the-art techniques. A compact deep learning model (2+1CNN), with two convolutional layers and one fully connected layer, was proposed. The recognition rate came out to be within 93-98%. The compact model performed on par with the state-of-the-art models. It was discovered that the computational capacity requirements usually associated with deep learning can be alleviated, therefore allowing applications under limited computational power.
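To get a feel for why a "2+1" layout (two convolutional layers plus one fully connected layer) stays small, here is a rough parameter count in plain Python. This is only a sketch: the abstract does not give the input size, channel widths, kernel size, pooling, or number of classes, so every default below is a hypothetical value chosen for illustration, not the paper's configuration.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial size after a square convolution (standard formula)."""
    return (size + 2 * pad - kernel) // stride + 1


def count_params(input_hw=32, in_ch=1, conv_channels=(16, 32),
                 kernel=3, pool=2, n_classes=10):
    """Count trainable parameters of a hypothetical 2+1 CNN.

    All defaults are assumptions for illustration; the abstract only
    states "two convolutional layers and one fully connected layer".
    Returns (total_parameters, flattened_feature_size).
    """
    total = 0
    size, ch = input_hw, in_ch
    for out_ch in conv_channels:
        # weights (kernel * kernel * in_channels) plus one bias per filter
        total += (kernel * kernel * ch + 1) * out_ch
        # unpadded convolution followed by non-overlapping max pooling
        size = conv_out(size, kernel) // pool
        ch = out_ch
    flat = size * size * ch
    # single fully connected layer: weights plus one bias per class
    total += (flat + 1) * n_classes
    return total, flat


if __name__ == "__main__":
    total, flat = count_params()
    print(f"flattened features: {flat}, trainable parameters: {total}")
```

With these assumed sizes the whole network has on the order of tens of thousands of parameters, which is the kind of footprint that can be trained and run without a high-end GPU.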

Friday, February 19, 2021

Optical character recognition system for Baybayin scripts using support vector machine

A new publication that applies the SVM method to OCR: "Optical character recognition system for Baybayin scripts using support vector machine" - https://peerj.com/articles/cs-360/


Thanks for the citation; it makes it clearer that the method can work in other cases too.



This part delighted me and reminded me of that work.




Abstract (of the paper)

 In 2018, the Philippine Congress signed House Bill 1022 declaring the Baybayin script as the Philippines’ national writing system. In this regard, it is highly probable that the Baybayin and Latin scripts would appear in a single document. In this work, we propose a system that discriminates the characters of both scripts. The proposed system considers the normalization of an individual character to identify if it belongs to Baybayin or Latin script and further classify them as to what unit they represent. This gives us four classification problems, namely: (1) Baybayin and Latin script recognition, (2) Baybayin character classification, (3) Latin character classification, and (4) Baybayin diacritical marks classification. To the best of our knowledge, this is the first study that makes use of Support Vector Machine (SVM) for Baybayin script recognition. This work also provides a new dataset for Baybayin, its diacritics, and Latin characters. Classification problems (1) and (4) use binary SVM while (2) and (3) apply the multiclass SVM classification. On average, our numerical experiments yield satisfactory results: (1) has 98.5% accuracy, 98.5% precision, 98.49% recall, and 98.5% F1 Score; (2) has 96.51% accuracy, 95.62% precision, 95.61% recall, and 95.62% F1 Score; (3) has 95.8% accuracy, 95.85% precision, 95.8% recall, and 95.83% F1 Score; and (4) has 100% accuracy, 100% precision, 100% recall, and 100% F1 Score.
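For readers who want to see how the four reported numbers relate to each other, here is a small plain-Python sketch that computes accuracy, precision, recall, and F1 from true and predicted labels. The abstract does not say how the per-class scores were averaged, so the macro-averaging used here is an assumption, and the function name `macro_metrics` and the toy labels are made up for illustration.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption;
    the paper's abstract does not state its averaging scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels standing in for a binary script-recognition problem.
    y_true = ["baybayin", "baybayin", "latin", "latin"]
    y_pred = ["baybayin", "latin", "latin", "latin"]
    for name, value in macro_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

The same function works for the multiclass problems as well, since it simply loops over whatever labels appear in the data.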