Friday, September 9, 2016

New Online Demo Release - Word Segmentation for Khmer Unicode by NIPTICT

Khmer word segmentation is still in demand and remains a hot topic in natural language processing for the Khmer language.

Several methods have been introduced so far, but online tools that people can actually use are still few (my collection here).

Why is word segmentation important?
In language processing, we need to identify clearly what the words and sentences are. In Khmer we do not put spaces between words, so a sentence runs on with very few spaces, and that is why it is hard for a machine to understand.

To segment a sentence into words nowadays, we need a big dictionary together with a method that can separate each word with a zero-width space as fast as possible.
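To make the dictionary idea concrete, here is a minimal sketch of forward maximum matching, one common dictionary-based approach. It is not NIPTICT's method, which has not been explained yet. The function name `segment`, the tiny four-word lexicon, and joining the result with a zero-width space (U+200B) are all my own assumptions for illustration.

```python
ZWSP = "\u200b"  # zero-width space, assumed here as the word separator


def segment(text, dictionary):
    """Greedy forward maximum matching over Unicode code points.

    At each position, take the longest dictionary word that starts there.
    If nothing matches, emit a single code point and move on.
    """
    max_len = max(1, max((len(w) for w in dictionary), default=1))
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one code point.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                break
        # If no candidate matched, the loop ended with j == i + 1.
        words.append(text[i:j])
        i = j
    return words


# Hypothetical toy lexicon: "I", "love", "language", "Khmer".
lexicon = {"ខ្ញុំ", "ស្រឡាញ់", "ភាសា", "ខ្មែរ"}
sentence = "ខ្ញុំ" + "ស្រឡាញ់" + "ភាសា" + "ខ្មែរ"

tokens = segment(sentence, lexicon)
separated = ZWSP.join(tokens)  # same visible text, now with break points
```

A real tool needs a much larger dictionary and a way to handle ambiguous splits and unknown words, which this greedy sketch does not attempt.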

Now the institute NIPTICT has just released the first online demo of its method.


This tool is very useful for office work or for data entry on websites.

Another online tool that I usually use is the one from Kheng.info, so now we have at least two online tools available.

I will post an update later with an explanation of the method that NIPTICT uses.
Anyway, to join the research, you can submit your work to the Khmer NLP conference from now until mid-October.

Tuesday, August 23, 2016

Paper: Experimental Comparison of the Performance of SVMs

The research paper on:

Experimental Comparison of the Performance of SVMs with Different Kernel Functions for Recognizing Arabic Characters


Said Ghoniemy, Sayed Fadel, M. Asif

Abstract


A considerable progress in the recognition of Latin and Chinese characters has been achieved. By contrast, Arabic Optical Character Recognition is still lagging. This is because Arabic language is a cursive language, written from right to left, and each character has different forms according to its position in the word. Support vector machines using kernel classifiers represent a typical approach for character recognition. Choosing the most appropriate kernel highly depends on the problem at hand – and fine tuning its parameters can easily become a tedious and cumbersome task. The present study is devoted to an experimental comparison of the performance of SVM machines with different kernel functions for recognizing Arabic Characters. Two groups of kernel functions were used throughout the study, each group contains 7 kernel functions. The obtained results show that, in the radial basis group, Laplacian kernel gives the best results. In the special functions group, the T-Student approach gives the best results. However, combining both kernels did not yield better performance.
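For readers who have not met the kernels named in the abstract, here is a small sketch of their common textbook forms next to the familiar Gaussian kernel. The abstract does not give the paper's exact definitions or parameter values, so the formulas, the function names, and the defaults `sigma=1.0` and `degree=2` are my assumptions, not the authors' settings.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def gaussian_kernel(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 * sigma^2)), the usual RBF kernel
    return math.exp(-euclidean(x, y) ** 2 / (2 * sigma ** 2))


def laplacian_kernel(x, y, sigma=1.0):
    # exp(-||x - y|| / sigma): uses the distance itself, not its square
    return math.exp(-euclidean(x, y) / sigma)


def t_student_kernel(x, y, degree=2):
    # 1 / (1 + ||x - y||^degree), the generalized T-Student form
    return 1.0 / (1.0 + euclidean(x, y) ** degree)


a, b = [0.0, 0.0], [3.0, 4.0]  # distance between a and b is 5
lap = laplacian_kernel(a, b, sigma=5.0)
tst = t_student_kernel(a, b, degree=2)
```

Each function returns 1 for identical vectors and decays toward 0 as the vectors move apart; they differ only in how fast the similarity drops. Some libraries define the Laplacian kernel with the L1 distance instead, so check the definition before comparing results.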


[..]
Sok, P. and Taing, N., "Support Vector Machine (SVM) Based Classifier For Khmer Printed Character-set Recognition", Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA) (pp. 1-9). IEEE. December 2014.
[..] 

Thanks for citing my research on an SVM-related method.

Sunday, August 14, 2016

Khmer NLP Conference 2016


The upcoming event, the Khmer Natural Language Processing Conference (Khmer NLP Conference 2016), calls for papers related to natural language processing, especially those that solve problems of our Khmer language.

As presented in the poster banner, there are many topics where students, professionals and the private sector can participate to help solve our language issues together, promote research, and encourage more people to join in solving the problem.

This year, besides research papers, you can also present your research or products as a poster to exhibit during the conference. Please check the official website for details: http://khmernlp.org