Friday, September 9, 2016

New Online Demo Release - Word Segmentation for Khmer Unicode by NIPTICT

Khmer word segmentation is still in demand and remains a hot topic in natural language processing for the Khmer language.

Several methods have been introduced so far, but online tools that people can actually use are still few (my collection here).

Why is word segmentation important?
In language processing, we need to identify clearly where the words and sentences are. In Khmer we do not put spaces between words, so a sentence runs on with very few spaces, and that is why it is hard for a machine to understand it.

To segment a sentence into words today, we need a big dictionary together with a method that can split the words with a zero-width space as fast as possible.
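To make the dictionary idea concrete, here is a minimal sketch of greedy longest-match segmentation (often called forward maximum matching). This is only an illustration and not the method NIPTICT uses, which I have not seen described yet. The four-word dictionary, the sample sentence, and the names `segment`, `toy_words` and `ZWSP` are all my own assumptions for the example.

```python
ZWSP = "\u200b"  # zero-width space (U+200B), used here as the word separator

# Toy dictionary for illustration only; a real one needs many thousands of words.
toy_words = ["ខ្ញុំ", "ស្រឡាញ់", "ភាសា", "ខ្មែរ"]  # I, love, language, Khmer


def segment(text, dictionary):
    """Split text into words by always taking the longest dictionary match."""
    dictionary = set(dictionary)
    max_len = max((len(w) for w in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until we hit the dictionary.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                break
        else:
            # Nothing matched: emit one code point on its own.
            # (A real tool should keep Khmer character clusters together.)
            j = i + 1
        words.append(text[i:j])
        i = j
    return words


sample = "".join(toy_words)  # the sentence written with no spaces at all
segmented = ZWSP.join(segment(sample, toy_words))
```

The result `segmented` looks the same as `sample` on screen, but it now carries an invisible zero-width space at every word boundary. Longest-match is simple and fast, but it makes mistakes when a word is missing from the dictionary or when a longer word wrongly swallows the start of the next one, which is one reason better methods are still being researched.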

Now NIPTICT has just released the first online demo of its method.


This tool is very useful for office work and for data entry on websites.

Another online tool that I usually use is the one at Kheng.info, so now we have at least two online tools available.

As for an explanation of the method that NIPTICT uses, I will post an update when I find one.
Anyway, if you want to join the research, you can submit your work to the Khmer NLP conference from now until mid-October.