Friday, July 27, 2018

Naming Transliteration Tool (Khmer to Latin/English)

I just noticed that the NIPTICT institute has a tool that helps write Khmer names in Latin script. It is very useful for Cambodian people who need to write their names in Latin characters.


For example, for the Khmer name កុសល ចំរើន the tool provides three kinds of spelling:

  1. Character Model: e.g. KOSOL CHAMREUN
  2. Syllable Model: e.g. KOSOL CHAMRAEUN
  3. SMT Model(*): e.g. KOL CHAMROEUN

So if you are looking for a Khmer transliteration tool, you can try this R&D tool: http://rnd.niptict.edu.kh/tran/.

Remark:
(*) I could not find a source that explains the abbreviation. I think "SMT" stands for Statistical Machine Translation, which is the subject of another research project by this organization.

Thursday, April 12, 2018

Segmentation - New Zero Width Space (ZWSP) Online Tool - ondra.cf by Danh Hong

Thanks to Danh Hong, who has always been involved in Khmer Unicode solutions, from font design and OCR to, now, a segmentation tool: ondra.cf

Danh Hong's Tool for ZWSP


Online tools, and especially tools that offer an API, are needed to push more Khmer-related products forward.
I mostly use the tool from kheng.info, which I have already included in my list. Both tools are great to have in the community, and I hope content-heavy organizations will support them so that development can continue.
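For readers new to ZWSP: Khmer is written without spaces between words, so segmentation tools mark word boundaries with the invisible ZERO WIDTH SPACE character (U+200B). Here is a minimal Python sketch of what that marker does. It assumes the words are already segmented; the word list is hand-picked for illustration and is not output from ondra.cf or kheng.info, and the helper names are my own.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but allows a line break at that point


def join_with_zwsp(words):
    """Join already-segmented words with ZWSP between them."""
    return ZWSP.join(words)


def split_on_zwsp(text):
    """Recover the word list from ZWSP-marked text, skipping empty pieces."""
    return [piece for piece in text.split(ZWSP) if piece]


def strip_zwsp(text):
    """Remove every ZWSP, giving back the unsegmented text."""
    return text.replace(ZWSP, "")


# Hand-segmented sample (an assumption, not produced by either tool).
words = ["ខ្ញុំ", "ស្រឡាញ់", "ភាសា", "ខ្មែរ"]
marked = join_with_zwsp(words)

print(len(words), "words,", marked.count(ZWSP), "ZWSP markers")
print(split_on_zwsp(marked) == words)
```

The marked text looks the same on screen as the unmarked text, which is why a tool is needed to insert and check the markers.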

kheng.info

I tried both tools on the same text to compare the results; the differences are highlighted in yellow:
ondra.cf vs kheng.info

Of course, based on the highlights above, the results would be better with enough training data. Danh Hong's tool handles numeric data correctly, although it needs more training data to get some specific words right, such as country names.
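A comparison like this can also be automated. The sketch below is only an illustration under my own assumptions: it takes two segmented strings that differ only in where ZWSP was inserted and reports which boundaries both agree on and which appear in only one. The two sample strings are hand-made, not real output from ondra.cf or kheng.info, and the function names are hypothetical.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def boundaries(segmented):
    """Return the offsets (counted in the ZWSP-free text) where a ZWSP occurs."""
    offsets = set()
    pos = 0
    for ch in segmented:
        if ch == ZWSP:
            offsets.add(pos)
        else:
            pos += 1
    return offsets


def compare(a, b):
    """Compare the ZWSP placement of two segmentations of the same text."""
    if a.replace(ZWSP, "") != b.replace(ZWSP, ""):
        raise ValueError("the two outputs differ in more than ZWSP placement")
    in_a, in_b = boundaries(a), boundaries(b)
    return {
        "both": sorted(in_a & in_b),
        "only_a": sorted(in_a - in_b),
        "only_b": sorted(in_b - in_a),
    }


# Hand-made samples (assumptions): the second one misses a boundary.
out_a = "ភាសា" + ZWSP + "ខ្មែរ" + ZWSP + "ល្អ"
out_b = "ភាសា" + ZWSP + "ខ្មែរល្អ"

print(compare(out_a, out_b))
```

Offsets are counted in Unicode code points, not in visible Khmer clusters, so they are mainly useful for spotting where two outputs disagree.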

Anyway, these tools will help our community grow.

Thanks to everyone for their hard work and for sharing it with us.