If you are lucky enough 🙂 to have not one but two alphabets in daily use, transliteration will be a regular programming task: converting text from one script (alphabet) to another.
In Serbia we use both the Latin and the Cyrillic alphabet (and it is not the same Cyrillic as the Russian one), so converting from one to the other and back is a common task.
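To give a feel for what such a conversion involves, here is a minimal hand-written sketch of Serbian Latin to Cyrillic conversion. The class and method names are hypothetical (they are not part of any library), and the approach is deliberately naive: it maps the digraphs lj, nj and dž first, then single letters, and leaves everything else untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only: a naive Serbian Latin -> Cyrillic converter.
public static class NaiveSerbianTransliterator
{
    // Digraphs must be checked before single letters ("nj" -> "њ", not "нј").
    private static readonly Dictionary<string, string> Digraphs =
        new Dictionary<string, string>
        {
            { "lj", "љ" }, { "nj", "њ" }, { "dž", "џ" }
        };

    // Single letters, position by position (27 letters in each string).
    private const string Latin    = "abvgdđežzijklmnoprstćufhcčš";
    private const string Cyrillic = "абвгдђежзијклмнопрстћуфхцчш";

    public static string ToCyrillic(string input)
    {
        StringBuilder output = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            // Try a two-letter digraph first.
            if (i + 1 < input.Length)
            {
                string pair = input.Substring(i, 2);
                string mapped;
                if (Digraphs.TryGetValue(pair.ToLowerInvariant(), out mapped))
                {
                    // The case of the first letter decides: "Nj" and "NJ" both give "Њ".
                    output.Append(char.IsUpper(pair[0]) ? mapped.ToUpperInvariant() : mapped);
                    i += 2;
                    continue;
                }
            }

            char c = input[i];
            int index = Latin.IndexOf(char.ToLowerInvariant(c));
            if (index < 0)
            {
                // Digits, punctuation and letters such as q, w, x, y stay as they are.
                output.Append(c);
            }
            else
            {
                char mappedChar = Cyrillic[index];
                output.Append(char.IsUpper(c) ? char.ToUpperInvariant(mappedChar) : mappedChar);
            }
            i++;
        }
        return output.ToString();
    }
}
```

With this sketch, NaiveSerbianTransliterator.ToCyrillic("Vesic.Org") returns "Весиц.Орг". Note the limitation: it treats every "nj" and "dž" as a digraph, so words like "injekcija" or "nadživeti", where the two letters are pronounced separately, come out wrong. A real converter needs an exception list or smarter rules.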
This is not a complicated requirement, and you can easily write the necessary procedures yourself; however, there is a better way:
Microsoft Transliteration Utility (MTU) is a little-known but very useful tool for exactly that purpose. It can transliterate text typed into a text box, or convert one file to another.
There is a set of predefined transliterations:
- Serbian Cyrillic to Latin / Serbian Latin to Cyrillic
- Bosnian Cyrillic to Latin / Bosnian Latin to Cyrillic
- Hangul to Romanization
- Inuktitut to Romanization / Romanization to Inuktitut
- Malayalam to Romanization / Romanization to Malayalam
You are not limited to the above set; you can create your own transliterations using the Module Development Console.
By writing a simple text file, you can use the full power of MTU's parsing engine: definitions of input and output characters and transliteration rules, including definitions of new states for the transliteration state machine.
That is not all: you can even use MTU programmatically (but please check the EULA before any commercial use):
- Add a reference to MSTranslitTools.DLL (it can be found in %programfiles%\Microsoft Transliteration Utility)
- Add using System.NaturalLanguage.Tools;
- The current translation files (.tms) can be found in %CommonProgramFiles%\TransliterationModules\Microsoft
- Here is a simple code fragment to demonstrate:
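// Load the transliteration rules from a .tms module file
TransliteratorSpecification specification =
    TransliteratorSpecification.FromSpecificationFile("Serbian Latin to Cyrillic.tms");
// Build a transliterator from those rules
Transliterator transliterator = Transliterator.FromSpecification(specification);
// Convert the Latin text to Serbian Cyrillic and print it
string rezultat = transliterator.Transliterate("Vesic.Org");
Console.WriteLine(rezultat);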
Good tool, saved me some time, and nice blog 🙂
Do you happen to have, or know where to get, Russian files? For a start I would need at least Russian to Romanization.
Unfortunately, no, but it is rather easy to create them in a simple text editor (one that supports Unicode characters).
Hey Dejan…
Nice Blog you have here
And thanks for transliteration, it really helped me…
I have one more question about it.
I could ask it in Serbian, but not everyone would understand 🙂
Anyway, do you know if there is a way to import, for example, 'Serbian Cyrillic to Latin.tms' and 'Serbian Latin to Cyrillic.tms' into Visual Studio, so that the compiled program does not need these files in the same folder as the .exe file?
Right now the line TransliteratorSpecification.FromSpecificationFile("Serbian Latin to Cyrillic.tms") won't work unless this file is next to the .exe, in the same folder.
I want this file to be part of the application itself, so that I don't need anything except the .exe file.
I can make a folder in VS and add these files there, but how do I then connect them with this code?
I hope you understand?
Yes, I understand, and this should not be a problem: if you use Red Gate's .NET Reflector, you will see that there is also this method:
TransliteratorSpecification.FromSpecificationFile(TextReader)
So you can store the .tms content in a resource file, wrap it in a memory stream, create a TextReader on top of it, and then process the data.
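A minimal sketch of that idea, under a few assumptions: the .tms file has been added to the project with Build Action set to Embedded Resource, the TextReader overload behaves as Reflector suggests, and the specification is read completely before the reader is closed. The class name and the resource name in the usage note are hypothetical; the actual resource name depends on your project's default namespace and folder layout.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.NaturalLanguage.Tools; // from MSTranslitTools.DLL

// Hypothetical helper: loads a transliterator from a .tms file embedded in the assembly.
public static class EmbeddedTransliterator
{
    public static Transliterator Load(string resourceName)
    {
        Assembly assembly = Assembly.GetExecutingAssembly();

        // GetManifestResourceStream returns null when the name does not match any resource.
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
            {
                throw new FileNotFoundException("Embedded resource not found.", resourceName);
            }

            // StreamReader is a TextReader; by default it reads UTF-8 and honors a byte order mark.
            using (TextReader reader = new StreamReader(stream))
            {
                // Assumption: this overload exists, as seen in Reflector, and parses the whole stream here.
                TransliteratorSpecification specification =
                    TransliteratorSpecification.FromSpecificationFile(reader);
                return Transliterator.FromSpecification(specification);
            }
        }
    }
}
```

Usage would then look something like EmbeddedTransliterator.Load("MyApp.Serbian Latin to Cyrillic.tms").Transliterate("Vesic.Org"), where "MyApp" stands in for your own default namespace. If the resource name is not what you expect, Assembly.GetManifestResourceNames() lists the names that were actually embedded.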
Hi Dejan, your blog is a revelation (the biking part too:)!
I am looking for a Cyrillic -> Latin transliteration tool ready to be used on Vista 64. Can someone help, please? An adaptation for the Serbian keyboard would also be of help. Thanks
Thanks 🙂
What exactly is the problem with using MTU on Vista 64?
Also, what kind of adaptation? If you are talking about input keyboards, try:
http://www.vesic.org/blog/programi/nasa-slova-na-us-tastaturi-resenje-2005-e/#vista
Well, nothing, I guess, but the installation requirements do not specifically mention Vista 64 (only XP and Windows 2003), and sometimes these things don't work on Vista, and I don't know how to fix them (nor do I have the time :D).
I guess I’d like someone to tell me it will work before I install it. And – don’t laugh – I don’t even know if I have Microsoft .NET Framework v1.1 installed (also a requirement).
I’ll try your input keyboard link, thanks.
You do not have .NET 1.1 installed; it is a separate download and installation (from the Microsoft site).
After that, the installation of MTU is straightforward.
Thanks again
Greetings, Dejan!
I have a question. Is it possible to use Cyrillic in VB6 (text boxes, labels, etc.)? If so, how? I need it because I am working on "Church books for the SPC in Pula". Greetings from sunny Pula!
VB6 does not cope well with Unicode out of the box, but it can be done with the appropriate controls.
At the company where I work we use Unitoolbox for that, and it works quite nicely in a large number of places (admittedly it has its limitations too, but considering VB's age 🙂 it is not bad).
OK, thanks a lot.
Hello Dejan!
Me again. When I start Transliteration Utility I get the error 'Application has generated an exception that could not be handled.' Regards.
Make sure you have .NET Framework 1.1 installed, with all patches.
After that, reinstall the application.
🙂 Thank you, Dejan!!!
I installed SP1 for .NET 1.1 and everything is OK!! Thanks!! Regards
Dejan, how do I get translit to work in MS Office 2010 as well?
I somehow managed it in MS Office 2007, but in 2010 it won't work!! Or is it maybe down to Windows 7 (it does not have .NET 1.1, but .NET 3.5)?
It is surely down to Windows 7: try installing .NET 1.1 and then reinstalling MTU.
I have a small problem with this add-in. I installed it in the normal way in MS Office 2003, and the first time I open Word a transliteration icon appears, which I drag onto the toolbar. Later, when I come back to a Word document, it is no longer on the toolbar. Does anyone have a solution? Meanwhile, in PowerPoint it works like a charm 🙂
I am not sure that is the best use for this application.
For Word I recommend Yu Cir Lat '08, a very useful set of macros.
Even that does not work for me in Word. I have tried all sorts of things, but to no avail.
I was about to throw the computer out the window; you saved its "life". Thank you very much 🙂
Hello Dejan,
how can I include the .tms file in a ClickOnce deployment? I know it should be done through the application and deployment manifest files, but I do not know how, or whether it is possible at all.
Hello Dejan,
I managed to find a solution to my problem. The file should be included in the application, with Build Action -> Content and Copy to Output Directory -> Copy if newer. Regards!
There, sometimes you just need to wait a little with the answer 🙂
🙂 Well, that too is sometimes constructive (waiting with the answer) for the one who asks. Certain people probably inspire us. 🙂 Regards
Thanks!!!
Is there a link to documentation on how MTU can be used from a .NET application?
With what you have listed here I managed to get the job done (thanks!), but I would like to learn a bit more about it, in case I ever need it again.