Burmese is a Sino-Tibetan language spoken in Myanmar (also known as Burma), where it is an official language, the lingua franca and most widely spoken language, and the native language of the Burmans, the country's principal ethnic group. Burmese is also spoken by the indigenous tribes in Chittagong Hill Tracts (Rangamati, Bandarban, Khagrachari, Cox's Bazar) in Bangladesh, and in Tripura state in Northeast India. Although the Constitution of Myanmar officially recognizes the English name of the language as the Myanmar language, most English speakers continue to refer to the language as Burmese, after Burma, the country's former and currently co-official name. In 2007, it was spoken as a first language by 33 million people, primarily the Burman people and related ethnic groups, and as a second language by 10 million, particularly ethnic minorities in Myanmar and neighboring countries. In 2022, the Burmese-speaking population was 38.8 million.
Burmese is the native language of the Bamar people and their related sub-ethnic groups, as well as of some ethnic minorities in Myanmar, such as the Mon.
Burmese is a tonal, pitch-register (as well as social-register), and syllable-timed language, largely monosyllabic and agglutinative, with a subject–object–verb word order. It is a member of the Lolo-Burmese grouping of the Sino-Tibetan language family. The Burmese alphabet is ultimately descended from a Brahmic script, either the Kadamba or the Pallava alphabet.
In natural language processing, research on Burmese over a period of more than 25 years, from 1990 to 2016, has produced notable, annotated work on word identification, segmentation, disambiguation, collation, semantic parsing, and tokenization, followed by part-of-speech tagging, machine translation systems, text input, text recognition, and text display methods.