Publication

Pronunciation models and their evaluation using confidence measures

Related concepts (33)

Word

A word is a basic element of language that carries an objective or practical meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no consensus among linguists on its definition and numerous attempts to find specific criteria of the concept remain controversial. Different standards have been proposed, depending on the theoretical background and descriptive context; these do not converge on a single definition.

Markov property

In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process: given its present state, its future evolution is independent of its history. It is named after the Russian mathematician Andrey Markov. The strong Markov property is similar to the Markov property, except that the meaning of "present" is defined in terms of a random variable known as a stopping time. The term Markov assumption describes a model in which the Markov property is assumed to hold, such as a hidden Markov model.
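As a minimal sketch of the memoryless property, the following Python snippet simulates a two-state chain in which the next state is sampled from the current state alone. The state names and transition probabilities are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical two-state chain; the probabilities are illustrative only.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Sample the next state using only the current state."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def simulate(start, steps, seed=0):
    """Return a path of length steps + 1 beginning at `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        # Only path[-1] is consulted; the earlier history is never read.
        path.append(next_state(path[-1], rng))
    return path


if __name__ == "__main__":
    print(simulate("sunny", 10))
```

Because `next_state` never receives the earlier path, the simulated process satisfies the Markov assumption by construction.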

Levenshtein distance

In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.
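One common way to compute this distance is the dynamic-programming recurrence over prefixes of the two strings. The sketch below is an illustrative Python version that keeps only two rows of the table; the function name `levenshtein` is an assumption of this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```

For "kitten" and "sitting" the three edits are two substitutions (k to s, e to i) and one insertion (g).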

Linguistic prescription

Linguistic prescription, or prescriptive grammar, is the establishment of rules defining preferred usage of language. These rules may address such linguistic aspects as spelling, pronunciation, vocabulary, syntax, and semantics. Sometimes informed by linguistic purism, such normative practices often suggest that some usages are incorrect, inconsistent, illogical, lack communicative effect, or are of low aesthetic value, even in cases where such usage is more common than the prescribed usage.

Statistical inference

Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population. Inferential statistics can be contrasted with descriptive statistics, which is solely concerned with properties of the observed data and does not rest on the assumption that the data come from a larger population.

Standard language

A standard language (also standard variety, standard dialect, and standard) is a language variety that has undergone substantial codification of grammar and usage, although occasionally the term refers to the entirety of a language that includes a standardized form as one of its varieties. Typically, the language varieties that undergo substantive standardization are the dialects associated with centers of commerce and government.

Markov information source

In mathematics, a Markov information source, or simply, a Markov source, is an information source whose underlying dynamics are given by a stationary finite Markov chain. An information source is a sequence of random variables ranging over a finite alphabet, having a stationary distribution. A Markov information source is then a (stationary) Markov chain, together with a function that maps states in the Markov chain to letters in the alphabet.
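The two ingredients of this definition, a Markov chain and a function from states to letters, can be sketched in Python as follows. The states, probabilities, and letter mapping are hypothetical. For simplicity the sketch starts from a fixed state rather than drawing the start from the chain's stationary distribution, so it only approximates the stationary setting described here.

```python
import random

# Hypothetical three-state chain over the alphabet {"a", "b"}.
TRANSITIONS = {
    "s0": {"s0": 0.5, "s1": 0.5},
    "s1": {"s2": 1.0},
    "s2": {"s0": 1.0},
}

# Function mapping each state to the letter it emits.
# Two states share the letter "a", so the letters alone do not reveal the state.
EMIT = {"s0": "a", "s1": "b", "s2": "a"}


def emit_sequence(start, length, seed=0):
    """Walk the chain for `length` steps and return the emitted letters."""
    rng = random.Random(seed)
    state = start
    letters = []
    for _ in range(length):
        letters.append(EMIT[state])
        options = TRANSITIONS[state]
        state = rng.choices(list(options), weights=list(options.values()), k=1)[0]
    return "".join(letters)


if __name__ == "__main__":
    print(emit_sequence("s0", 20))
```

In this toy chain the letter "b" is emitted only in state s1, which always moves to s2, so the output never contains two consecutive "b" letters.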

Armenian alphabet

The Armenian alphabet (Հայոց գրեր, Hayoc’ grer or Հայոց այբուբեն, Hayoc’ aybuben), or more broadly the Armenian script, is an alphabetic writing system developed for Armenian and occasionally used to write other languages. It was developed around 405 AD by Mesrop Mashtots, an Armenian linguist and ecclesiastical leader. There are several inscriptions in Armenian lettering from Sinai and Nazareth that date to the beginning of the 5th century. The script originally had 36 letters; eventually, two more were adopted.

Edit distance

In computational linguistics and computer science, edit distance is a string metric, i.e. a way of quantifying how dissimilar two strings (e.g., words) are to one another, that is measured by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelled word by selecting words from a dictionary that have a low distance to the word in question.
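The spelling-correction use case can be sketched in a few lines of Python: score every dictionary word against the misspelled word and keep those within a small distance. The helper names `edit_distance` and `suggest`, the threshold, and the tiny word list are assumptions made for illustration. A production spell checker would typically use an index rather than a full scan.

```python
from functools import lru_cache


def edit_distance(a: str, b: str) -> int:
    """Insert/delete/substitute edit distance via memoized recursion."""

    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        # d(i, j) is the distance between a[:i] and b[:j].
        if i == 0:
            return j
        if j == 0:
            return i
        cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(d(i - 1, j) + 1, d(i, j - 1) + 1, d(i - 1, j - 1) + cost)

    return d(len(a), len(b))


def suggest(word, dictionary, max_distance=2):
    """Return dictionary words within `max_distance`, closest first."""
    scored = sorted((edit_distance(word, cand), cand) for cand in dictionary)
    return [cand for dist, cand in scored if dist <= max_distance]


if __name__ == "__main__":
    words = ["spelling", "spewing", "selling", "banana"]
    print(suggest("speling", words, max_distance=1))
```

Ties in distance are broken alphabetically here because the `(distance, word)` tuples are sorted as a whole.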

Bayesian inference

Bayesian inference (/ˈbeɪziən/ or /ˈbeɪʒən/) is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law.
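To illustrate updating as evidence arrives, the sketch below applies Bayes' theorem to a discrete set of hypotheses. The coin example, the hypothesis names, and the function `bayes_update` are assumptions of this illustration, not part of the summarized article.

```python
def bayes_update(prior, likelihoods):
    """Return the posterior over hypotheses after one observation.

    prior:       dict mapping hypothesis -> P(hypothesis)
    likelihoods: dict mapping hypothesis -> P(observation | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(observation)
    return {h: value / evidence for h, value in unnormalized.items()}


if __name__ == "__main__":
    # Hypothetical setup: is the coin fair, or biased towards heads?
    belief = {"fair": 0.5, "biased": 0.5}
    p_heads = {"fair": 0.5, "biased": 0.8}

    # Observe three heads in a row, updating after each one.
    for _ in range(3):
        belief = bayes_update(belief, p_heads)
        print(belief)
```

Each posterior becomes the prior for the next observation, which is the sequential use of Bayesian updating mentioned above.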