Statistical inference
Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population. Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is solely concerned with properties of the observed data, and it does not rest on the assumption that the data come from a larger population.
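The contrast above can be made concrete with a small sketch: the sample mean is a descriptive statistic, while a hypothesis test uses it to infer something about the larger population. The data and the null value below are hypothetical, and the p-value uses a simple normal approximation rather than an exact t-test.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of measurements; we test H0: population mean = 50.
sample = [52.1, 49.8, 53.4, 51.0, 50.7, 52.9, 48.5, 51.6, 50.2, 52.3]
mu0 = 50.0

m = mean(sample)                              # descriptive: a property of the data
se = stdev(sample) / math.sqrt(len(sample))   # standard error of the mean
z = (m - mu0) / se                            # inferential: a test statistic

# Two-sided p-value under a standard-normal approximation.
phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
p_value = 2 * (1 - phi(abs(z)))
```

With this sample the p-value falls below 0.05, so at that conventional level the data would be taken as evidence against the null hypothesis; for samples this small a t-distribution would normally replace the normal approximation.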
Speech recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis.
General American English
General American English, known in linguistics simply as General American (abbreviated GA or GenAm), is the umbrella accent of American English spoken by a majority of Americans, encompassing a continuum rather than a single unified accent. In the United States it is often perceived as lacking any distinctly regional, ethnic, or socioeconomic characteristics, though highly educated Americans, or those from the North Midland, Western New England, and Western regions of the country, are the most likely to be perceived as using General American speech.
Mixture model
In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. Formally a mixture model corresponds to the mixture distribution that represents the probability distribution of observations in the overall population.
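A minimal sketch of the idea, assuming a hypothetical two-component Gaussian mixture: the overall density is a weighted sum of component densities, and Bayes' rule gives the posterior probability ("responsibility") that a given observation came from each unobserved subpopulation. The weights and parameters below are illustrative only.

```python
import math

def normal_pdf(x, mu, sigma):
    # Density of a univariate normal distribution.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical components as (weight, mean, std); weights must sum to 1.
components = [(0.6, 0.0, 1.0), (0.4, 5.0, 2.0)]

def mixture_pdf(x):
    # Mixture density: weighted sum over subpopulations.
    return sum(w * normal_pdf(x, mu, s) for w, mu, s in components)

def responsibility(x, k):
    # Posterior probability that observation x was drawn from component k.
    w, mu, s = components[k]
    return w * normal_pdf(x, mu, s) / mixture_pdf(x)
```

The responsibilities across components always sum to one for any observation, which is exactly how algorithms such as EM assign observations softly to subpopulations without ever observing the labels.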
Armenian alphabet
The Armenian alphabet (Հայոց գրեր, Hayoc’ grer or Հայոց այբուբեն, Hayoc’ aybuben), or more broadly the Armenian script, is an alphabetic writing system developed for Armenian and occasionally used to write other languages. It was developed around 405 AD by Mesrop Mashtots, an Armenian linguist and ecclesiastical leader. There are several inscriptions in Armenian lettering from Sinai and Nazareth that date to the beginning of the 5th century. The script originally had 36 letters; eventually, two more were adopted.
Standard English
In an English-speaking country, Standard English (SE) is the variety of English that has undergone substantial regularisation and is associated with formal schooling, language assessment, and official print publications, such as public service announcements and newspapers of record. All linguistic features are subject to the effects of standardisation, including morphology, phonology, syntax, lexicon, register, discourse markers, and pragmatics, as well as written features such as spelling conventions, punctuation, capitalisation and abbreviation practices.
Levenshtein distance
In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.
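The minimum edit count described above is usually computed with dynamic programming; a compact sketch using a rolling row (so memory is proportional to one string rather than both):

```python
def levenshtein(a: str, b: str) -> int:
    # prev[j] holds the edit distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` is 3: substitute "s" for "k", substitute "i" for "e", and insert "g".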
Viterbi algorithm
The Viterbi algorithm is a dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models (HMM). The algorithm has found universal application in decoding the convolutional codes used in both CDMA and GSM digital cellular, dial-up modems, satellite, deep-space communications, and 802.11 wireless LANs.
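In the HMM setting, the dynamic program keeps, for each time step and state, the probability of the best path ending there plus a backpointer; the Viterbi path is recovered by backtracking. A sketch on the classic toy healthy/fever HMM (all probabilities below are the illustrative textbook values, not real data):

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    # V[t][s] = (probability of the best path ending in state s at time t, predecessor).
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            best = max(states, key=lambda r: V[t - 1][r][0] * trans_p[r][s])
            V[t][s] = (V[t - 1][best][0] * trans_p[best][s] * emit_p[s][obs[t]], best)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path)), V[-1][last][0]

# Toy HMM: hidden health states, observed symptoms.
states = ('Healthy', 'Fever')
obs = ('normal', 'cold', 'dizzy')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
           'Fever':   {'Healthy': 0.4, 'Fever': 0.6}}
emit_p = {'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
          'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}}

path, prob = viterbi(obs, states, start_p, trans_p, emit_p)
```

Here the Viterbi path is Healthy, Healthy, Fever with probability 0.01512; production decoders work in log space to avoid underflow on long sequences.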
Deep learning
Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. The adjective "deep" in deep learning refers to the use of multiple layers in the network. Methods used can be supervised, semi-supervised or unsupervised.
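What "multiple layers" means mechanically can be sketched in a few lines: each layer applies a linear map followed by a nonlinearity, and depth comes from stacking such layers. The weights below are hand-picked for illustration, not trained.

```python
def dense(x, W, b, act):
    # One fully connected layer: act(W @ x + b), with W given row-wise.
    return [act(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

relu = lambda z: max(0.0, z)
identity = lambda z: z

# Hypothetical weights for a 2-input, 3-hidden-unit, 1-output network.
W1 = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -1.0, 0.5]]
b2 = [0.2]

def forward(x):
    h = dense(x, W1, b1, relu)         # hidden layer; "deep" = stacking more of these
    return dense(h, W2, b2, identity)  # linear output layer
```

Real frameworks express the same computation as batched matrix multiplications and learn the weights by gradient descent; the layered structure is what this sketch shows.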
Confidence interval
In frequentist statistics, a confidence interval (CI) is a range of estimates for an unknown parameter. A confidence interval is computed at a designated confidence level; the 95% confidence level is most common, but other levels, such as 90% or 99%, are sometimes used. The confidence level, degree of confidence or confidence coefficient represents the long-run proportion of CIs (at the given confidence level) that theoretically contain the true value of the parameter; this is tantamount to the nominal coverage probability.
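A minimal sketch of computing a 95% CI for a population mean, assuming a hypothetical sample and the normal-approximation interval mean ± 1.96 · SE:

```python
import math
from statistics import mean, stdev

# Hypothetical sample; 1.96 is the standard-normal quantile for 95% coverage.
sample = [4.9, 5.1, 5.0, 4.7, 5.3, 5.2, 4.8, 5.0, 5.1, 4.9]
z95 = 1.96

m = mean(sample)
se = stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
ci = (m - z95 * se, m + z95 * se)
```

For a sample this small, a Student's t quantile (about 2.26 for 9 degrees of freedom) would normally replace 1.96, widening the interval; the long-run coverage claim in the paragraph above refers to repeating this construction over many samples.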