Publication

HMM2- Extraction of Formant Features and their Use for Robust ASR

Related concepts (30)

Named-entity recognition

Named-entity recognition (NER) (also known as (named) entity identification, entity chunking, and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc. Most research on NER/NEE systems has been structured as taking an unannotated block of text, such as this one: Jim bought 300 shares of Acme Corp.

Origin of language

The origin of language (spoken and signed, as well as language-related technological systems such as writing), its relationship with human evolution, and its consequences have been subjects of study for centuries. Scholars wishing to study the origins of language must draw inferences from evidence such as the fossil record, archaeological evidence, contemporary language diversity, studies of language acquisition, and comparisons between human language and systems of communication existing among animals (particularly other primates).

Facial recognition system

A facial recognition system is a technology potentially capable of matching a human face from a digital image or a video frame against a database of faces. Such a system is typically employed to authenticate users through ID verification services, and works by pinpointing and measuring facial features from a given image. Development of similar systems began in the 1960s as a form of computer application. Since then, facial recognition systems have found wider use on smartphones and in other forms of technology, such as robotics.

Vowel

A vowel is a syllabic speech sound pronounced without any stricture in the vocal tract. Vowels are one of the two principal classes of speech sounds, the other being the consonant. Vowels vary in quality, in loudness and also in quantity (length). They are usually voiced and are closely involved in prosodic variation such as tone, intonation and stress. The word vowel comes from the Latin word vocalis, meaning "vocal" (i.e. relating to the voice).

Language processing in the brain

In psycholinguistics, language processing refers to the way humans use words to communicate ideas and feelings, and how such communications are processed and understood. Language processing is considered to be a uniquely human ability that is not produced with the same grammatical understanding or systematicity even in humans' closest primate relatives. Throughout the 20th century, the dominant model for language processing in the brain was the Geschwind-Lichtheim-Wernicke model, which is based primarily on the analysis of brain-damaged patients.

Voiceless glottal fricative

The voiceless glottal fricative, sometimes called voiceless glottal transition or the aspirate, is a type of sound used in some spoken languages that patterns like a fricative or approximant consonant phonologically, but often lacks the usual phonetic characteristics of a consonant. The symbol in the International Phonetic Alphabet that represents this sound is h, and the equivalent X-SAMPA symbol is h.

Acoustic phonetics

Acoustic phonetics is a subfield of phonetics, which deals with acoustic aspects of speech sounds. Acoustic phonetics investigates time domain features such as the mean squared amplitude of a waveform, its duration, its fundamental frequency, or frequency domain features such as the frequency spectrum, or even combined spectrotemporal features and the relationship of these properties to other branches of phonetics (e.g. articulatory or auditory phonetics), and to abstract linguistic concepts such as phonemes, phrases, or utterances.
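
As a rough illustration of two of the time domain features named above, the following sketch computes the mean squared amplitude and the duration of a synthetic tone. The sample rate, the tone parameters, and the helper names are assumptions made for this example and are not taken from the source.

```python
import math

def mean_squared_amplitude(signal):
    """Average of the squared sample values of a waveform."""
    return sum(s * s for s in signal) / len(signal)

def duration_seconds(signal, sample_rate):
    """Length of the waveform in seconds, given its sample rate in Hz."""
    return len(signal) / sample_rate

# Hypothetical test signal: a 100 Hz sine with amplitude 0.5, one second long.
SAMPLE_RATE = 8000  # Hz, assumed for this example
F0 = 100.0          # Hz, assumed fundamental frequency of the tone
signal = [0.5 * math.sin(2 * math.pi * F0 * n / SAMPLE_RATE)
          for n in range(SAMPLE_RATE)]

msa = mean_squared_amplitude(signal)
dur = duration_seconds(signal, SAMPLE_RATE)
```

For a sine of amplitude A sampled over whole periods, the mean squared amplitude is A²/2, which gives a quick sanity check on the helper.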

Compound probability distribution

In probability and statistics, a compound probability distribution (also known as a mixture distribution or contagious distribution) is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution, with (some of) the parameters of that distribution themselves being random variables. If the parameter is a scale parameter, the resulting mixture is also called a scale mixture.
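
The two-stage structure described above can be sketched by sampling: first draw the parameter, then draw the variable given that parameter. The example below is a minimal sketch of a scale mixture, assuming a zero-mean Gaussian whose standard deviation is picked at random from two values. The specific distributions and the function name are illustrative assumptions, not taken from the source.

```python
import random

def sample_compound(rng, n):
    """Draw n samples from a two-stage (compound) model."""
    samples = []
    for _ in range(n):
        # Stage 1: the scale parameter is itself a random variable.
        sigma = rng.choice([1.0, 3.0])
        # Stage 2: draw from a Gaussian with that scale.
        samples.append(rng.gauss(0.0, sigma))
    return samples

rng = random.Random(0)
xs = sample_compound(rng, 20000)

# The second moment of the mixture is the average of sigma^2: (1 + 9) / 2 = 5.
second_moment = sum(x * x for x in xs) / len(xs)
```

The resulting samples follow neither of the two Gaussians alone: the marginal distribution has heavier tails than a single Gaussian with the same variance.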

Vector quantization

Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.
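
The grouping and centroid steps can be sketched as follows. This is a minimal illustration using a Lloyd-style (k-means-like) refinement, with made-up 2-D points and a hand-picked initial codebook. The function names are hypothetical and the procedure is one common way to build a codebook, not necessarily the one used in the publication.

```python
import math

def nearest(codebook, v):
    """Index of the codebook vector closest to v (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], v))

def train_codebook(points, codebook, iterations=10):
    """Alternate between assigning points to their nearest centroid
    and recomputing each centroid as the mean of its group."""
    codebook = [tuple(c) for c in codebook]
    for _ in range(iterations):
        groups = [[] for _ in codebook]
        for p in points:
            groups[nearest(codebook, p)].append(p)
        # Keep the old centroid if a group ends up empty.
        codebook = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c
            for g, c in zip(groups, codebook)
        ]
    return codebook

# Illustrative data: two loose clusters of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (9.0, 9.0), (10.0, 9.0), (9.0, 10.0)]
codebook = train_codebook(points, codebook=[(0.0, 0.0), (10.0, 10.0)])

# Quantize: each point is replaced by the index of its nearest centroid.
codes = [nearest(codebook, p) for p in points]
```

Storing `codes` plus the small codebook instead of the raw vectors is what makes the technique useful for data compression.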

Sumerian language

Sumerian (Cuneiform: "native tongue") is the language of ancient Sumer. It is one of the oldest attested languages, dating back to at least 2900 BC. It is accepted to be a local language isolate and to have been spoken in ancient Mesopotamia, in the area that is modern-day Iraq. Akkadian, a Semitic language, gradually replaced Sumerian as a spoken language in the area around 2000 BC (the exact date is debated), but Sumerian continued to be used as a sacred, ceremonial, literary and scientific language in Akkadian-speaking Mesopotamian states such as Assyria and Babylonia until the 1st century AD.