
# Lecture: Tagging (a.k.a. Sequence labeling)

Description

This lecture covers the concepts of lemmatization, Part-of-Speech (PoS) tagging, sequence labeling, and probabilistic PoS tagging using Hidden Markov Models (HMMs). It explains the formalization, algorithms, and parameter estimation of HMMs, along with alternative approaches and performance evaluation for PoS tagging.



This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. Please verify the information with EPFL's official sources.

In course


CS-431: Introduction to natural language processing

The objective of this course is to present the main models, formalisms and algorithms necessary for the development of applications in the field of natural language information processing.

Related concepts (38)


Part-of-speech tagging

In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST), also called grammatical tagging, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc. Once performed by hand, POS tagging is now done in the context of computational linguistics, using algorithms which associate discrete terms, as well as hidden parts of speech, with a set of descriptive tags.
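A common baseline for PoS tagging is to assign each word its most frequent tag from a training corpus. The sketch below uses a tiny hypothetical tagged corpus (real taggers train on resources such as the Penn Treebank); the words, tags, and the default-to-NOUN heuristic are illustrative assumptions, not part of the lecture.

```python
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical); note "run" is ambiguous between NOUN and VERB.
corpus = [
    ("the", "DET"), ("dog", "NOUN"), ("runs", "VERB"),
    ("the", "DET"), ("run", "NOUN"), ("dogs", "NOUN"),
    ("run", "VERB"), ("fast", "ADV"),
]

# Count how often each tag occurs for each word.
counts = defaultdict(Counter)
for word, tag_ in corpus:
    counts[word][tag_] += 1

def tag(sentence):
    # Assign each word its most frequent tag; unknown words
    # default to NOUN, a common baseline heuristic (assumption).
    return [(w, counts[w].most_common(1)[0][0] if w in counts else "NOUN")
            for w in sentence]

print(tag(["the", "dog", "runs"]))
```

This baseline ignores context entirely, which is exactly the limitation that sequence models like HMMs address.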

Markov property

In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that, conditional on its present state, its future evolution is independent of its history. It is named after the Russian mathematician Andrey Markov. The term strong Markov property is similar to the Markov property, except that the meaning of "present" is defined in terms of a random variable known as a stopping time. The term Markov assumption is used to describe a model where the Markov property is assumed to hold, such as a hidden Markov model.
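The Markov property can be made concrete with a simulation: the function below draws the next state from a distribution that depends only on the current state, never on the earlier path. The two-state weather chain and its probabilities are hypothetical.

```python
import random

# Two-state chain with hypothetical transition probabilities.
# The next state depends only on the current state (Markov property).
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(state, rng):
    # Sample the next state using only the current state's row of P.
    states, probs = zip(*P[state].items())
    return rng.choices(states, weights=probs)[0]

rng = random.Random(0)  # fixed seed for reproducibility
path = ["sunny"]
for _ in range(5):
    path.append(step(path[-1], rng))
print(path)
```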

Hidden Markov model

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process (call it X) with unobservable ("hidden") states. As part of the definition, an HMM requires that there be an observable process Y whose outcomes are "influenced" by the outcomes of X in a known way.
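An HMM is specified by start probabilities, transition probabilities between hidden states, and emission probabilities from hidden states to observations. The forward algorithm then computes the likelihood of an observation sequence by summing over all hidden paths. The two-tag model and all numbers below are hypothetical, chosen only to illustrate the computation.

```python
# Minimal HMM sketch with hypothetical parameters:
# hidden PoS tags X in {"N", "V"} emit observed words Y.
states = ["N", "V"]
start = {"N": 0.6, "V": 0.4}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"dog": 0.5, "run": 0.1, "walk": 0.4},
        "V": {"dog": 0.0, "run": 0.6, "walk": 0.4}}

def forward(obs):
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                 for s in states}
    # Total likelihood: sum over the final hidden state.
    return sum(alpha.values())

print(forward(["dog", "run"]))  # likelihood of observing "dog run"
```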

Markov information source

In mathematics, a Markov information source, or simply, a Markov source, is an information source whose underlying dynamics are given by a stationary finite Markov chain. An information source is a sequence of random variables ranging over a finite alphabet and having a stationary distribution. A Markov information source is then a (stationary) Markov chain together with a function that maps states of the chain to letters of the alphabet.
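A Markov source is thus two ingredients: a chain over states, plus a map from states to output letters. The sketch below makes that pairing explicit; the two-state chain, its probabilities, and the state-to-letter map are hypothetical.

```python
import random

# A Markov source: a finite Markov chain over states {0, 1},
# plus a map f from states to letters of the alphabet {"a", "b"}.
# All numbers are hypothetical.
trans = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}
f = {0: "a", 1: "b"}  # state -> letter

def emit_letters(n, state=0, seed=0):
    # Walk the chain for n steps, emitting f(state) at each step.
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        out.append(f[state])
        nxt, probs = zip(*trans[state].items())
        state = rng.choices(nxt, weights=probs)[0]
    return "".join(out)

print(emit_letters(8))
```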

Speech recognition

Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis.
Related lectures (2)

Introduces Hidden Markov Models, explaining the basic problems and algorithms like Forward-Backward, Viterbi, and Baum-Welch, with a focus on Expectation-Maximization.
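The Viterbi algorithm mentioned above finds the single most likely hidden state sequence for a given observation sequence, using a max in place of the forward algorithm's sum. Below is a self-contained sketch on a toy two-tag HMM; all parameters are hypothetical.

```python
# Viterbi decoding sketch: most likely hidden tag sequence for an
# observation sequence, on a toy HMM with hypothetical parameters.
states = ["N", "V"]
start = {"N": 0.6, "V": 0.4}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"dog": 0.5, "run": 0.1, "walk": 0.4},
        "V": {"dog": 0.0, "run": 0.6, "walk": 0.4}}

def viterbi(obs):
    # delta[s]: probability of the best path ending in state s;
    # paths[s]: the corresponding best state sequence.
    delta = {s: start[s] * emit[s][obs[0]] for s in states}
    paths = {s: [s] for s in states}
    for o in obs[1:]:
        new_delta, new_paths = {}, {}
        for s in states:
            # Best predecessor for state s (max instead of the forward sum).
            best = max(states, key=lambda r: delta[r] * trans[r][s])
            new_delta[s] = delta[best] * trans[best][s] * emit[s][o]
            new_paths[s] = paths[best] + [s]
        delta, paths = new_delta, new_paths
    return paths[max(states, key=lambda s: delta[s])]

print(viterbi(["dog", "run"]))
```

Unlike the per-word baseline, the decoder uses the transition probabilities to resolve ambiguous words such as "run" from their context.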

Explores Part-of-Speech tagging using probabilistic models like Hidden Markov Models and discusses the resolution of lexical ambiguities.