In natural language processing, semantic role labeling (also called shallow semantic parsing or slot-filling) is the process that assigns labels to words or phrases in a sentence that indicates their semantic role in the sentence, such as that of an agent, goal, or result.
It serves to find the meaning of the sentence. To do this, it detects the arguments associated with the predicate or verb of a sentence and how they are classified into their specific roles. A common example is the sentence "Mary sold the book to John." The agent is "Mary," the predicate is "sold" (or rather, "to sell,") the theme is "the book," and the recipient is "John." Another example is how "the book belongs to me" would need two labels such as "possessed" and "possessor" and "the book was sold to John" would need two other labels such as theme and recipient, despite these two clauses being similar to "subject" and "object" functions.
In 1968, the first idea for semantic role labeling was proposed by Charles J. Fillmore. His proposal led to the FrameNet project which produced the first major computational lexicon that systematically described many predicates and their corresponding roles. Daniel Gildea (Currently at University of Rochester, previously University of California, Berkeley / International Computer Science Institute) and Daniel Jurafsky (currently teaching at Stanford University, but previously working at University of Colorado and UC Berkeley) developed the first automatic semantic role labeling system based on FrameNet. The PropBank corpus added manually created semantic role annotations to the Penn Treebank corpus of Wall Street Journal texts. Many automatic semantic role labeling systems have used PropBank as a training dataset to learn how to annotate new sentences automatically.
Semantic role labeling is mostly used for machines to understand the roles of words within sentences. This benefits applications similar to Natural Language Processing programs that need to understand not just the words of languages, but how they can be used in varying sentences.
Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.
Introduit le traitement du langage naturel (NLP) et ses applications, couvrant la tokenisation, l'apprentissage automatique, l'analyse du sentiment et les applications NLP suisses.
The Deep Learning for NLP course provides an overview of neural network based methods applied to text. The focus is on models particularly suited to the properties of human language, such as categori
The objective of this course is to present the main models, formalisms and algorithms necessary for the development of applications in the field of natural language information processing. The concept
L'extraction de connaissances est le processus de création de connaissances à partir d'informations structurées (bases de données relationnelles, XML) ou non structurées (textes, documents, images). Le résultat doit être dans un format lisible par les ordinateurs. Le groupe RDB2RDF W3C est en cours de standardisation d'un langage d'extraction de connaissances au format RDF à partir de bases de données. En français on parle d'« extraction de connaissances à partir des données » (ECD).
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most of the cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing like automatic annotation and content extraction out of images/audio/video/documents could be seen as information extraction Due to the difficulty of the problem, current approaches to IE (as of 2010) focus on narrowly restricted domains.
La reconnaissance d'entités nommées est une sous-tâche de l'activité d'extraction d'information dans des corpus documentaires. Elle consiste à rechercher des objets textuels (c'est-à-dire un mot, ou un groupe de mots) catégorisables dans des classes telles que noms de personnes, noms d'organisations ou d'entreprises, noms de lieux, quantités, distances, valeurs, dates, etc. À titre d'exemple, on pourrait donner le texte qui suit, étiqueté par un système de reconnaissance d'entités nommées utilisé lors de la campagne d'évaluation MUC: Henri a acheté 300 actions de la société AMD en 2006 Henri a acheté 300 actions de la société AMD en 2006.
Supervised machine learning models are receiving increasing attention in electricity theft detection due to their high detection accuracy. However, their performance depends on a massive amount of labeled training data, which comes from time-consuming and ...
Natural language processing has experienced significant improvements with the development of Transformer-based models, which employ self-attention mechanism and pre-training strategies. However, these models still present several obstacles. A notable issue ...
EPFL2023
,
The Python library ms3 makes scores (symbolic representations of music) operational for computational approaches by representing their contents as sets of tabular files. Music scores represent relations between sounding events by graphical means. The Free ...