In natural language processing, semantic role labeling (also called shallow semantic parsing or slot-filling) is the process of assigning labels to words or phrases in a sentence that indicate their semantic role in the sentence, such as that of an agent, goal, or result.
It serves to identify the meaning of a sentence by detecting the arguments associated with its predicate or verb and classifying them into their specific roles. A common example is the sentence "Mary sold the book to John": the agent is "Mary," the predicate is "sold" (or rather, "to sell"), the theme is "the book," and the recipient is "John." Role labels also distinguish constructions that grammatical functions alone do not: "the book belongs to me" needs the labels "possessed" and "possessor," while "the book was sold to John" needs the labels "theme" and "recipient," even though both pairs occupy similar "subject" and "object" positions.
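The output of semantic role labeling can be thought of as a predicate-argument structure. The following minimal Python sketch represents the two examples above as such structures; the `Frame` class and the role names are illustrative only and not tied to any particular SRL library or inventory.

```python
from dataclasses import dataclass, field

# A minimal, illustrative predicate-argument structure.
# `Frame` and the role labels below are not from any specific library.

@dataclass
class Frame:
    predicate: str                              # lemma of the verb, e.g. "sell"
    roles: dict = field(default_factory=dict)   # role label -> text span

# "Mary sold the book to John."
sold = Frame(
    predicate="sell",
    roles={
        "agent": "Mary",        # who performs the selling
        "theme": "the book",    # what is sold
        "recipient": "John",    # who receives it
    },
)

# "The book belongs to me." takes a different role set even though the
# surface subject/object positions look similar, which is the point of SRL.
belongs = Frame(
    predicate="belong",
    roles={
        "possessed": "the book",
        "possessor": "me",
    },
)

for frame in (sold, belongs):
    print(frame.predicate, "->", frame.roles)
```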
The first idea for semantic role labeling was proposed by Charles J. Fillmore in 1968. His proposal led to the FrameNet project, which produced the first major computational lexicon to systematically describe many predicates and their corresponding roles. Daniel Gildea (now at the University of Rochester, previously at the University of California, Berkeley, and the International Computer Science Institute) and Daniel Jurafsky (now at Stanford University, previously at the University of Colorado and UC Berkeley) developed the first automatic semantic role labeling system based on FrameNet. The PropBank corpus added manually created semantic role annotations to the Penn Treebank corpus of Wall Street Journal texts, and many automatic semantic role labeling systems have used it as a training dataset to learn to annotate new sentences automatically.
Semantic role labeling is mostly used to help machines understand the roles of words within sentences. This benefits natural language processing applications that need to understand not just the words of a language, but how they are used in varying sentences.
Knowledge extraction is the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL (data warehouse), the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema.
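To illustrate the "goes beyond a relational schema" criterion, the following sketch maps a structured row into subject-predicate-object triples that an inference engine could operate on. The namespace, the `row`, and the property names are made up for illustration; real systems would use established vocabularies such as RDF and OWL.

```python
# Hypothetical relational row describing an employee.
row = {"id": "emp42", "name": "Ada Lovelace", "works_for": "org7"}

BASE = "http://example.org/"  # illustrative namespace, not a real vocabulary

triples = [
    (BASE + row["id"], BASE + "name", row["name"]),
    (BASE + row["id"], BASE + "worksFor", BASE + row["works_for"]),
    # Typing the subject lets a reasoner apply class-level rules,
    # e.g. "every Employee is a Person".
    (BASE + row["id"], "rdf:type", BASE + "Employee"),
]

for s, p, o in triples:
    print(s, p, o)
```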
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing, such as automatic annotation and content extraction from images, audio, video, and documents, can also be seen as information extraction. Due to the difficulty of the problem, current approaches to IE (as of 2010) focus on narrowly restricted domains.
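A toy sketch of what "extracting structured information from a narrowly restricted domain" can look like: a single hand-written pattern that pulls a (buyer, amount, company, year) record out of free text. The pattern and field names are invented for illustration; practical IE systems combine full NLP pipelines with many such extractors.

```python
import re

text = "Jim bought 300 shares of Acme Corp. in 2006."

# One hand-crafted pattern for one narrow domain (share purchases).
pattern = re.compile(
    r"(?P<buyer>[A-Z]\w+) bought (?P<amount>\d+) shares of "
    r"(?P<company>[A-Z][\w.& ]+?) in (?P<year>\d{4})"
)

match = pattern.search(text)
if match:
    record = match.groupdict()
    print(record)
    # {'buyer': 'Jim', 'amount': '300', 'company': 'Acme Corp.', 'year': '2006'}
```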
Named-entity recognition (NER) (also known as (named) entity identification, entity chunking, and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc. Most research on NER/NEE systems has been structured as taking an unannotated block of text, such as "Jim bought 300 shares of Acme Corp. in 2006.", and producing an annotated block of text that highlights the names of entities.
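In practice, off-the-shelf libraries perform this annotation. The sketch below uses spaCy, assuming it is installed (`pip install spacy`) along with its small English model (`python -m spacy download en_core_web_sm`); the exact labels produced depend on the model.

```python
import spacy

# Load a pretrained English pipeline that includes an NER component.
nlp = spacy.load("en_core_web_sm")

doc = nlp("Jim bought 300 shares of Acme Corp. in 2006.")

# Each entity span carries a pre-defined category label.
for ent in doc.ents:
    print(ent.text, ent.label_)
# Expected (model-dependent): Jim PERSON / 300 CARDINAL / Acme Corp. ORG / 2006 DATE
```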
The Deep Learning for NLP course provides an overview of neural network based methods applied to text. The focus is on models particularly suited to the properties of human language, such as categorical ...
The objective of this course is to present the main models, formalisms and algorithms necessary for the development of applications in the field of natural language information processing. ...
The Python library ms3 makes scores (symbolic representations of music) operational for computational approaches by representing their contents as sets of tabular files. Music scores represent relations between sounding events by graphical means. ...
Natural language processing has experienced significant improvements with the development of Transformer-based models, which employ self-attention mechanisms and pre-training strategies. However, these models still face several obstacles. A notable issue ...
Supervised machine learning models are receiving increasing attention in electricity theft detection due to their high detection accuracy. However, their performance depends on a massive amount of labeled training data, which comes from time-consuming and ...
Introduces Natural Language Processing (NLP) and its applications, covering tokenization, machine learning, sentiment analysis, and Swiss NLP applications.