WordNet is a lexical database that links words through semantic relations, including synonymy, hyponymy, and meronymy. Synonyms are grouped into synsets with short definitions and usage examples. It can thus be seen as a combination and extension of a dictionary and a thesaurus. While it is accessible to human users via a web browser, its primary use is in automatic text analysis and artificial intelligence applications. It was first created for English, and the English WordNet database and software tools have been released under a BSD-style license and are freely available for download from the WordNet website. There are now WordNets in more than 200 languages.
WordNet was first created in 1985, in English only, in the Cognitive Science Laboratory of Princeton University under the direction of psychology professor George Armitage Miller. It was later directed by Christiane Fellbaum. The project was initially funded by the U.S. Office of Naval Research, and later also by other U.S. government agencies including DARPA, the National Science Foundation, the Disruptive Technology Office (formerly the Advanced Research and Development Activity) and REFLEX. George Miller and Christiane Fellbaum received the 2006 Antonio Zampolli Prize for their work with WordNet.
The Global WordNet Association is a non-commercial organization that provides a platform for discussing, sharing and connecting WordNets for all languages in the world. Christiane Fellbaum and Piek Th.J.M. Vossen are its co-presidents.
The database contains 155,327 words organized in 175,979 synsets for a total of 207,016 word-sense pairs; in compressed form, it is about 12 megabytes in size.
It includes the lexical categories nouns, verbs, adjectives and adverbs but ignores prepositions, determiners and other function words.
Words from the same lexical category that are roughly synonymous are grouped into synsets, which include simplex words as well as collocations like "eat out" and "car pool".
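The organization described above (words of one lexical category grouped into synsets, each with a short definition, and synsets linked by relations such as hyponymy) can be sketched as a small data structure. This is a minimal illustration, not WordNet's actual storage format or API: the `Synset` class, the identifiers such as `car.n.01`, the glosses, and the helper functions are all hypothetical and invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """Toy synset: same-category words that share one sense (hypothetical model)."""
    name: str                 # made-up identifier, e.g. "car.n.01"
    pos: str                  # lexical category: "n", "v", "a" or "r"
    lemmas: List[str]         # roughly synonymous words or collocations
    gloss: str                # short definition (paraphrased, not WordNet's text)
    hypernyms: List[str] = field(default_factory=list)  # names of broader synsets


# A tiny invented lexicon; real WordNet holds well over 100,000 synsets.
SYNSETS: Dict[str, Synset] = {
    s.name: s
    for s in [
        Synset("car.n.01", "n", ["car", "auto", "automobile"],
               "a four-wheeled motor vehicle", ["motor_vehicle.n.01"]),
        Synset("motor_vehicle.n.01", "n", ["motor vehicle", "automotive vehicle"],
               "a self-propelled wheeled vehicle", ["vehicle.n.01"]),
        Synset("vehicle.n.01", "n", ["vehicle"],
               "a conveyance that transports people or objects"),
        Synset("eat_out.v.01", "v", ["eat out", "dine out"],
               "eat at a restaurant rather than at home"),
    ]
}


def synsets_for(word: str) -> List[Synset]:
    """Return every synset in which the word (or collocation) appears."""
    return [s for s in SYNSETS.values() if word in s.lemmas]


def hypernym_chain(name: str) -> List[str]:
    """Follow the first hypernym link upward until a root synset is reached."""
    chain = []
    current = SYNSETS[name]
    while current.hypernyms:
        current = SYNSETS[current.hypernyms[0]]
        chain.append(current.name)
    return chain


def count_word_sense_pairs() -> int:
    """A word-sense pair is one (word, synset) membership."""
    return sum(len(s.lemmas) for s in SYNSETS.values())


if __name__ == "__main__":
    print([s.name for s in synsets_for("car")])
    print(hypernym_chain("car.n.01"))
    print(count_word_sense_pairs())
```

In this toy model the distinction between words, synsets and word-sense pairs mirrors the three counts quoted for the real database: a word may belong to several synsets, and each membership counts as one word-sense pair.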
This course introduces the foundations of information retrieval, data mining and knowledge bases, which underpin today's Web-based distributed information systems.
A thesaurus (plural: thesauri or thesauruses), sometimes called a synonym dictionary or dictionary of synonyms, is a reference work which arranges words by their meanings (or in simpler terms, a book where you can find different words with similar meanings to other words), sometimes as a hierarchy of broader and narrower terms, sometimes simply as lists of synonyms and antonyms. They are often used by writers to help find the best word to express an idea: "to find the word, or words, by which [an] idea may be most fitly and aptly expressed". Synonym dictionaries have a long history.
In information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse. More simply, an ontology is a way of showing the properties of a subject area and how they are related, by defining a set of concepts and categories that represent the subject. Every academic discipline or field creates ontologies to limit complexity and organize data into information and knowledge.
A semantic network, or frame network, is a knowledge base that represents semantic relations between concepts in a network. It is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields. A semantic network may be instantiated as, for example, a graph database or a concept map. Typical standardized semantic networks are expressed as semantic triples.
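As a rough sketch of the triple representation mentioned above, a semantic network can be held as a list of (subject, predicate, object) triples and turned into a directed graph. The triples, the predicate names `is_a` and `part_of`, and the helper functions below are assumptions chosen for illustration; they do not follow any particular standard or library.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Invented example triples: vertices are concepts, predicates label the edges.
TRIPLES: List[Triple] = [
    ("car", "is_a", "vehicle"),
    ("vehicle", "is_a", "artifact"),
    ("wheel", "part_of", "car"),
]


def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index triples by subject, giving a directed, edge-labelled graph."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def related(graph, subject: str, predicate: str) -> List[str]:
    """Concepts reachable from `subject` by one edge labelled `predicate`."""
    return [obj for pred, obj in graph.get(subject, []) if pred == predicate]


def ancestors(graph, concept: str, predicate: str = "is_a") -> List[str]:
    """All concepts reachable by repeatedly following `predicate` edges."""
    seen: List[str] = []
    stack = [concept]
    while stack:
        node = stack.pop()
        for nxt in related(graph, node, predicate):
            if nxt not in seen:
                seen.append(nxt)
                stack.append(nxt)
    return seen


if __name__ == "__main__":
    g = build_graph(TRIPLES)
    print(related(g, "wheel", "part_of"))
    print(ancestors(g, "car"))
```

Storing edges by subject keeps lookups simple for a directed network; an undirected network could be modelled in the same way by adding each edge in both directions.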
Explores lexical semantics, word sense, semantic relations, and WordNet, highlighting applications in language engineering and information retrieval.
Explores methods for information extraction, including traditional and embedding-based approaches, supervised learning, distant supervision, and taxonomy induction.
Explores taxonomy induction, learning terms and relationships to construct hierarchical structures.
A dialogue is successful when there is alignment between the speakers, at different linguistic levels. In this work, we consider the dialogue occurring between interlocutors engaged in a collaborative learning task, and explore how performance and learning ...
Voice communication is the main channel to exchange information between pilots and Air-Traffic Controllers (ATCos). Recently, several projects have explored the employment of speech recognition technology to automatically extract spoken key information suc ...
We present a framework for building unsupervised representations of entities and their compositions, where each entity is viewed as a probability distribution rather than a vector embedding. In particular, this distribution is supported over the contexts w ...