Sentence (linguistics)In linguistics and grammar, a sentence is a linguistic expression, such as the English example "The quick brown fox jumps over the lazy dog." In traditional grammar it is typically defined as a string of words that expresses a complete thought, or as a unit consisting of a subject and predicate. In non-functional linguistics it is typically defined as a maximal unit of syntactic structure such as a constituent.
Word embeddingIn natural language processing (NLP), a word embedding is a representation of a word. The embedding is used in text analysis. Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using language modeling and feature learning techniques, where words or phrases from the vocabulary are mapped to vectors of real numbers.
Word2vecWord2vec is a technique for natural language processing (NLP) published in 2013. The word2vec algorithm uses a neural network model to learn word associations from a large corpus of text. Once trained, such a model can detect synonymous words or suggest additional words for a partial sentence. As the name implies, word2vec represents each distinct word with a particular list of numbers called a vector.
Semantic networkA semantic network, or frame network is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields. A semantic network may be instantiated as, for example, a graph database or a concept map. Typical standardized semantic networks are expressed as semantic triples.
Ontology componentsContemporary ontologies share many structural similarities, regardless of the ontology language in which they are expressed. Most ontologies describe individuals (instances), classes (concepts), attributes, and relations. Common components of ontologies include: Individuals instances or objects (the basic or "ground level" objects; the tokens). Classes sets, collections, concepts, types of objects, or kinds of things. Attributes aspects, properties, features, characteristics, or parameters that objects (and classes) can have.
Semantic similaritySemantic similarity is a metric defined over a set of documents or terms, where the idea of distance between items is based on the likeness of their meaning or semantic content as opposed to lexicographical similarity. These are mathematical tools used to estimate the strength of the semantic relationship between units of language, concepts or instances, through a numerical description obtained according to the comparison of information supporting their meaning or describing their nature.
Sentence clause structureIn grammar, sentence and clause structure, commonly known as sentence composition, is the classification of sentences based on the number and kind of clauses in their syntactic structure. Such division is an element of traditional grammar. In standard English, sentences are composed of five clause patterns: Subject + Verb (intransitive)Example: She runs. Subject + Verb (transitive) + ObjectExample: She runs the meeting. Subject + Verb (linking) + Subject Complement (adjective, noun, pronoun)Example: Abdul is happy.
Knowledge extractionKnowledge extraction is the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, s) sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL (data warehouse), the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema.
Entity linkingIn natural language processing, entity linking, also referred to as named-entity linking (NEL), named-entity disambiguation (NED), named-entity recognition and disambiguation (NERD) or named-entity normalization (NEN) is the task of assigning a unique identity to entities (such as famous individuals, locations, or companies) mentioned in text. For example, given the sentence "Paris is the capital of France", the idea is to determine that "Paris" refers to the city of Paris and not to Paris Hilton or any other entity that could be referred to as "Paris".
Elementary equivalenceIn model theory, a branch of mathematical logic, two structures M and N of the same signature σ are called elementarily equivalent if they satisfy the same first-order σ-sentences. If N is a substructure of M, one often needs a stronger condition. In this case N is called an elementary substructure of M if every first-order σ-formula φ(a1, ..., an) with parameters a1, ..., an from N is true in N if and only if it is true in M. If N is an elementary substructure of M, then M is called an elementary extension of N.