Natural language processing: Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.
Markov property: In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that, given its present state, its future evolution is independent of its history. It is named after the Russian mathematician Andrey Markov. The term strong Markov property is similar to the Markov property, except that the meaning of "present" is defined in terms of a random variable known as a stopping time. The term Markov assumption is used to describe a model where the Markov property is assumed to hold, such as a hidden Markov model.
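For a discrete-time process, the memoryless property described above can be written as a conditional-probability identity. The notation here (states $X_0, X_1, \ldots$ and values $x_0, \ldots, x_n, x$) is an assumption introduced for illustration and does not come from the text:

```latex
\Pr(X_{n+1} = x \mid X_n = x_n, X_{n-1} = x_{n-1}, \ldots, X_0 = x_0)
  = \Pr(X_{n+1} = x \mid X_n = x_n)
```

In words, once the present state $X_n$ is known, the earlier states carry no further information about $X_{n+1}$.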
Foreign language: A foreign language is a language that is not an official language of, nor typically spoken in, a specific country. Native speakers from that country usually need to acquire it through conscious learning, such as through language lessons at school, self-teaching, or attending language courses. A foreign language might be learned as a second language; however, there is a distinction between the two terms. A second language refers to a language that plays a significant role in the region where the speaker lives, whether for communication, education, business, or governance.
Markov chain: A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." A countably infinite sequence, in which the chain moves state at discrete time steps, gives a discrete-time Markov chain (DTMC). A continuous-time process is called a continuous-time Markov chain (CTMC).
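A minimal sketch of a discrete-time Markov chain may help make this concrete. The two-state "weather" chain, its transition probabilities, and the names `TRANSITIONS`, `step`, and `simulate` are all hypothetical, chosen only for illustration:

```python
import random

# Hypothetical two-state chain; the states and probabilities are
# illustrative assumptions, not taken from the text.
TRANSITIONS = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}


def step(state, rng):
    """Draw the next state using only the current state."""
    next_states = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


def simulate(start, n_steps, seed=0):
    """Return a path of n_steps + 1 states, beginning at `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        # Only path[-1] is consulted: earlier states are never read.
        path.append(step(path[-1], rng))
    return path
```

Calling `simulate("sunny", 10)` returns a list of 11 states. Because `step` receives only the current state, the sketch cannot depend on earlier history, which is what "depends only on the state of affairs now" amounts to.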
Computer-assisted language learning: Computer-assisted language learning (CALL), as it is known in British usage, or Computer-Aided Instruction (CAI)/Computer-Aided Language Instruction (CALI), in American usage, is briefly defined in a seminal work by Levy (1997: p. 1) as "the search for and study of applications of the computer in language teaching and learning". CALL embraces a wide range of information and communications technology applications and approaches to teaching and learning foreign languages, from the "traditional" drill-and-practice programs that characterised CALL in the 1960s and 1970s to more recent manifestations of CALL, e.
Transformational grammar: In linguistics, transformational grammar (TG) or transformational-generative grammar (TGG) is part of the theory of generative grammar, especially of natural languages. It considers grammar to be a system of rules that generate exactly those combinations of words that form grammatical sentences in a given language and involves the use of defined operations (called transformations) to produce new sentences from existing ones. The method is commonly associated with American linguist Noam Chomsky.
Slavic languages: The Slavic languages, also known as the Slavonic languages, are Indo-European languages spoken primarily by the Slavic peoples and their descendants. They are thought to descend from a proto-language called Proto-Slavic, spoken during the Early Middle Ages, which in turn is thought to have descended from the earlier Proto-Balto-Slavic language, linking the Slavic languages to the Baltic languages in a Balto-Slavic group within the Indo-European family.
Exponential distribution: In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless. In addition to being used for the analysis of Poisson point processes, it is found in various other contexts.
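The memoryless property can be checked numerically from the survival function P(T > t) = exp(-rate * t). The sketch below is illustrative only: the function names and the values chosen for `rate`, `s`, and `t` are assumptions, not part of the text.

```python
import math


def survival(t, rate):
    """P(T > t) for an exponential waiting time with the given rate."""
    return math.exp(-rate * t)


def conditional_survival(s, t, rate):
    """P(T > s + t | T > s), from the definition of conditional probability."""
    return survival(s + t, rate) / survival(s, rate)


# Illustrative values (assumptions): rate 0.5 per unit time,
# 2 units already waited, 3 further units of waiting.
rate, s, t = 0.5, 2.0, 3.0
lhs = conditional_survival(s, t, rate)  # chance of waiting 3 more units after 2
rhs = survival(t, rate)                 # chance of waiting 3 units from scratch
```

Up to floating-point rounding, `lhs` and `rhs` agree: having already waited `s` units does not change the distribution of the remaining waiting time.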
English as a second or foreign language: English as a second or foreign language is the use of English by speakers with different native languages. Language education for people learning English may be known as English as a foreign language (EFL), English as a second language (ESL), English for speakers of other languages (ESOL), English as an additional language (EAL), or English as a New Language (ENL). The teaching of EFL is referred to as teaching English as a foreign language (TEFL), teaching English as a second language (TESL) or teaching English to speakers of other languages (TESOL).
Poisson distribution: In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event. It is named after French mathematician Siméon Denis Poisson (/ˈpwɑːsɒn/; French: [pwasɔ̃]). The Poisson distribution can also be used for the number of events in other specified interval types such as distance, area, or volume.
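As a small worked sketch, the probability of observing exactly k events can be computed from the standard Poisson mass function, mean**k * exp(-mean) / k!. The function name `poisson_pmf` and the mean of 3 events per interval are illustrative assumptions:

```python
import math


def poisson_pmf(k, mean):
    """P(N = k) for a Poisson count N with the given mean number of events."""
    return mean**k * math.exp(-mean) / math.factorial(k)


# Illustrative assumption: on average 3 events per interval.
mean = 3.0
p_none = poisson_pmf(0, mean)  # probability of no events, equal to exp(-3)
p_total = sum(poisson_pmf(k, mean) for k in range(50))  # very close to 1
```

Summing the first 50 terms is enough here because the tail beyond that is negligible for a mean of 3. A larger mean would need more terms.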