Computational linguistics has, since the 2020s, become a near-synonym of either natural language processing or language technology, with deep learning approaches, such as large language models, outperforming the specialized approaches previously used in the field.
The field has overlapped with artificial intelligence since the efforts in the United States in the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian scientific journals, into English. Since rule-based approaches could perform arithmetic (systematic) calculations much faster and more accurately than humans, it was expected that lexicon, morphology, syntax and semantics could be learned using explicit rules as well. After the failure of rule-based approaches, David Hays coined the term in order to distinguish the field from AI and co-founded both the Association for Computational Linguistics (ACL) and the International Committee on Computational Linguistics (ICCL). What started as an effort to translate between languages evolved into the much wider field of natural language processing.
In order to study the English language meticulously, an annotated text corpus was needed. The Penn Treebank became one of the most widely used corpora. It consists of IBM computer manuals, transcribed telephone conversations, and other texts, together containing over 4.5 million words of American English, annotated with both part-of-speech tags and syntactic bracketing.
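A minimal sketch of what Penn Treebank-style annotation looks like, assuming NLTK and its bundled ~10% Treebank sample are available (the sample is fetched with nltk.download):

```python
# Inspect Penn Treebank-style annotation via NLTK's bundled sample.
import nltk
from nltk.corpus import treebank

nltk.download("treebank", quiet=True)

# Part-of-speech tagging: each token is paired with a Penn Treebank tag.
first_sentence = treebank.tagged_sents()[0]
print(first_sentence[:5])   # e.g. [('Pierre', 'NNP'), ('Vinken', 'NNP'), ...]

# Syntactic bracketing: the same sentence as a constituency parse tree.
tree = treebank.parsed_sents()[0]
print(tree)                 # bracketed (S (NP-SBJ ...) (VP ...)) structure
```

The full 4.5-million-word corpus is distributed separately; the NLTK sample is enough to see the annotation conventions.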
Analyses of Japanese sentence corpora found that sentence lengths follow a log-normal distribution.
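As a rough sketch of how such a log-normality check can be done, the snippet below fits a log-normal distribution to sentence lengths; the sentences here are toy data, not the Japanese corpora from the cited analyses:

```python
# Check whether sentence lengths look log-normal (toy data only).
import numpy as np
from scipy import stats

sentences = [
    "short one".split(),
    "a somewhat longer example sentence for the test".split(),
    "sentence lengths in large corpora often look log-normal".split(),
    "another example".split(),
]
lengths = np.array([len(s) for s in sentences], dtype=float)

# If lengths are log-normal, their logarithms should look normal.
log_lengths = np.log(lengths)
shape, loc, scale = stats.lognorm.fit(lengths, floc=0)
print("estimated sigma:", shape, "median length:", scale)

# A normality test on log-lengths is a quick sanity check
# (only meaningful with far more sentences than this toy sample).
print(stats.shapiro(log_lengths))
```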
During language acquisition, children are largely exposed only to positive evidence: they receive evidence for which forms are correct, but little or no evidence for which forms are incorrect. This was a limitation for the models of the time, since the deep learning models available today did not yet exist in the late 1980s.
The Human Language Technology (HLT) course introduces methods and applications for language processing and generation, using statistical learning and neural networks.
The Deep Learning for NLP course provides an overview of neural network based methods applied to text. The focus is on models particularly suited to the properties of human language ...
This course integrates knowledge in basic, systems, clinical and computational neuroscience, and engineering with the goal of translating this integrated knowledge into the development of novel methods ...
Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.
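As a small sketch of the probabilistic (statistical) side of NLP mentioned above, the following trains a naive Bayes classifier over a tiny, made-up labelled corpus; the labels and example texts are purely illustrative:

```python
# Statistical text classification with a naive Bayes model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "the match ended in a late goal",
    "the striker scored twice in the final",
    "the court upheld the ruling on appeal",
    "the judge dismissed the case",
]
labels = ["sports", "sports", "law", "law"]

# Bag-of-words counts feed a multinomial naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

print(model.predict(["the referee allowed the goal"]))     # likely 'sports'
print(model.predict_proba(["the appeal was rejected"]))    # class probabilities
```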
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis.
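A hedged sketch of speech-to-text in practice, assuming the third-party SpeechRecognition package is installed and that "utterance.wav" (a hypothetical file) exists; the backend shown calls an external web API, so network access is assumed as well:

```python
# Transcribe a short audio file with the SpeechRecognition package.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("utterance.wav") as source:    # hypothetical audio file
    audio = recognizer.record(source)            # read the whole file

try:
    text = recognizer.recognize_google(audio)    # one of several backends
    print("Transcript:", text)
except sr.UnknownValueError:
    print("Speech was unintelligible to the recognizer")
except sr.RequestError as err:
    print("Recognition service unavailable:", err)
```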
Machine translation is the use of either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches to translate text or speech from one language to another, including the contextual, idiomatic and pragmatic nuances of both languages. The origins of machine translation can be traced back to the work of Al-Kindi, a ninth-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation.
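A minimal sketch of the neural network-based approach, using the Hugging Face transformers pipeline API; "Helsinki-NLP/opus-mt-en-fr" is one commonly available English-to-French checkpoint and is downloaded on first use:

```python
# Neural machine translation with a pretrained sequence-to-sequence model.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
result = translator("Machine translation maps text from one language to another.")
print(result[0]["translation_text"])
```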
This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.
Explores the Vector Space model, Bag of Words, tf-idf, cosine similarity, Okapi BM25, and Precision and Recall in Information Retrieval.
Delves into the processing of large digital text collections, exploring hidden regularities, text reuse, and TF-IDF analysis.
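A short sketch of the vector-space ideas listed above (bag of words, tf-idf weighting, cosine similarity) applied to a toy document collection; the documents and query are invented for illustration:

```python
# Rank documents against a query using tf-idf vectors and cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "computational linguistics studies language with computers",
    "speech recognition converts spoken language into text",
    "machine translation converts text from one language to another",
]
query = ["how do computers translate text between languages"]

vectorizer = TfidfVectorizer()                  # bag-of-words + tf-idf weighting
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform(query)

# Cosine similarity between the query and each document gives the ranking.
scores = cosine_similarity(query_vector, doc_vectors)[0]
for doc, score in sorted(zip(documents, scores), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {doc}")
```

Precision and recall would then be computed by comparing the top-ranked documents against a set of known relevant ones; Okapi BM25 replaces the tf-idf weighting with a length-normalized scoring function.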
Explores neural interfaces, merging Engineering, Neuroscience, and Computer Science.
The archive of science is a place where scientific practices are sedimented in the form of drafts, protocols of rejected hypotheses and failed experiments, obsolete instruments, outdated visualizations and other residues. Today, just as science goes more a ...
As big strides were being made in many science fields in the 1970s and 80s, faster computation for solving problems in molecular biology, semiconductor technology, aeronautics, particle physics, etc., was at the forefront of research. Parallel and super-co ...
Human perceptual development evolves in a stereotyped fashion, with initially limited perceptual capabilities maturing over the months or years following the commencement of sensory experience into robust proficiencies. This review focuses on the functiona ...