Tertiary sourceA tertiary source is an index or textual consolidation of already published primary and secondary sources that does not provide additional interpretations or analysis of the sources. Some tertiary sources can be used as an aid to find key (seminal) sources, key terms, general common knowledge and established mainstream science on a topic. The exact definition of tertiary varies by academic field. Academic research standards generally do not accept tertiary sources such as encyclopedias as citations, although survey articles are frequently cited rather than the original publication.
Natural language processingNatural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.
Secondary sourceIn scholarship, a secondary source is a document or recording that relates or discusses information originally presented elsewhere. A secondary source contrasts with a primary source, which is an original source of the information being discussed; a primary source can be a person with direct knowledge of a situation or a document created by such a person. A secondary source is one that gives information about a primary source. In this source, the original information is selected, modified and arranged in a suitable format.
VeniceVenice (ˈvɛnᵻs ; Venezia veˈnɛttsja; Venesia veˈnɛsja, outdatedly Venexia veˈnɛzja) is a city in northeastern Italy and the capital of the Veneto region. It is built on a group of 118 small islands that are separated by expanses of open water and by canals; portions of the city are linked by over 400 bridges. The islands are in the shallow Venetian Lagoon, an enclosed bay lying between the mouths of the Po and the Piave rivers (more exactly between the Brenta and the Sile).
Historical sourceHistorical source is an original source that contains important historical information. These sources are something that inform us about history at the most basic level, and are used as clues in order to study history. Historical sources can include coins, artefacts, monuments, literary sources, documents, artifacts, archaeological sites, features, oral transmissions, stone inscriptions, paintings, recorded sounds, images and oral history. Even ancient relics and ruins, broadly speaking, are historical sources.
TreebankIn linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The construction of parsed corpora in the early 1990s revolutionized computational linguistics, which benefitted from large-scale empirical data. The term treebank was coined by linguist Geoffrey Leech in the 1980s, by analogy to other repositories such as a seedbank or bloodbank. This is because both syntactic and semantic structure are commonly represented compositionally as a tree structure.
Conditional random fieldConditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without considering "neighbouring" samples, a CRF can take context into account. To do so, the predictions are modelled as a graphical model, which represents the presence of dependencies between the predictions. What kind of graph is used depends on the application.
GitHubGitHub, Inc. (ˈgɪthʌb) is a platform and cloud-based service for software development and version control using Git, allowing developers to store and manage their code. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. Headquartered in California, it has been a subsidiary of Microsoft since 2018. It is commonly used to host open source software development projects.
Statistical machine translationStatistical machine translation (SMT) was a machine translation approach, that superseded the previous, rule-based approach because it required explicit description of each and every linguistic rule, which was costly, and which often did not generalize to other languages. Since 2003, the statistical approach itself has been gradually superseded by the deep learning-based neural network approach. The first ideas of statistical machine translation were introduced by Warren Weaver in 1949, including the ideas of applying Claude Shannon's information theory.
Spatial reference systemA spatial reference system (SRS) or coordinate reference system (CRS) is a framework used to precisely measure locations on the surface of Earth as coordinates. It is thus the application of the abstract mathematics of coordinate systems and analytic geometry to geographic space. A particular SRS specification (for example, "Universal Transverse Mercator WGS 84 Zone 16N") comprises a choice of Earth ellipsoid, horizontal datum, map projection (except in the geographic coordinate system), origin point, and unit of measure.