German language: German (Standard High German: Deutsch, pronounced [dɔʏtʃ]) is a West Germanic language mainly spoken in Western Europe and Central Europe. It is the most widely spoken language, and an official or co-official language, in Germany, Austria, Switzerland, Liechtenstein, and the Italian province of South Tyrol. It is also an official language of Luxembourg and Belgium, as well as a recognized national language in Namibia. Outside Germany, it is also spoken by German communities in France (Alsace), the Czech Republic (North Bohemia), Poland (Upper Silesia), Slovakia (Košice Region, Spiš, and Hauerland), and Hungary (Sopron).
Automatic summarization: Automatic summarization is the process of shortening a set of data computationally, to create a subset (a summary) that represents the most important or relevant information within the original content. Artificial intelligence algorithms are commonly developed and employed to achieve this, specialized for different types of data. Text summarization is usually implemented by natural language processing methods, designed to locate the most informative sentences in a given document.
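As a rough illustration of what "locating the most informative sentences" can mean, the sketch below scores each sentence by the average corpus frequency of its non-stopword tokens and keeps the top-scoring ones in their original order. This is only one simple frequency-based heuristic, not the method of any particular system; the names `STOPWORDS`, `split_sentences`, `tokenize`, and `summarize` are hypothetical and chosen for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; real systems use larger ones.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it",
             "that", "for", "on", "as", "are", "by"}


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, k=1):
    """Return the k highest-scoring sentences, in document order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured by length.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    print(summarize("Cats chase mice. Cats sleep all day. Dogs bark.", k=2))
```

Averaging rather than summing the frequencies is a design choice made here to avoid rewarding sentences merely for being long; other weightings are equally plausible.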
German studies: German studies is the field of humanities that researches, documents and disseminates German language and literature in both its historic and present forms. Academic departments of German studies often include classes on German culture, German history, and German politics in addition to the language and literature component. Common German names for the field are Germanistik, Deutsche Philologie, and Deutsche Sprachwissenschaft und Literaturwissenschaft.
German dialects: German dialects are the various traditional local varieties of the German language. Though varied by region, those of the southern half of Germany, south of the Benrath line, are dominated by the geographical spread of the High German consonant shift and by the dialect continuum that connects German to the neighboring varieties of Low Franconian (Dutch) and Frisian. The varieties of German are conventionally grouped into Upper German, Central German and Low German; Upper and Central German form the High German subgroup.
High German languages: The High German languages (hochdeutsche Mundarten, i.e. High German dialects), or simply High German (Hochdeutsch) – not to be confused with Standard High German which is commonly also called "High German" – comprise the varieties of German spoken south of the Benrath and Uerdingen isoglosses in central and southern Germany, Austria, Liechtenstein, Switzerland, Luxembourg, and eastern Belgium, as well as in neighbouring portions of France (Alsace and northern Lorraine), Italy (South Tyrol), the Czech Republic (Bohemia), and Poland (Upper Silesia).
Standard German: Standard High German (SHG), less precisely Standard German or High German (Standardhochdeutsch, Standarddeutsch, Hochdeutsch or, in Switzerland, Schriftdeutsch), is the standardized variety of the German language used in formal contexts and for communication between different dialect areas. The name "High German" refers to its regional origins and is not to be confused with the High German dialects. It is a pluricentric Dachsprache with three codified (or standardized) specific regional variants: German Standard German, Austrian Standard German and Swiss Standard German.
Text mining: Text mining, text data mining (TDM) or text analytics is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources." Written resources may include websites, books, emails, reviews, and articles. High-quality information is typically obtained by devising patterns and trends by means such as statistical pattern learning. According to Hotho et al.
German diaspora: The German diaspora consists of German people and their descendants who live outside of Germany. The term is used in particular to refer to the aspects of migration of German speakers from Central Europe to different countries around the world. This definition treats "German" as a sociolinguistic group rather than a national one, since the emigrant groups came from different regions with diverse cultural practices and different varieties of German.
SemEval: SemEval (Semantic Evaluation) is an ongoing series of evaluations of computational semantic analysis systems; it evolved from the Senseval word sense evaluation series. The evaluations are intended to explore the nature of meaning in language. While meaning is intuitive to humans, transferring those intuitions to computational analysis has proved elusive. This series of evaluations provides a mechanism to characterize in more precise terms exactly what is necessary to compute in meaning.
Natural language processing: Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.
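To make the contrast between rule-based and probabilistic approaches concrete, the sketch below solves one toy task, deciding whether a short text is German or English, in both styles: a hand-written rule, and a character-bigram naive Bayes classifier with add-one smoothing and uniform class priors. Everything here is an illustrative assumption: the names `rule_based_is_german`, `BigramNaiveBayes`, and `TOY_SAMPLES` are hypothetical, and the two training sentences are far too little data for real use.

```python
import math
from collections import Counter

# Hypothetical toy training data: one sentence per language.
TOY_SAMPLES = [
    ("de", "das ist ein haus und der hund schläft nicht"),
    ("en", "this is a house and the dog does not sleep"),
]


def rule_based_is_german(text):
    # Rule-based approach: a human encodes the knowledge directly.
    lowered = text.lower()
    if any(ch in lowered for ch in "äöüß"):
        return True
    german_words = {"der", "die", "das", "und", "ist", "nicht"}
    return any(word in german_words for word in lowered.split())


def bigrams(text):
    # Character bigrams, padded with spaces to capture word boundaries.
    padded = f" {text.lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


class BigramNaiveBayes:
    # Probabilistic approach: the knowledge is estimated from labeled data.
    def __init__(self):
        self.counts = {}    # label -> Counter of bigram counts
        self.vocab = set()  # all bigrams seen during training

    def fit(self, samples):
        for label, text in samples:
            grams = bigrams(text)
            self.counts.setdefault(label, Counter()).update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        vocab_size = len(self.vocab)
        best_label, best_logp = None, -math.inf
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Add-one smoothing keeps unseen bigrams from zeroing the score.
            logp = sum(math.log((counter[g] + 1) / (total + vocab_size))
                       for g in bigrams(text))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


if __name__ == "__main__":
    model = BigramNaiveBayes().fit(TOY_SAMPLES)
    for sentence in ["der hund ist nicht da", "the dog is not here"]:
        print(sentence, rule_based_is_german(sentence), model.predict(sentence))
```

The rule is transparent but only covers the cases its author anticipated, whereas the classifier generalizes from whatever data it is given; neural network-based methods follow the same learn-from-data pattern with far richer models.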