Wikipedia
Wikipedia is a free-content online encyclopedia written and maintained by a community of volunteers, collectively known as Wikipedians, through open collaboration and using a wiki-based editing system called MediaWiki. Wikipedia is the largest and most-read reference work in history, and has consistently been one of the 10 most popular websites. Created by Jimmy Wales and Larry Sanger on January 15, 2001, it is hosted by the Wikimedia Foundation, an American nonprofit organization.
Reliability of Wikipedia
The reliability of Wikipedia concerns the validity, verifiability, and veracity of Wikipedia and its user-generated editing model, particularly its English-language edition. Wikipedia is written and edited by volunteer editors who generate online content with the editorial oversight of other volunteer editors via community-generated policies and guidelines. This editing model is highly concentrated: 77% of all articles are written by 1% of its editors, a majority of whom have chosen to remain anonymous.
Criticism of Wikipedia
Most criticism of Wikipedia has been directed toward its content, community of established users, and processes. Critics have questioned its factual reliability, the readability and organization of the articles, the lack of methodical fact-checking, and its political bias. Concerns have also been raised about systemic bias along gender, racial, political, corporate, institutional, and national lines. Conflicts of interest arising from corporate campaigns to influence content have been highlighted as well.
Knowledge extraction
Knowledge extraction is the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL (data warehouse), the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema.
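As a rough illustration of moving from a structured source to a machine-readable form, the sketch below maps relational rows to subject-predicate-object triples. It is a deliberately simplified assumption about what such a mapping might look like: the function name `rows_to_triples`, the `person` table, and its columns are all hypothetical, and a real knowledge extraction system would go further (for example by linking values to a shared vocabulary so that inferencing becomes possible).

```python
def rows_to_triples(table_name, rows, key):
    """Map each relational row to (subject, predicate, object) triples.

    Hypothetical helper: the subject is built from the table name and the
    row's key column; every other non-null column becomes one triple.
    """
    triples = []
    for row in rows:
        subject = f"{table_name}/{row[key]}"
        for column, value in row.items():
            # The key is already encoded in the subject; NULLs carry no fact.
            if column == key or value is None:
                continue
            triples.append((subject, column, value))
    return triples


# Illustrative data only, not taken from any real database.
rows = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Alan", "city": None},
]
triples = rows_to_triples("person", rows, key="id")
for triple in triples:
    print(triple)
```

Each triple states one fact on its own, which is what lets later processing combine facts from different rows or sources instead of reading fixed table columns.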
Terminology extraction
Terminology extraction (also known as term extraction, glossary extraction, term recognition, or terminology mining) is a subtask of information extraction. The goal of terminology extraction is to automatically extract relevant terms from a given corpus. In the semantic web era, a growing number of communities and networked enterprises started to access and interoperate through the internet. Modeling these communities and their information needs is important for several web applications, like topic-driven web crawlers, web services, recommender systems, etc.
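To make the goal of extracting relevant terms from a corpus concrete, here is a minimal sketch that ranks candidate terms by raw frequency. This is only one naive baseline, not a method the text above prescribes: the stopword list, the `extract_terms` function, and the `max_len` and `min_count` parameters are assumptions chosen for illustration, and practical systems typically add linguistic filters and statistical termhood measures.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "from", "by"}


def extract_terms(corpus, max_len=2, min_count=2):
    """Return (term, count) pairs for word n-grams seen at least min_count times."""
    counts = Counter()
    for doc in corpus:
        tokens = re.findall(r"[a-z]+", doc.lower())
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip candidates that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_count]


corpus = [
    "Terminology extraction is a subtask of information extraction.",
    "Information extraction finds terms in a corpus.",
]
terms = extract_terms(corpus)
for term, count in terms:
    print(count, term)
```

Counting multi-word candidates alongside single words matters because many domain terms, such as "information extraction", are phrases rather than single tokens.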