Document classificationDocument classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science.
LanguageLanguage is a structured system of communication that consists of grammar and vocabulary. It is the primary means by which humans convey meaning, both in spoken and written forms, and may also be conveyed through sign languages. The vast majority of human languages have developed writing systems that allow for the recording and preservation of the sounds or signs of language. Human language is characterized by its cultural and historical diversity, with significant variations observed between cultures and across time.
Naive Bayes classifierIn statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bayesian network models, but coupled with kernel density estimation, they can achieve high accuracy levels. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem.
Second languageA second language (L2) is a language spoken in addition to one's first language (L1). A second language may be a neighbouring language, another language of the speaker's home country, or a foreign language. A speaker's dominant language, which is the language a speaker uses most or is most comfortable with, is not necessarily the speaker's first language. For example, the Canadian census defines first language for its purposes as "the first language learned in childhood and still spoken", recognizing that for some, the earliest language may be lost, a process known as language attrition.
Language acquisitionLanguage acquisition is the process by which humans acquire the capacity to perceive and comprehend language (in other words, gain the ability to be aware of language and to understand it), as well as to produce and use words and sentences to communicate. Language acquisition involves structures, rules, and representation. The capacity to use language successfully requires one to acquire a range of tools including phonology, morphology, syntax, semantics, and an extensive vocabulary.