Optical character recognition
Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo) or subtitle text superimposed on an image (for example, from a television broadcast).
Handwriting recognition
Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touchscreens and other devices. The image of the written text may be sensed "off line" from a piece of paper by optical scanning (optical character recognition) or intelligent word recognition. Alternatively, the movements of the pen tip may be sensed "on line", for example by a pen-based computer screen surface, a generally easier task as there are more clues available.
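The difference between "on line" and "off line" input can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(x, y, t)` point format and the `rasterize` helper are hypothetical and are not taken from any particular recognizer. Two strokes drawn in opposite directions produce the same image once rasterized, which is one way to see why on-line data carries more clues (stroke order, direction and timing).

```python
def rasterize(strokes, width, height):
    """Flatten on-line pen data into an off-line binary image.

    `strokes` is a list of strokes; each stroke is a list of (x, y, t)
    samples, where t is a timestamp. This layout is an assumption made
    for the example. Timing and ordering are discarded on purpose.
    """
    grid = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y, _t in stroke:
            grid[y][x] = 1  # mark the pixel as inked
    return grid


# The same horizontal bar, drawn left-to-right and right-to-left.
left_to_right = [[(0, 0, 0.00), (1, 0, 0.01), (2, 0, 0.02)]]
right_to_left = [[(2, 0, 0.00), (1, 0, 0.01), (0, 0, 0.02)]]

image_a = rasterize(left_to_right, width=3, height=1)
image_b = rasterize(right_to_left, width=3, height=1)
```

Here `image_a` and `image_b` are identical even though the pen movements differ, so an off-line recognizer cannot tell the two apart while an on-line one can.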
Chinese characters
Chinese characters are logograms developed for the writing of Chinese. Chinese characters are the oldest continuously used system of writing in the world. By virtue of their widespread current use throughout East Asia and Southeast Asia, as well as their profound historic use throughout the Sinosphere, Chinese characters are among the most widely adopted writing systems in the world by number of users. The total number of Chinese characters ever to appear in a dictionary is in the tens of thousands, though most are graphic variants, were used historically and passed out of use, or are of a specialized nature.
Pattern recognition
Pattern recognition is the automated recognition of patterns and regularities in data. While similar, pattern recognition (PR) is not to be confused with pattern machines (PM), which may possess PR capabilities but whose primary function is to distinguish and create emergent patterns. PR has applications in statistical data analysis, signal processing, information retrieval, bioinformatics, data compression, computer graphics and machine learning.
Penmanship
Penmanship is the technique of writing with the hand using a writing instrument. Today, this is most commonly done with a pen or pencil, but throughout history it has included many different implements. The various generic and formal historical styles of writing are called "hands", while an individual's style of penmanship is referred to as "handwriting". The earliest example of systematic writing is the Sumerian pictographic system found on clay tablets, which eventually developed around 3200 BC into a modified version called cuneiform, which was impressed on wet clay with a sharpened reed.
Chinese character classification
All Chinese characters are logograms, but several different types can be identified, based on the manner in which they are formed or derived. There are a handful which derive from pictographs and a number which are ideographic in origin, including compound ideographs, but the vast majority originated as phono-semantic compounds. The other categories in the traditional system of classification are rebus or phonetic loan characters and "derivative cognates".
Speech recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research from the fields of computer science, linguistics and computer engineering. The reverse process is speech synthesis.
Glagolitic script
The Glagolitic script (/ˌɡlæɡəˈlɪtɪk/; glagolitsa) is the oldest known Slavic alphabet. It is generally agreed to have been created in the 9th century by Saint Cyril, a monk from Thessalonica. He and his brother Saint Methodius were sent by the Byzantine Emperor Michael III in 863 to Great Moravia to spread Christianity among the West Slavs in the area. The brothers decided to translate liturgical books into the contemporary Slavic language understandable to the general population (now known as Old Church Slavonic).
Camel case
Camel case (sometimes stylized as camelCase or CamelCase, also known as camel caps or more formally as medial capitals) is the practice of writing phrases without spaces or punctuation and with capitalized words. The format indicates the first word starting with either case, then the following words having an initial uppercase letter. Common examples include "YouTube", "iPhone" and "eBay". Camel case is often used as a naming convention in computer programming.
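As a rough illustration of the rule described above, the following Python sketch converts a phrase into camel case. The function name `to_camel_case` and its `upper_first` parameter are assumptions made for this example, not part of any standard library; the sketch handles only ASCII letters and digits and lowercases the rest of each word, which is one convention among several.

```python
import re


def to_camel_case(phrase: str, upper_first: bool = False) -> str:
    """Join the words of `phrase` using medial capitals.

    The first word may start with either case (controlled by the
    hypothetical `upper_first` flag); every following word starts
    with an uppercase letter.
    """
    # Split on any run of characters that are not ASCII letters or digits,
    # which drops spaces and punctuation.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", phrase) if w]
    if not words:
        return ""
    first = words[0].lower()
    if upper_first:
        first = first[0].upper() + first[1:]
    # Capitalize the initial letter of each remaining word.
    rest = [w[0].upper() + w[1:].lower() for w in words[1:]]
    return first + "".join(rest)


lower_camel = to_camel_case("optical character recognition")
upper_camel = to_camel_case("optical character recognition", upper_first=True)
```

With these inputs, `lower_camel` is `"opticalCharacterRecognition"` and `upper_camel` is `"OpticalCharacterRecognition"`.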
Statistical classification
In statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagnosis to a given patient based on observed characteristics of the patient (sex, blood pressure, presence or absence of certain symptoms, etc.). Often, the individual observations are analyzed into a set of quantifiable properties, known variously as explanatory variables or features.
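To make the idea of assigning an observation to a category from its features concrete, here is a minimal sketch of a nearest-centroid classifier in Python. It is one simple technique chosen for illustration, not the method the text prescribes. The two features (number of links, fraction of uppercase characters) and the training values are invented for the example.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums = {}
    counts = defaultdict(int)
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in vector]
        for label, vector in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training data: (number of links, fraction of uppercase text).
samples = [(8, 0.9), (6, 0.7), (1, 0.1), (0, 0.2)]
labels = ["spam", "spam", "non-spam", "non-spam"]

centroids = fit_centroids(samples, labels)
prediction_a = classify(centroids, (5, 0.6))
prediction_b = classify(centroids, (1, 0.05))
```

In this toy setup `prediction_a` comes out as `"spam"` and `prediction_b` as `"non-spam"`. Because the features are on different scales, a real application would usually normalize them before measuring distances.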