Summary
Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard is maintained by the Unicode Consortium. As of version 15.0, it defines 149,186 characters covering 161 modern and historic scripts, as well as symbols, thousands of emoji (including colour emoji), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, JSON, and most modern programming languages, in some cases only in its UTF-8 form.
The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code identical to the other. The Unicode Standard, however, includes more than just the base code. Alongside the character encodings, the Consortium's official publication includes a wide variety of details about the scripts and how to display them: normalization rules, decomposition, collation, rendering, and bidirectional display order for multilingual text, among others. The Standard also includes reference data files and visual charts to help developers and designers implement the repertoire correctly.
Unicode can be stored using several different encodings, which translate the character codes (code points) into sequences of bytes. The Unicode Standard defines three encodings (UTF-8, UTF-16, and UTF-32), but several others exist, mostly variable-length encodings. The most common encodings are the ASCII-compatible UTF-8; the ASCII-incompatible UTF-16 (compatible with the obsolete UCS-2); and GB18030, the Chinese national encoding standard, which is not part of The Unicode Standard but is used in China and implements Unicode fully.
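As a rough illustration of how one code point becomes different byte sequences under different encodings, the following Python sketch compares UTF-8, UTF-16, UTF-32, and GB18030, and checks that a precomposed character and its decomposed form are equal after normalization. The helper names (`encoded_lengths`, `round_trips`, `is_canonically_equal`) and the sample characters are illustrative choices, not part of the standard; the codecs and `unicodedata.normalize` come from the Python standard library.

```python
import unicodedata

# Codec names as registered in Python's standard library.
# The "-be" variants are used so that no byte order mark is prepended.
ENCODINGS = ("utf-8", "utf-16-be", "utf-32-be", "gb18030")


def encoded_lengths(ch: str) -> dict:
    """Return how many bytes each encoding needs for the given text."""
    return {name: len(ch.encode(name)) for name in ENCODINGS}


def round_trips(text: str) -> bool:
    """Check that every encoding can represent the text without loss."""
    return all(text.encode(name).decode(name) == text for name in ENCODINGS)


def is_canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # Sample characters (an assumption for illustration):
    # U+0041 LATIN CAPITAL LETTER A, U+00E9 LATIN SMALL LETTER E WITH ACUTE,
    # U+20AC EURO SIGN, U+1F600 GRINNING FACE.
    for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}", encoded_lengths(ch))

    # Precomposed e-acute versus "e" followed by U+0301 COMBINING ACUTE ACCENT.
    print(is_canonically_equal("\u00e9", "e\u0301"))
```

UTF-8 uses one byte for ASCII characters and up to four for others, UTF-16 uses two or four bytes (a surrogate pair for code points above U+FFFF), and UTF-32 always uses four.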
Unicode has the explicit aim of transcending the limitations of traditional character encodings, such as those defined by the ISO/IEC 8859 standard, which are widely used in various countries but remain largely incompatible with one another.
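The incompatibility between ISO/IEC 8859 parts can be seen with a single byte. The sketch below decodes the byte 0xE9 under three parts of ISO/IEC 8859 using Python's standard codecs; the helper name `decode_all` and the choice of byte are assumptions made for illustration only.

```python
# One byte, three legacy interpretations.
RAW = bytes([0xE9])

# Parts 1 (Latin-1), 5 (Cyrillic), and 7 (Greek) of ISO/IEC 8859,
# using codec aliases accepted by Python's standard library.
LEGACY = ("iso-8859-1", "iso-8859-5", "iso-8859-7")


def decode_all(raw: bytes) -> dict:
    """Decode the same bytes under each legacy encoding."""
    return {name: raw.decode(name) for name in LEGACY}


if __name__ == "__main__":
    for name, ch in decode_all(RAW).items():
        # Each interpretation maps to a different Unicode code point,
        # and therefore to a different UTF-8 byte sequence.
        print(name, f"U+{ord(ch):04X}", ch, ch.encode("utf-8").hex())
```

Without knowing which part was used, the byte is ambiguous. In Unicode, each of the three resulting characters has its own code point, so text mixing Latin, Cyrillic, and Greek can live in one string.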