Lecture

Text Encoding: Unicode and XML

Description

This lecture covers the evolution of text encoding from ASCII to Unicode, which makes it possible to represent many different writing systems. It examines the challenges of encoding multilingual texts and the development of XML as a hierarchical representation of text. The discussion extends to the Text Encoding Initiative (TEI) and the complexities of encoding structured texts. It also covers the use of optical character recognition (OCR), handwritten text recognition (HTR), and crowdsourcing for text correction, as well as the handling of spelling variation in historical texts.
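As a rough illustration of two ideas named in the description, the sketch below shows why ASCII cannot represent an accented character that Unicode (here serialized as UTF-8) can, and how XML nests text in a hierarchy of elements. It is not taken from the lecture: the helper names `is_ascii`, `utf8_length`, and `element_tags`, the sample word, and the TEI-like fragment are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample word: "cafe" with an acute accent on the final letter.
WORD = "caf\u00e9"

# Hypothetical TEI-like fragment (simplified; not a valid TEI document).
SAMPLE_XML = "<text><body><p>Le <w>caf\u00e9</w> est ouvert.</p></body></text>"


def is_ascii(text):
    """Return True if every character fits in the 128-character ASCII set."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def utf8_length(text):
    """Return the number of bytes needed to store the text as UTF-8."""
    return len(text.encode("utf-8"))


def element_tags(xml_string):
    """Return element names in document order, root first."""
    root = ET.fromstring(xml_string)
    return [element.tag for element in root.iter()]


if __name__ == "__main__":
    print(is_ascii("cafe"), is_ascii(WORD))   # plain word vs. accented word
    print(len(WORD), utf8_length(WORD))       # characters vs. UTF-8 bytes
    print(element_tags(SAMPLE_XML))           # nesting, outermost first
```

The accented word has four characters but occupies five bytes in UTF-8, because the accented letter needs two bytes. The element list reflects the nesting of the fragment, from the outermost element to the innermost.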
