This lecture covers the evolution of text encoding from ASCII to Unicode, which can represent text in many languages and scripts. It examines the Unicode standard, the challenges of multilingual word processing, and the role of XML in structuring textual data. It also explores the difficulties of encoding text hierarchically, the use of the Text Encoding Initiative (TEI) for text annotation, and advances in Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR).
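As a rough illustration of two of these topics, the sketch below shows how characters outside the ASCII range map to Unicode code points and UTF-8 bytes, and how named elements can be pulled from an XML fragment. It is not taken from the lecture: the helper names (`describe`, `is_ascii`, `place_names`) and the sample fragment are hypothetical, and the fragment only imitates TEI markup (real TEI documents declare the TEI namespace, which is omitted here for brevity).

```python
import xml.etree.ElementTree as ET


def describe(text):
    """Return (character, code point, UTF-8 bytes as hex) for each character."""
    return [(ch, f"U+{ord(ch):04X}", ch.encode("utf-8").hex()) for ch in text]


def is_ascii(text):
    """True if every character falls within the 7-bit ASCII range (0-127)."""
    return all(ord(ch) < 128 for ch in text)


# Hypothetical TEI-style fragment; the element name follows TEI's <placeName>,
# but the namespace declaration is left out to keep the example short.
SAMPLE = "<p>From <placeName>Genève</placeName> to <placeName>Zürich</placeName>.</p>"


def place_names(xml_text):
    """Collect the text of every <placeName> element in the fragment."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("placeName")]


if __name__ == "__main__":
    for row in describe("Aé"):
        print(row)
    print(is_ascii("Geneve"), is_ascii("Genève"))
    print(place_names(SAMPLE))
```

In this sketch, "A" occupies a single byte in UTF-8, identical to its ASCII value, while "é" needs two bytes. That backward compatibility is one reason UTF-8 became a common way to serialize Unicode text.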