This lecture discusses why describing document content is complex and how new open standards developed over the last decade address the problem. It explores the regulated representation approach for creating generic descriptions of document information structure, the extraction of document content, and the construction of world models. The lecture also covers the general structure of the document pipeline, the challenges of modeling document structure and content, and the Open Annotation Model together with its data model. Finally, it examines the concepts of modeling content, circulations, dimensions, and homologous pairs in document analysis.
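To make the annotation data model mentioned above more concrete, here is a minimal sketch in Python. It does not reproduce the lecture's own material. It assumes the W3C Web Annotation Data Model (the standardized successor of Open Annotation), in which an annotation links a body (what is said) to a target (the part of the document it is said about). The function name `make_annotation`, the `example.org` URLs, the sample text, and the character offsets are all hypothetical.

```python
import json


def make_annotation(annotation_id, body_text, target_source, start, end):
    """Build a minimal annotation linking a textual body to a character span of a target document.

    Assumption: the structure follows the W3C Web Annotation Data Model;
    every identifier and value passed in here is illustrative only.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        # The body carries the content of the annotation itself.
        "body": {
            "type": "TextualBody",
            "value": body_text,
            "format": "text/plain",
        },
        # The target identifies the document and, via a selector, the span being annotated.
        "target": {
            "source": target_source,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
            },
        },
    }


# Hypothetical example: annotate characters 120 to 158 of a transcribed page.
annotation = make_annotation(
    "http://example.org/anno/1",
    "Signature of the notary",
    "http://example.org/documents/page-12",
    120,
    158,
)
print(json.dumps(annotation, indent=2))
```

Keeping the body and the target as separate objects means the same document region can carry several independent annotations, which may be one reason such a model suits a generic description of document structure and content.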