Long held on library and archive shelves, historical newspapers are currently undergoing mass digitization, and millions of facsimiles, along with their machine-readable content acquired via Optical Character Recognition, are becoming accessible through a variety of online portals. While this represents a major step forward in terms of preservation of and access to documents, much remains to be done to provide extensive and sophisticated access to the content of these digital resources. We believe that the promise of newspaper digitization lies in the semantic indexing of their content, closely tied to the development of co-designed interfaces that accommodate text analysis research tools and their use by humanities scholars. How can we go beyond keyword search? How can we explore vast and complex amounts of data? Drawing on the ongoing project ‘impresso - Media Monitoring of the Past’, in this talk I will present our interdisciplinary approach and share hands-on experience in going from facsimiles to enhanced search and visualization capacities that support historical research.
Simon François Dumas Primbault
Jiyong Zhang, Yue Wang, Hongyu Yang