Mass digitization and the opening of digital libraries have given access to a huge number of historical newspapers. To bring structure to these documents, current techniques generally proceed in two distinct steps: first, they segment the digitized images into generic articles, and then they classify the text of the articles into finer-grained categories. Unfortunately, by losing the link between layout and text, these two steps cannot account for the fact that newspaper content items have distinctive visual features. This project proposes two main novelties. First, it merges the segmentation and classification steps, resulting in a fine-grained semantic segmentation of newspaper images. Second, it uses textual features, in the form of embedding maps, at the segmentation step. The semantic segmentation with four categories (feuilleton, weather forecast, obituary, and stock exchange table) is performed with a fully convolutional neural network and reaches a mIoU of 79.3%. The introduction of embedding maps improves overall performance by 3% and generalization across time and newspapers by 8% and 12%, respectively. This shows strong potential in considering the semantic aspect when segmenting newspapers and in using textual features to improve generalization.
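To make the two technical ideas in the abstract more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the project's actual code: the function names `build_embedding_map` and `mean_iou`, the word-box input format, and the choice to skip classes absent from both prediction and ground truth are all hypothetical. The first function rasterizes per-word embedding vectors into a map that can be stacked with the image channels; the second computes a mean intersection-over-union (mIoU) over class label maps.

```python
import numpy as np


def build_embedding_map(height, width, words, dim):
    """Rasterize word embeddings into a (height, width, dim) array.

    `words` is an assumed input format: a list of (x0, y0, x1, y1, vector)
    tuples giving each word's pixel box and its embedding vector.
    Pixels not covered by any word box stay at zero.
    """
    emb_map = np.zeros((height, width, dim), dtype=np.float32)
    for x0, y0, x1, y1, vec in words:
        # Broadcast the word vector over every pixel of its bounding box.
        emb_map[y0:y1, x0:x1, :] = np.asarray(vec, dtype=np.float32)
    return emb_map


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps.

    Classes that appear in neither map are skipped (one common convention;
    the source does not say which convention it uses).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


# Usage: stack a toy RGB page with its embedding map along the channel axis,
# giving the kind of combined input a segmentation network could consume.
image = np.zeros((4, 6, 3), dtype=np.float32)
emb = build_embedding_map(4, 6, [(1, 0, 3, 2, [1.0, 2.0, 3.0])], dim=3)
network_input = np.concatenate([image, emb], axis=-1)  # shape (4, 6, 6)

# Usage: score a toy two-class prediction against its ground truth.
score = mean_iou([[0, 0], [1, 1]], [[0, 1], [1, 1]], num_classes=2)
```

Stacking the embedding map as extra input channels is only one possible way to feed text to the network; the abstract says the maps are used at the segmentation step but does not specify how they are injected.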
Devis Tuia, Valérie Zermatten, Javiera Francisca Castillo Navarro, Lloyd Haydn Hughes