Comic book digitization could play a pivotal role in exploring new ways of consuming digital comics. At present, existing systems fall short of complete digitization. The task requires understanding the content of comic books, which can be built up from sub-tasks such as identifying and extracting comic book content, extracting and analyzing text, deriving associations between characters and speech balloons, and analyzing reading styles. This paper first presents an analysis of several object detection models for detecting semantic elements. Under the constraint of limited computational power, YOLOv3 proved the best suited of the models evaluated. The paper then focuses on extracting and recognizing text with Optical Character Recognition, and on distance-based methods for identifying associable speech balloons and for deriving character-speech balloon associations under given constraints. The presented association method is more accurate than the Euclidean distance-based method. Finally, an analysis of comic styles is presented, along with a learning model that determines the reading order of comics with an accuracy of 0.89.
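For orientation, the Euclidean distance-based baseline mentioned above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the bounding-box format `(x_min, y_min, x_max, y_max)`, the use of box centers, and the function names `center` and `associate_balloons` are all assumptions made for this example. The improved association method reported in the abstract is not reproduced here.

```python
import math


def center(box):
    """Return the (x, y) center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def associate_balloons(balloons, characters):
    """Map each balloon index to the index of the character whose box center
    is nearest in Euclidean distance (None if there are no characters)."""
    associations = {}
    for b_idx, balloon in enumerate(balloons):
        bx, by = center(balloon)
        best_idx, best_dist = None, math.inf
        for c_idx, character in enumerate(characters):
            cx, cy = center(character)
            dist = math.hypot(bx - cx, by - cy)
            if dist < best_dist:
                best_idx, best_dist = c_idx, dist
        associations[b_idx] = best_idx
    return associations


# Hypothetical detections for a single panel (pixel coordinates).
balloons = [(10, 10, 60, 40), (200, 20, 260, 50)]
characters = [(20, 60, 70, 160), (190, 70, 250, 170)]
print(associate_balloons(balloons, characters))  # {0: 0, 1: 1}
```

A baseline like this ignores panel boundaries and balloon tails, which is one plausible reason a constrained method could outperform it.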
Sabine Süsstrunk, Mathieu Salzmann, Deblina Bhattacharjee