Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
This paper focuses on the crowd-annotation of an ancient Maya glyph dataset derived from the three ancient codices that survived up to date. More precisely, non-expert annotators are asked to segment glyph-blocks into their constituent glyph entities. As a means of supervision, available glyph variants are provided to the annotators during the crowdsourcing task. Compared to object recognition in natural images or handwriting transcription tasks, designing an engaging task and dealing with crowd behavior is challenging in our case. This challenge originates from the inherent complexity of Maya writing and an incomplete understanding of the signs and semantics in the existing catalogs. We elaborate on the evolution of the crowdsourcing task design, and discuss the choices for providing supervision during the task. We analyze the distributions of similarity and task difficulty scores, and the segmentation performance of the crowd. A unique dataset of over 9000 Maya glyphs from 291 categories individually segmented from the three codices was created and will be made publicly available thanks to this process. This dataset lends itself to automatic glyph classification tasks. We provide baseline methods for glyph classification using traditional shape descriptors and convolutional neural networks.
Pascal Fua, Nikita Durasov, Doruk Oner, Minh Hieu Lê