Publication

Lausanne Historical Censuses Dataset HTR 35k

Related publications (36)

Subjective performance evaluation of bitrate allocation strategies for MPEG and JPEG Pleno point cloud compression

Touradj Ebrahimi, Michela Testolina, Davi Nachtigall Lazzarotto

The recent rise in interest in point clouds as an imaging modality has motivated standardization groups such as JPEG and MPEG to launch activities aiming at developing compression standards for point clouds. Lossy compression usually introduces visual arti ...
Springer, 2024

Post-correction of Historical Text Transcripts with Large Language Models: An Exploratory Study

Frédéric Kaplan, Maud Ehrmann, Matteo Romanello, Sven-Nicolas Yoann Najem, Emanuela Boros

The quality of automatic transcription of heritage documents, whether from printed, manuscript or audio sources, has a decisive impact on the ability to search and process historical texts. Although significant progress has been made in text recognition ( ...
The Association for Computational Linguistics, 2024

1805-1898 Census Records of Lausanne : a Long Digital Dataset for Demographic History

Isabella Di Lenardo, Lucas Arnaud André Rappo, Rémi Guillaume Petitpierre, Marion Kramer

This historical dataset stems from the project of automatic extraction of 72 census records of Lausanne, Switzerland. The complete dataset covers a century of historical demography in Lausanne (1805-1898), which corresponds to 18,831 pages, and nearly 6 mi ...
Zenodo, 2023

Unsupervised Term Extraction for Highly Technical Domains

Diego Matteo Antognini

Term extraction is an information extraction task at the root of knowledge discovery platforms. Developing term extractors that are able to generalize across very diverse and potentially highly technical domains is challenging, as annotations for domains r ...
2022

Towards effective visual information storage on DNA support

Touradj Ebrahimi, Michela Testolina, Luka Secilmis

DNA is an excellent medium for efficient storage of information. Not only does it offer a long-term and robust mechanism, but it is also environmentally friendly and has an unparalleled storage capacity. However, the basic elements in DNA are quaternary, and ther ...
2022

GenIE: Generative Information Extraction

Robert West, Maxime Jean Julien Peyrard, Martin Josifoski

Structured and grounded representation of text is typically formalized by closed information extraction, the problem of extracting an exhaustive set of (subject, relation, object) triplets that are consistent with a predefined set of entities and relations ...
Association for Computational Linguistics (ACL), 2022

Benchmarking JPEG XL image compression

Touradj Ebrahimi, Evgeniy Upenik

JPEG XL is a practical, royalty-free codec for scalable web distribution and efficient compression of high-quality photographs. It also includes previews, progressiveness, animation, transparency, high dynamic range, wide color gamut, and high bit depth. U ...
SPIE, 2020
