This paper discusses Kamusi Pre:D, a system that improves translation by disambiguating word senses in a source document against a large concept-based lexicon aligned by sense across numerous languages. Currently under active development, the program prompts users to select the intended meaning wherever a polysemous term occurs, and lets them select a multiword expression (MWE) instead of its individual words when the MWE is a lexicalized dictionary entry. The disambiguated text is then automatically matched to sense-specific translation equivalents that have been aligned across languages. Pre:D is intended to integrate with existing translation tools while greatly improving accuracy by involving human intelligence in vocabulary selection, both through manual review of ambiguous terms in the document and by reference to the underlying curated multilingual Kamusi dictionary data. Pre:D will aid accurate vocabulary translation among a wide range of language pairs, most of them currently unserved, and offer significant advantages in time, effort, and quality for multilingual translation projects: a document is disambiguated once, for concepts that can then be rendered appropriately across numerous languages.
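The workflow described above (find lexicon entries, ask the user about polysemous ones, then look up sense-aligned equivalents) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `SENSES` and `TRANSLATIONS` tables, the concept identifiers, and the function names are hypothetical and do not reflect Kamusi's actual data or interfaces. As a simplification, multiword expressions are matched greedily by longest span rather than offered to the user as an option.

```python
from typing import Callable, Dict, List, Optional, Tuple

Sense = Tuple[str, str]  # (concept_id, gloss)

# Hypothetical toy lexicon: surface form -> candidate senses.
SENSES: Dict[str, List[Sense]] = {
    "bank": [("bank.finance", "financial institution"),
             ("bank.river", "edge of a river")],
    "hot dog": [("hot_dog.food", "sausage served in a bun")],
}

# Hypothetical sense-aligned equivalents: concept_id -> language -> term.
TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "bank.finance": {"fr": "banque", "de": "Bank"},
    "bank.river": {"fr": "rive", "de": "Ufer"},
    "hot_dog.food": {"fr": "hot-dog", "de": "Hotdog"},
}


def disambiguate(
    tokens: List[str],
    choose: Callable[[str, List[Sense]], str],
    max_len: int = 3,
) -> List[Tuple[str, Optional[str]]]:
    """Tag each token or multiword span with a concept id (None if unknown)."""
    tagged: List[Tuple[str, Optional[str]]] = []
    i = 0
    while i < len(tokens):
        # Try the longest span first so "hot dog" wins over "hot" + "dog".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n]).lower()
            senses = SENSES.get(span)
            if senses:
                # Only polysemous entries need a human decision.
                concept = senses[0][0] if len(senses) == 1 else choose(span, senses)
                tagged.append((span, concept))
                i += n
                break
        else:
            # No lexicon entry starts here; keep the token unchanged.
            tagged.append((tokens[i], None))
            i += 1
    return tagged


def render(tagged: List[Tuple[str, Optional[str]]], lang: str) -> List[str]:
    """Replace each tagged concept with its equivalent in `lang`, if any."""
    return [
        TRANSLATIONS.get(concept, {}).get(lang, surface) if concept else surface
        for surface, concept in tagged
    ]


if __name__ == "__main__":
    sentence = "She ate a hot dog near the bank".split()
    # Stand-in for the interactive prompt: always pick the river sense.
    tagged = disambiguate(sentence, lambda span, senses: "bank.river")
    print(render(tagged, "fr"))
    print(render(tagged, "de"))
```

The point of the sketch is that `disambiguate` runs once per document, while `render` can be called for any number of target languages without asking the user again.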