Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?
Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur Graph Search.
Next-generation sequencing has revolutionized the field of genomics by producing accurate, rapid and cost-effective genome analysis with the use of high throughput sequencing technologies. This has intensified the need for accurate and performance efficient genome assemblers to assemble a large set of short reads produced by next-generation sequencing technology. Genome assembly is an NP-hard problem that is computationally challenging. Therefore, the current methods that rely on heuristic and approximation algorithms to assemble genomes prevent them from arriving at the most accurate solution. This paper presents a novel approach by gamifying whole-genome shotgun assembly from next-generation sequencing data; we present "Geno", a human-computing game designed with the aim of improving the accuracy of whole-genome shotgun assembly. We evaluate the feasibility of crowdsourcing the problem of whole-genome shotgun assembly by breaking the problem into small subtasks. The evaluation results, for single-cell Escherichia coli K-12 substr. MG1655 with a read length of 25 bp that produced 144,867 game instances of mean 25 sequences per instance at 40x coverage indicate the feasibility of sub-tasking the problem of genome assembly to be solved using crowdsourcing.
Tamar Kohn, Xavier Fernandez Cassi
Bart Deplancke, Daniel Migliozzi, Gilles Weder, Riccardo Dainese, Daniel Alpern, Hüseyin Baris Atakan, Mustafa Demir, Dariia Gudkova