In recent years, DNA sequence comparison and database search have become a field of strong competition among several reconfigurable hardware computing groups. In this paper we present a BLAST preprocessor that efficiently marks the parts of the database that may produce matches. Our prefiltering approach significantly reduces the size of the database that needs to be fully processed by BLAST, with a corresponding reduction in the run-time of the algorithm. We have implemented our architecture, evaluated its effectiveness for a variety of databases and queries, and compared its accuracy against the original NCBI BLAST implementation. We have found that prefiltering reduces the database space that needs to be fully searched by at least a factor of 5 and by up to three orders of magnitude. Because it acts purely as a prefilter, our approach can be combined with all major reconfigurable acceleration architectures presented to date.
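To make the idea of marking database regions concrete, the following is a minimal software sketch of one possible prefilter. It is not the architecture described above: it assumes that a region is worth passing to BLAST if it contains at least one exact word of length `w` from the query, and that the database is split into fixed-size blocks. The function names, `w`, and `block_size` are all hypothetical choices made for illustration.

```python
def query_words(query, w):
    """Return the set of all length-w substrings (words) of the query."""
    return {query[i:i + w] for i in range(len(query) - w + 1)}


def mark_blocks(database, query, w=4, block_size=16):
    """Mark each fixed-size database block that shares a word with the query.

    Assumption for illustration: one exact word hit is enough to mark a block.
    Returns a list of booleans, one per block.
    """
    words = query_words(query, w)
    n_blocks = (len(database) + block_size - 1) // block_size
    marks = [False] * n_blocks
    for i in range(len(database) - w + 1):
        if database[i:i + w] in words:
            # A word may straddle a block boundary, so mark both the block
            # where it starts and the block where it ends.
            marks[i // block_size] = True
            marks[(i + w - 1) // block_size] = True
    return marks


def reduction_factor(marks):
    """Ratio of total blocks to marked blocks (infinite if nothing is marked)."""
    kept = sum(marks)
    return len(marks) / kept if kept else float("inf")


# Toy database of three 16-character blocks; only the middle one contains "GTAC".
database = "A" * 16 + "ACGTACGT" + "A" * 8 + "C" * 16
marks = mark_blocks(database, "GTAC")
print(marks, reduction_factor(marks))
```

In this toy case only the middle block is marked, so a full search would touch one third of the database. Only the marked blocks would then be handed to BLAST, which is why a prefilter of this kind can sit in front of any downstream accelerator.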