Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Implementing machine learning algorithms for large data, such as the Web graph and social networks, is challenging. Even though much research has focused on making sequential algorithms more scalable, their running times continue to be prohibitively long. Meanwhile, parallelization remains a formidable challenge for this class of problems, despite frameworks like MapReduce which hide much of the associated complexity. We present three ongoing efforts within our team, previously presented at venues in other fields, which aim to make it easier for machine learning researchers and practitioners alike to quickly implement and experiment with their algorithms in a parallel or distributed setting. Furthermore, we hope to highlight some of the language features unique to the Scala programming language in the treatment of our frameworks, in an effort to show how these features can be used to produce efficient and correct parallel systems more easily than ever before.
Guillaume André Fradji Martres