The immutable Vector collection in the Scala library offers nearly constant-time random-access reads thanks to its underlying wide tree data structure. Furthermore, it provides amortized constant-time sequential read, update, append and prepend operations by efficiently sharing parts of the tree between different immutable vectors. However, the performance of parallel operations is hindered by the overhead of re-combining the results obtained from parallel workers. Recent research has shown that parallel performance can be improved by relaxing the wide tree invariants, which makes the combine operation faster. Although tree sharing is still used in that approach, sequential read, update, append and prepend are no longer amortized constant time. This prevents the new approach from making its way into the Scala Collections library. The main insight of this thesis is that relaxed-invariant vector trees can be seen as a composition of strict and irregular parts. Therefore, earlier optimizations can still be applied to strict subtrees, while irregular subtrees bring their own new optimization opportunities. This allows our implementation to provide both amortized constant-time sequential operations and efficient parallel execution. Our implementation, which is fully compatible with Scala Collections, matches the sequential performance of standard vectors in most cases. In addition, benchmarks show that parallel operations execute up to 2.3x faster with 4 threads on a 4-core machine, because the relaxed invariants allow fast re-combination of results from different parallel executions.
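The operations named in the abstract can be illustrated with a minimal sketch on the standard `scala.collection.immutable.Vector`. This is not the thesis implementation: the object name `VectorSketch`, the sample values and the two "worker" halves are assumptions made for illustration, and the combine step is shown sequentially as a plain concatenation rather than with parallel collections.

```scala
// Illustrative sketch on the standard immutable Vector (not the thesis code).
// All names and values below are hypothetical.
object VectorSketch {
  // Eight elements: 0, 10, 20, ..., 70.
  val base: Vector[Int] = Vector.tabulate(8)(i => i * 10)

  // Random-access read by index.
  val fourth: Int = base(3)

  // Each operation returns a new vector; `base` itself is never modified.
  val appended: Vector[Int]  = base :+ 80
  val prepended: Vector[Int] = -10 +: base
  val updated: Vector[Int]   = base.updated(3, 99)

  // Combine step: two hypothetical workers each process one half,
  // and their partial results are joined by concatenation.
  val left: Vector[Int]     = base.take(4).map(_ + 1)
  val right: Vector[Int]    = base.drop(4).map(_ + 1)
  val combined: Vector[Int] = left ++ right

  def main(args: Array[String]): Unit = {
    println(s"base      = $base")
    println(s"appended  = $appended")
    println(s"prepended = $prepended")
    println(s"updated   = $updated")
    println(s"combined  = $combined")
  }
}
```

The concatenation `left ++ right` corresponds to the re-combination of worker results whose cost the abstract identifies as the bottleneck for parallel operations.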
Michel Bierlaire, Nikola Obrenovic, Selin Ataç