We consider the scheduling of sliding window query execution in stream processing engines. Sliding windows are an essential building block for limiting the focus of a query to a particular part of the stream, based on either value counts or time ranges. These so-called sliding window predicates specify the execution condition for the query. Because the number of registered queries is often massive, efficient algorithms for checking these predicates are essential. While there is a comprehensive body of work on stream processing techniques, the actual algorithms for intelligently deciding on sliding behavior are not extensively addressed in existing work. In this paper we propose a set of algorithms for managing and sharing sliding decisions. This work introduces the concepts of batch sliding and sliding graphs to improve the sliding decisions of stream processing engines. Our algorithms can be used efficiently in large-scale stream processing systems where data arrives at high rates and a large number of user queries are registered on these data streams. Our evaluation results show the suitability of this approach for real-world applications.
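To make the notion of a sliding window predicate concrete, the following is a minimal Python sketch of a count-based and a time-based predicate, each checked on every arriving tuple. It is an illustration under stated assumptions, not the algorithms proposed in the paper: the names `CountSlide`, `TimeSlide`, `should_execute`, and `schedule` are hypothetical, and the per-tuple, per-query loop is the naive baseline whose cost grows with the number of registered queries. Batch sliding and sliding graphs are not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CountSlide:
    """Hypothetical count-based predicate: fire once every `slide` tuples."""
    slide: int
    seen: int = 0

    def should_execute(self, timestamp: float) -> bool:
        # The timestamp is ignored; only the number of arrivals matters.
        self.seen += 1
        if self.seen >= self.slide:
            self.seen = 0
            return True
        return False


@dataclass
class TimeSlide:
    """Hypothetical time-based predicate: fire when at least `slide`
    time units have elapsed since the last execution."""
    slide: float
    last_fired: Optional[float] = None

    def should_execute(self, timestamp: float) -> bool:
        if self.last_fired is None:
            # The first tuple only starts the clock.
            self.last_fired = timestamp
            return False
        if timestamp - self.last_fired >= self.slide:
            self.last_fired = timestamp
            return True
        return False


def schedule(timestamps, predicates):
    """Naive scheduler: evaluate every predicate on every arriving tuple
    and record the timestamps at which each query would execute."""
    fired = {name: [] for name in predicates}
    for ts in timestamps:
        for name, pred in predicates.items():
            if pred.should_execute(ts):
                fired[name].append(ts)
    return fired


if __name__ == "__main__":
    queries = {"every_3_tuples": CountSlide(slide=3),
               "every_2_seconds": TimeSlide(slide=2.0)}
    print(schedule([0, 1, 2, 3, 4, 5, 6], queries))
```

With many registered queries, this loop performs one predicate check per query per tuple, which is the overhead that sharing sliding decisions across queries is meant to reduce.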
Marco Mattavelli, Simone Casale Brunet, Aurélien François Gilbert Bloch