The dramatic rise of streaming time-series data produced in a variety of contexts, such as stock markets, mobile sensing, sensor networks, and data centre monitoring, has fuelled the development of large-scale distributed real-time computation systems (e.g., Apache Storm, Spark Streaming, S4). However, it is still unclear how certain important tasks, which can be performed with relative ease in a centralized system, could be performed using such distributed systems. In this paper, we focus on one such task: continuously discovering correlations among a large number of streaming time series. While doing so, we address two key challenges: (1) the number of time-series pairs that have to be analyzed grows quadratically (O(n^2)) in the number of time series n, giving rise to a quadratic increase in the communication cost between different nodes of the distributed system; (2) as the size of the time series grows, the computational and communication costs again increase at a prohibitive rate.

To tackle these challenges, we propose an approach referred to as AEGIS. First, AEGIS approximates a group of streams using affine transformations. It then communicates only these stream groups, which are smaller in size, and thereby significantly reduces the communication overhead. Second, AEGIS dramatically enhances computational efficiency by exploiting the properties of affine transformations to prune the number of evaluated correlations. As baselines, we adapt well-known centralized correlation computation approaches to the distributed environment. Our extensive experimental evaluations on real and synthetic datasets establish that AEGIS outperforms the baseline approaches in terms of communication cost, processing latency, and peak capacity.
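To make the two ideas above concrete, the following is a minimal sketch and not the AEGIS algorithm itself. It illustrates (a) how the number of pairs grows quadratically with n, and (b) a standard property that makes affine relationships useful for pruning: the Pearson correlation coefficient is unchanged when one series is replaced by a positively scaled affine copy, and only flips sign when the scale is negative. The choice of Pearson correlation, the helper names (`pearson`, `num_pairs`), and the sample values are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def num_pairs(n):
    """Number of distinct pairs among n time series: n(n-1)/2, i.e. O(n^2)."""
    return n * (n - 1) // 2


# Hypothetical sample windows from two streams (illustrative values only).
base = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
other = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

# Two streams that are exact affine transformations of `base`:
# one with a positive scale, one with a negative scale.
scaled = [2.5 * v + 10.0 for v in base]
flipped = [-0.5 * v + 3.0 for v in base]

r_base = pearson(base, other)
r_scaled = pearson(scaled, other)    # same as r_base (positive scale)
r_flipped = pearson(flipped, other)  # negated r_base (negative scale)
```

Under these assumptions, once the correlation between `base` and `other` is known, the correlations for `scaled` and `flipped` follow from the sign of the scale factor without touching their raw values. Real streams are only approximately affine, so an actual system would also need to bound the approximation error.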