Publication

Fast Correlation Discovery for Large-Scale Streaming Time-Series Data

Karl Aberer, Saket Sathe, Tian Guo
2014
Report or working paper
Abstract

The dramatic rise of streaming time-series data produced in a variety of contexts, such as stock markets, mobile sensing, sensor networks, and data centre monitoring, has fuelled the development of large-scale distributed real-time computation systems (e.g., Apache Storm, Spark Streaming, S4). However, it is still unclear how certain important tasks, which can be performed with relative ease in a centralized system, could be performed using such distributed systems. In this paper, we focus on one such task: continuously discovering correlations among a large number of streaming time series. While doing so, we address two key challenges: (1) the number of time-series pairs that have to be analyzed grows quadratically (O(n^2)) in the number of time series n, giving rise to a quadratic increase in the communication cost between different nodes of the distributed system; (2) as the size of the time series grows, the computational and communication costs again increase at a prohibitive rate. To tackle these challenges, we propose an approach referred to as AEGIS. First, AEGIS approximates a group of streams using affine transformations. It then communicates only these stream groups, which are smaller in size, and therefore significantly reduces the communication overhead. Second, AEGIS dramatically enhances the computational efficiency by exploiting the properties of affine transformations to prune the number of evaluated correlations. As baselines, we adapt well-known centralized correlation computation approaches to the distributed environment. Our extensive experimental evaluations on real and synthetic datasets establish that AEGIS outperforms the baseline approaches in terms of communication cost, processing latency, and peak capacity.
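To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the AEGIS algorithm, whose details the abstract does not give. It only illustrates (a) how the number of pairs grows as n(n-1)/2, and (b) one well-known property of affine transformations that makes pruning plausible: Pearson correlation is unchanged when a series is scaled by a positive factor and shifted. The function names and the sample data are assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Division by zero occurs if either series is constant (not handled here).
    return cov / math.sqrt(var_x * var_y)


def pair_count(n):
    """Number of unordered pairs among n time series: n(n-1)/2, i.e. O(n^2)."""
    return n * (n - 1) // 2


def affine(x, scale, offset):
    """Apply the affine map v -> scale * v + offset to every sample."""
    return [scale * v + offset for v in x]


# Hypothetical sample windows from three streams.
s1 = [1.0, 2.0, 4.0, 3.0, 5.0]
s2 = [2.0, 1.0, 3.0, 5.0, 4.0]
s1_affine = affine(s1, 2.5, -7.0)  # a stream that is an affine image of s1

# The brute-force baseline evaluates every pair.
streams = {"s1": s1, "s2": s2, "s1_affine": s1_affine}
all_pairs = {
    (a, b): pearson(streams[a], streams[b])
    for a, b in combinations(streams, 2)
}
```

In this sketch, `pearson(s1_affine, s2)` matches `pearson(s1, s2)` up to floating-point error, so a system that knows `s1_affine` is a positive-scale affine image of `s1` would not need to evaluate that pair separately. How AEGIS actually groups streams and prunes pairs is described in the paper itself, not here.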

