This paper presents a new approach to measuring similarity over massive time-series data. Our approach is built on two principles. The first is to parallelize the large amount of computation using a scalable cloud serving system called TimeCloud. The second is to exploit the filter-and-refinement approach to query processing: the filter step computes similarities efficiently over approximated data, and the subsequent refinement step measures precise similarities only for the small number of candidates that survive filtering. To this end, we establish firm theoretical foundations, as well as techniques for processing kNN queries. Our experimental results suggest that the proposed approach is efficient and scalable.
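The filter-and-refinement idea for kNN queries can be illustrated with a minimal single-machine sketch. The abstract does not name the approximation or the distance measure, so the sketch assumes piecewise aggregate approximation (PAA) with Euclidean distance, for which the scaled distance between PAA vectors is a known lower bound on the true distance. All function names here (`paa`, `paa_lower_bound`, `knn_filter_refine`) are hypothetical, the series length is assumed to be divisible by the number of segments, and the parallel execution on TimeCloud is not modeled.

```python
import heapq
import math


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment.

    Assumes len(series) is divisible by segments.
    """
    seg_len = len(series) // segments
    return [
        sum(series[i * seg_len:(i + 1) * seg_len]) / seg_len
        for i in range(segments)
    ]


def paa_lower_bound(pa, pb, seg_len):
    """Lower bound on the exact Euclidean distance, computed from PAA vectors."""
    return math.sqrt(seg_len) * euclidean(pa, pb)


def knn_filter_refine(query, dataset, k, segments):
    """Return the k nearest series as (distance, index) pairs, plus the
    number of exact distance computations that were needed."""
    seg_len = len(query) // segments
    q_paa = paa(query, segments)

    # Filter step: cheap lower bounds over the approximated data.
    # (A real system would precompute and store the approximations.)
    bounds = sorted(
        (paa_lower_bound(q_paa, paa(s, segments), seg_len), i)
        for i, s in enumerate(dataset)
    )

    best = []  # max-heap of the current k best, stored as (-distance, index)
    refined = 0
    for lb, i in bounds:
        # Every remaining lower bound is at least lb, so no remaining
        # candidate can beat the current k-th best exact distance.
        if len(best) == k and lb >= -best[0][0]:
            break
        # Refinement step: exact distance for a surviving candidate only.
        d = euclidean(query, dataset[i])
        refined += 1
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))

    return sorted((-nd, i) for nd, i in best), refined
```

Because the filter uses a lower bound, the pruning never discards a true nearest neighbor; the result matches a brute-force scan while computing exact distances for only part of the dataset. How many candidates are pruned depends on how tight the approximation is for the data at hand.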
Pascal Frossard, Mireille El Gheche, Matthias Minder, Zahra Farsijani
Pramod Rastogi, Rishikesh Dilip Kulkarni