Publication

Adaptive partitioning and indexing for in situ query processing

Related publications (39)

Time- and Space-Efficient Spatial Data Analytics

Mirjana Pavlovic

Advances in data acquisition technologies and supercomputing for large-scale simulations have led to an exponential growth in the volume of spatial data. This growth is accompanied by an increase in data complexity, such as spatial density, but also by mor ...

EPFL, 2019

Efficient Query Processing for Spatial and Temporal Data Exploration

Eleni Tzirita Zacharatou

Core to many scientific and analytics applications are spatial data capturing the position or shape of objects in space, and time series recording the values of a process over time. Effective analysis of such data requires a shift from confirmatory pipelin ...

EPFL, 2019

Fingerprinting Big Data: The Case of KNN Graph Construction

Rachid Guerraoui, Anne-Marie Kermarrec, Olivier Ruas

We propose fingerprinting, a new technique that consists in constructing compact, fast-to-compute and privacy-preserving binary representations of datasets. We illustrate the effectiveness of our approach on the emblematic big data problem of K-Nearest-Nei ...

2019

Caching and Distributed Storage

Saeid Sahraei

A simple task of storing a database or transferring it to a different point via a communication channel turns far more complex as the size of the database grows large. Limited bandwidth available for transmission plays a central role in this predicament. I ...

EPFL, 2018

Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing

Anastasia Ailamaki, Manolis Karpathiotakis, Ioannis Alagiannis, Manoussos Gavriil Athanassoulis, Matthaios Alexandros Olma

The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. Hence, recent in-situ query proce ...

VLDB Endowment, 2017

No data left behind: real-time insights from a complex data ecosystem

Anastasia Ailamaki, Manolis Karpathiotakis

The typical enterprise data architecture consists of several actively updated data sources (e.g., NoSQL systems, data warehouses), and a central data lake such as HDFS, in which all the data is periodically loaded through ETL processes. To simplify query p ...

2017

Distributed Time Series Analytics

Tian Guo

In recent years, time series data has become ubiquitous thanks to affordable sensors and advances in embedded technology. Large amounts of time-series data are continuously produced in a wide spectrum of applications, such as sensor networks, medical monitor ...

EPFL, 2017

Just-in-time Analytics Over Heterogeneous Data and Hardware

Manolis Karpathiotakis

Industry and academia are continuously becoming more data-driven and data-intensive, relying on the analysis of a wide variety of datasets to gain insights. At the same time, data variety increases continuously across multiple axes. First, data comes in mu ...

EPFL, 2017

Toward timely, predictable and cost-effective data analytics

Renata Borovica-Gajic

Modern industrial, government, and academic organizations are collecting massive amounts of data at an unprecedented scale and pace. The ability to perform timely, predictable and cost-effective analytical processing of such large data sets in order to ext ...

EPFL, 2016

Runtime Prediction for Scale-Out Data Analytics

Adrian Daniel Popescu

Many analytics applications generate mixed workloads, i.e., workloads comprised of analytical tasks with different processing characteristics including data pre-processing, SQL, and iterative machine learning algorithms. Examples of such mixed workloads ca ...

EPFL, 2015