Publication

Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing

Related publications (34)

Interactive-time Exploration, Querying, and Analysis of Large High-dimensional Datasets

Sachin Basil John

In the current era of big data, aggregation queries on high-dimensional datasets are frequently utilized to uncover hidden patterns, trends, and correlations critical for effective business decision-making. Data cubes facilitate such queries by employing p ...

EPFL, 2023

High-dimensional Data Cubes

Christoph Koch, Sachin Basil John

This paper introduces an approach to supporting high-dimensional data cubes at interactive query speeds and moderate storage cost. The approach is based on binary(-domain) data cubes that are judiciously partially materialized; the missing information can ...

2022

High-dimensional Data Cubes

Christoph Koch, Sachin Basil John

ASSOC COMPUTING MACHINERY, 2022

Holistic, Efficient, and Real-time Cleaning of Heterogeneous Data

Styliani Asimina Giannakopoulou

Data cleaning has become an indispensable part of data analysis due to the increasing amount of dirty data. Data scientists spend most of their time preparing dirty data before it can be used for data analysis. Existing solutions that attempt to automate t ...

EPFL, 2021

Adaptive partitioning and indexing for in situ query processing

Anastasia Ailamaki, Manolis Karpathiotakis, Ioannis Alagiannis, Manoussos Gavriil Athanassoulis, Matthaios Alexandros Olma

The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. To alleviate the loading cost, in ...

SPRINGER, 2020

An Architecture for Load Balance in Computer Cluster Applications

Laurent Bindschaedler

Amid a data revolution that is transforming industries around the globe, computing systems have undergone a paradigm shift where many applications are scaled out to run on multiple computers in a computing cluster. As the storage and processing capabilitie ...

EPFL, 2020

Time- and Space-Efficient Spatial Data Analytics

Mirjana Pavlovic

Advances in data acquisition technologies and supercomputing for large-scale simulations have led to an exponential growth in the volume of spatial data. This growth is accompanied by an increase in data complexity, such as spatial density, but also by mor ...

EPFL, 2019

Efficient Query Processing for Spatial and Temporal Data Exploration

Eleni Tzirita Zacharatou

Core to many scientific and analytics applications are spatial data capturing the position or shape of objects in space, and time series recording the values of a process over time. Effective analysis of such data requires a shift from confirmatory pipelin ...

EPFL, 2019

Caching and Distributed Storage

Saeid Sahraei

A simple task of storing a database or transferring it to a different point via a communication channel turns far more complex as the size of the database grows large. Limited bandwidth available for transmission plays a central role in this predicament. I ...

EPFL, 2018

Practical Private Range Search in Depth

Odysseas Papapetrou, Ioannis Demertzis

We consider a data owner that outsources its dataset to an untrusted server. The owner wishes to enable the server to answer range queries on a single attribute, without compromising the privacy of the data and the queries. There are several schemes on “pr ...

2018