Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing
Related publications (34)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. To alleviate the loading cost, in ...
A simple task of storing a database or transferring it to a different point via a communication channel turns far more complex as the size of the database grows large. Limited bandwidth available for transmission plays a central role in this predicament. I ...
This paper introduces an approach to supporting high-dimensional data cubes at interactive query speeds and moderate storage cost. The approach is based on binary(-domain) data cubes that are judiciously partially materialized; the missing information can ...
Core to many scientific and analytics applications are spatial data capturing the position or shape of objects in space, and time series recording the values of a process over time. Effective analysis of such data requires a shift from confirmatory pipelin ...
Advances in data acquisition technologies and supercomputing for large-scale simulations have led to an exponential growth in the volume of spatial data. This growth is accompanied by an increase in data complexity, such as spatial density, but also by mor ...
Data cleaning has become an indispensable part of data analysis due to the increasing amount of dirty data. Data scientists spend most of their time preparing dirty data before it can be used for data analysis. Existing solutions that attempt to automate t ...
Amid a data revolution that is transforming industries around the globe, computing systems have undergone a paradigm shift where many applications are scaled out to run on multiple computers in a computing cluster. As the storage and processing capabilitie ...
EPFL2020
In the current era of big data, aggregation queries on high-dimensional datasets are frequently utilized to uncover hidden patterns, trends, and correlations critical for effective business decision-making. Data cubes facilitate such queries by employing p ...
EPFL2023
,
This paper introduces an approach to supporting high-dimensional data cubes at interactive query speeds and moderate storage cost. The approach is based on binary(-domain) data cubes that are judiciously partially materialized; the missing information can ...
ASSOC COMPUTING MACHINERY2022
,
We consider a data owner that outsources its dataset to an untrusted server. The owner wishes to enable the server to answer range queries on a single attribute, without compromising the privacy of the data and the queries. There are several schemes on “pr ...