Adaptive partitioning and indexing for in situ query processing
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Advances in data acquisition technologies and supercomputing for large-scale simulations have led to an exponential growth in the volume of spatial data. This growth is accompanied by an increase in data complexity, such as spatial density, but also by mor ...
Core to many scientific and analytics applications are spatial data capturing the position or shape of objects in space, and time series recording the values of a process over time. Effective analysis of such data requires a shift from confirmatory pipelin ...
A simple task of storing a database or transferring it to a different point via a communication channel turns far more complex as the size of the database grows large. Limited bandwidth available for transmission plays a central role in this predicament. I ...
The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. Hence, recent in-situ query proce ...
The typical enterprise data architecture consists of several actively updated data sources (e.g., NoSQL systems, data warehouses), and a central data lake such as HDFS, in which all the data is periodically loaded through ETL processes. To simplify query p ...
Industry and academia are continuously becoming more data-driven and data-intensive, relying on the analysis of a wide variety of datasets to gain insights. At the same time, data variety increases continuously across multiple axes. First, data comes in mu ...
Modern industrial, government, and academic organizations are collecting massive amounts of data at an unprecedented scale and pace. The ability to perform timely, predictable and cost-effective analytical processing of such large data sets in order to ext ...
Many analytics applications generate mixed workloads, i.e., workloads comprised of analytical tasks with different processing characteristics including data pre-processing, SQL, and iterative machine learning algorithms. Examples of such mixed workloads ca ...
In recent years time series data has become ubiquitous thanks to affordable sensors and advances in embedded technology. Large amount of time-series data are continuously produced in a wide spectrum of applications, such as sensor networks, medical monitor ...
We propose fingerprinting, a new technique that consists in constructing compact, fast-to-compute and privacy-preserving binary representations of datasets. We illustrate the effectiveness of our approach on the emblematic big data problem of K-Nearest-Nei ...