CleanM: An Optimizable Query Language for Unified Scale-Out Data Cleaning
Publications associées (41)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Data cleaning is a time-consuming process that depends on the data analysis that users perform. Existing solutions treat data cleaning as a separate offline process that takes place before analysis begins. Applying data cleaning before analysis assumes a p ...
In this paper we lay the groundwork for a robust cross-device comparison of data-driven disruption prediction algorithms on DIII-D and JET tokamaks. In order to consistently carry on a comparative analysis, we define physics-based indicators of disruption ...
This paper introduces an approach to supporting high-dimensional data cubes at interactive query speeds and moderate storage cost. The approach is based on binary(-domain) data cubes that are judiciously partially materialized; the missing information can ...
This dataset supports the publication 'Elastocapillary menisci mediate interaction of neighboring structures at the surface of a compliant solid' by Lebo Molefe and John M. Kolinski, Physical Review E, (2023). The data are surface profiles of textured surf ...
Data cleaning has become an indispensable part of data analysis due to the increasing amount of dirty data. Data scientists spend most of their time preparing dirty data before it can be used for data analysis. Existing solutions that attempt to automate t ...
This dataset contains corrected particle number concentration data measured during the year-long MOSAiC expedition from October 2019 to September 2020. Some periods of the measurements were affected by repeated step changes in the particle number concentra ...
Data cleaning is a time-consuming process that depends on the data analysis that users perform. Existing solutions treat data cleaning as a separate offline process that takes place before analysis begins. Applying data cleaning before analysis assumes a p ...
This dataset contains corrected particle number concentration data measured during the year long MOSAiC expedition from October 2019 to September 2020. The measurement was performed in the Swiss aerosol container on the D-deck of Research Vessel Polarstern ...
Advances in data acquisition technologies and supercomputing for large-scale simulations have led to an exponential growth in the volume of spatial data. This growth is accompanied by an increase in data complexity, such as spatial density, but also by mor ...
Core to many scientific and analytics applications are spatial data capturing the position or shape of objects in space, and time series recording the values of a process over time. Effective analysis of such data requires a shift from confirmatory pipelin ...