Pixels: An Efficient Column Store for Cloud Data Lakes
Related publications (34)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Performance and reliability are important yet conflicting properties of systems software. Software today often crashes, has security vulnerabilities and data loss, while many techniques that could address such issues remain unused due to performance concer ...
EPFL2017
, ,
As data continues to be generated at exponentially growing rates in heterogeneous formats, fast analytics to extract meaningful information is becoming increasingly important. Systems widely use in-memory caching as one of their primary techniques to speed ...
2017
,
The typical enterprise data architecture consists of several actively updated data sources (e.g., NoSQL systems, data warehouses), and a central data lake such as HDFS, in which all the data is periodically loaded through ETL processes. To simplify query p ...
2017
, ,
Enterprise databases use storage tiering to lower capital and operational expenses. In such a setting, data waterfalls from an SSD-based high-performance tier when it is "hot" (frequently accessed) to a disk-based capacity tier and finally to a tape-based ...
Endogeneity is an important issue that often arises in discrete choice models leading to biased estimates of the parameters. We propose the extended multiple indicator solution (EMIS) methodology to correct for it and exemplify it with a case study using r ...
Nowadays, business and scientific applications accumulate data at an increasing pace. This growth of information has already started to outgrow the capabilities of database management systems (DBMS). In a typical DBMS usage scenario, the user should define ...
Many analytics applications generate mixed workloads, i.e., workloads comprised of analytical tasks with different processing characteristics including data pre-processing, SQL, and iterative machine learning algorithms. Examples of such mixed workloads ca ...
This paper is a discussion of a recent publication by Sekkal and coworkers, which investigated the elasticity, cohesion, and stability of the C-S-H (001) surface as a function of porosity using molecular dynamics simulations. In our discussion, we highligh ...
Modern industrial, government, and academic organizations are collecting massive amounts of data at an unprecedented scale and pace. The ability to perform timely, predictable and cost-effective analytical processing of such large data sets in order to ext ...
While moving from diary survey to location-aware technologies, recent data collection techniques provide new insights about location choices. Only few dynamic models of location choice exist in the literature, and none of them to our knowledge correct for ...