Rock You like a Hurricane: Taming Skew in Large Scale Analytics
In this thesis, we explore the application of data mining and machine learning techniques to several practical problems. These problems have roots in various fields such as social science, economics, and political science. We show that computer science tec ...
Increased market pressure, sharp competition and globalisation are some of the main challenges faced nowadays by companies, pushing them to continuously evaluate the suitability of their business model and to look for new opportunities. Electronic commerce ...
In recent years, examining large amounts of different types of data, or Big Data, in an effort to uncover hidden patterns or unknown correlations has become a major need in our society. In this context, stream mining applications are now wi ...
The ALICE experiment at CERN LHC is using a PROOF-enabled cluster for fast physics analysis, detector calibration and reconstruction of small data samples. The current system (CAF - CERN Analysis Facility) consists of some 120 CPU cores and about 45 TB of ...
Scalable join processing in a parallel shared-nothing environment requires a partitioning policy that evenly distributes the processing load while minimizing the size of state maintained and number of messages communicated. Previous research proposes stati ...
The steady increase in single-core frequency has plateaued in recent years because the heat produced inside the chip can no longer be dissipated by existing cooling technologies. An alternative to harvest more computational power per die is to fabricate ...
The thesis is a contribution to extreme-value statistics, more precisely to the estimation of clustering characteristics of extreme values. One summary measure of the tendency to form groups is the inverse average cluster size. In extreme-value context, th ...
Today, managing, storing and analyzing data continuously in order to gain additional insight is becoming commonplace. Data analytics engines have been traditionally optimized for read-only queries assuming that the main data reside on mechanical disks. The ...
In one example, a device for retrieving multimedia data, the device comprising one or more processors configured to retrieve data of a first segment of a representation of multimedia content in accordance with data of a copy of a manifest file stored by th ...
We consider MapReduce workloads that are produced by analytics applications. In contrast to ad hoc query workloads, analytics applications are comprised of fixed data flows that are run over newly arriving data sets or on different portions of an existing ...