Rock You like a Hurricane: Taming Skew in Large Scale Analytics
Publications associées (35)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
In this thesis, we explore the application of data mining and machine learning techniques to several practical problems. These problems have roots in various fields such as social science, economics, and political science. We show that computer science tec ...
Scalable join processing in a parallel shared-nothing environment requires a partitioning policy that evenly distributes the processing load while minimizing the size of state maintained and number of messages communicated. Previous research proposes stati ...
The constant increase in single core frequency reached a plateau during recent years since the produced heat inside the chip cannot be cooled down by existing technologies anymore. An alternative to harvest more computational power per die is to fabricate ...
Today, managing, storing and analyzing data continuously in order to gain additional insight is becoming commonplace. Data analytics engines have been traditionally optimized for read-only queries assuming that the main data reside on mechanical disks. The ...
In the last years the process of examining large amounts of different types of data, or Big-Data, in an effort to uncover hidden patterns or unknown correlations has become a major need in our society. In this context, stream mining applications are now wi ...
The ALICE experiment at CERN LHC is using a PROOF-enabled cluster for fast physics analysis, detector calibration and reconstruction of small data samples. The current system (CAF - CERN Analysis Facility) consists of some 120 CPU cores and about 45 TB of ...
In one example, a device for retrieving multimedia data, the device comprising one or more processors configured to retrieve data of a first segment of a representation of multimedia content in accordance with data of a copy of a manifest file stored by th ...
We consider MapReduce workloads that are produced by analytics applications. In contrast to ad hoc query workloads, analytics applications are comprised of fixed data flows that are run over newly arriving data sets or on different portions of an existing ...
The thesis is a contribution to extreme-value statistics, more precisely to the estimation of clustering characteristics of extreme values. One summary measure of the tendency to form groups is the inverse average cluster size. In extreme-value context, th ...
Increased market pressure, sharp competition and globalisation are some of the main challenges faced nowadays by companies, pushing them to continuously evaluate the suitability of their business model and to look for new opportunities. Electronic commerce ...