Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?
Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur Graph Search.
As data analytics is used by an increasing number of applications, data analytics engines are required to execute workloads with increased concurrency, i.e., an increasing number of clients submitting queries. Data management systems designed for data analytics - a market dominated by column-stores - however, were initially optimized for single query execution, minimizing its response time. Hence, they do not treat concurrency as a first class citizen. In this paper, we experiment with one open-source and two commercial column-stores using the TPC-H and SSB benchmarks in a setup with an increasing number of concurrent clients submitting queries, focusing on whether the tested systems can scale up in a single node instance. The tested systems for in-memory workloads scale up, to some degree; however, when the server is saturated they fail to fully exploit the available parallelism. Further, we highlight the unpredictable response times for high concurrency.
Anastasia Ailamaki, Panagiotis Sioulas, Eleni Zapridou
David Atienza Alonso, Miguel Peon Quiros, Simone Machetti, Pasquale Davide Schiavone
Anastasia Ailamaki, Periklis Chrysogelos, Angelos Christos Anadiotis, Syed Mohammad Aunn Raza