Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
In cloud storage systems, hot data is usually replicated over multiple disks/servers in order to accommodate simultaneous access by multiple users as well as increase the fault tolerance of the system. Recent cloud storage research has proposed using availability codes, which is a special class of erasure codes, as a more storage-efficient way to store hot data. These codes enable data recovery from multiple, small disjoint groups of servers. The number of the recovery groups is referred to as the availability and the size of each group as the locality of the code. Up till now, we have very limited knowledge on how code locality and availability affect data access time. Data download from these systems involves multiple fork-join queues operating in-parallel, making the analysis of access time a very challenging problem. In this paper, we present an analysis of average data access time in storage systems employing simplex codes, which are an important, in certain sense optimal, class of availability codes. We generalize the analysis for codes with locality 2 and any degree of availability. Specifically, using a queueing theoretic approach, we derive bounds and approximations on the average response time for two different Poisson request arrival models. We also compare two scheduling strategies for reduced access time and load balancing.2
David Atienza Alonso, Marina Zapater Sancho, Ali Pahlevan, Kosar Haghshenas
Michael Christoph Gastpar, Sung Hoon Lim, Jingge Zhu, Erixhen Sula, Adriano Pastore
Touradj Ebrahimi, Michela Testolina