Data storage is one of today’s fundamental services, with companies, universities and research centers needing to store large amounts of data every day. Cloud storage services are emerging as a strong alternative to local storage, allowing customers to save the costs of buying and maintaining expensive hardware. Several solutions are available on the market, the most famous being Amazon S3. However, it is rather difficult to access information about each service's architecture, performance, and pricing. To shed light on storage services from the customer perspective, we propose a benchmarking methodology, apply it to four popular offers (Amazon S3, Amazon Glacier, Windows Azure Blob and Rackspace Cloud Files), and compare their performance. Each service is analysed as a black box and benchmarked through crafted workloads. We take the perspective of a customer located in Europe, looking for possible service providers and the optimal data center in which to deploy their applications. Finally, we complement the analysis by comparing the actual and forecast costs faced when using each service. According to the collected results, every service shows weaknesses on some workloads, and there is no all-round winner: for example, some offers provide excellent or poor performance depending on whether large or small files are exchanged. For all services, it is of paramount importance to accurately select the data center in which to deploy the applications, since throughput varies by factors from 2x to 10x. The methodology presented here, and the tools implementing it, are instrumental for potential customers to identify the most suitable offer for their needs.
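As a rough illustration of what black-box benchmarking with crafted workloads can look like, the sketch below times an opaque upload callable for several payload sizes and reports throughput per size. It is not the tooling described in the abstract: `measure_throughput`, `fake_upload`, the payload sizes, and the choice of keeping the best of a few repeats are all assumptions made for this example, and a real measurement would replace `fake_upload` with a call to an actual storage client.

```python
import time
from typing import Callable, Dict, Iterable


def measure_throughput(
    upload: Callable[[bytes], None],
    sizes_bytes: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best observed throughput (bytes/second) per payload size.

    The service is treated as a black box: only the wall-clock time of
    each upload call is observed.
    """
    results: Dict[int, float] = {}
    for size in sizes_bytes:
        payload = bytes(size)  # zero-filled payload of the requested size
        best = 0.0
        for _ in range(repeats):
            start = time.perf_counter()
            upload(payload)
            elapsed = time.perf_counter() - start
            if elapsed > 0:
                best = max(best, size / elapsed)
        results[size] = best
    return results


def fake_upload(payload: bytes) -> None:
    """Hypothetical stand-in for a storage client (assumption, not a real API)."""
    time.sleep(0.001)  # simulate a fixed per-request latency
```

Running `measure_throughput(fake_upload, [1024, 1024 * 1024])` returns a dictionary keyed by payload size. With a fixed per-request latency, as in this stand-in, small payloads show much lower throughput than large ones, which is one reason to measure small and large files separately.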
Touradj Ebrahimi, Michela Testolina