Skadi: Building a Distributed Runtime for Data Systems in Disaggregated Data Centers

Data-intensive systems are the backbone of today's computing and shape how data centers are built. Over the years, cloud providers have relied on three principles to keep data systems cost-effective: disaggregation to decouple scaling, domain-specific computing to counter waning scaling laws, and serverless computing to lower costs. Although each works well individually, they fail to work in harmony, an issue amplified by emerging data system workloads. In this paper, we envision a distributed runtime to mitigate these shortcomings. The runtime has a tiered access layer exposing declarative APIs, underpinned by a stateful serverless runtime with a distributed task execution model. It is intended to be the narrow waist between data systems and hardware. Users need not be aware of data location, concurrency, disaggregation style, or even the hardware that performs the computation. The underlying stateful serverless runtime transparently evolves with novel data-center architectures, such as disaggregation and tightly-coupled clusters. We prototype Skadi to showcase that the distributed runtime is practical.
