Publication

Hardware-Software Co-Design of an RPC Processor

Arash Pourhabibi Zarandi
2021
EPFL thesis
Abstract

The booming popularity of online services has led to a major evolution in the way these services are built and deployed. To cope with such online data-intensive services, service providers deploy several massive-scale datacenters, also referred to as warehouse-scale computers, each populated with up to hundreds of thousands of servers. The services also follow the paradigm of microservices, which decomposes online services into fine-grained software modules frequently communicating over the datacenter network using Remote Procedure Calls (RPCs). Microservices simplify and accelerate software development and allow independent development and performance debugging of each microservice using the most suitable programming language and tools. Furthermore, microservices simplify software deployment and enable scaling and updating individual microservices independently. However, because services are deployed in a distributed fashion, frequent communication is needed to complete a request, putting pressure on the networking infrastructure of the datacenter.

As a result, networking technology has been evolving rapidly both in software and hardware to address this extra communication overhead, also referred to as the "RPC tax" in datacenters. High-performance network fabrics and new network protocols have been developed to address the performance and scalability issues associated with the increasing volume of communication between software components. Although the tax on inter-microservice communication includes both the RPC layer and the underlying network stack, ongoing advancements have mainly targeted the network stack, leading to a drastic reduction of the networking latency and exposing the RPC layer itself as a bottleneck. While modern fabrics continue improving network bandwidth, silicon's efficiency and density scaling met an abrupt slowdown with the end of Dennard scaling and the slowdown of Moore's law, putting more pressure on the RPC layer running on the general-purpose CPUs. Overall, the RPC layer accounts for a significant fraction of both a single request's latency and the datacenter's total compute capacity; thus, optimizing the hardware-software stack for RPCs is of critical importance.

In this thesis, we break down the underlying modules that comprise production RPC layers and show that CPUs can only expect limited improvements for such tasks, mandating a shift to hardware to remove the RPC layer as a limiter of microservice performance. Motivated by the growing RPC tax in datacenters, we advocate for hardware-software co-design to evade the RPC tax. We present design principles guiding the architecture of an RPC processor and show that conclusively removing the RPC layer bottleneck requires all of the RPC layer's modules to be executed by a NIC-attached hardware accelerator. We propose a NIC-integrated RPC processor that runs production RPC layers and acts as an intermediary stage between the NIC and the microservice running on the CPU. Because such an RPC processor can peek into the request's data, it opens up further opportunities such as intelligent load balancing and request dispatch. We make the case that such an RPC processor is an ideal candidate for inclusion in future server chips to better support and run microservices as they decompose into even finer granularity.

Related concepts (46)
Microservices
In software engineering, a microservice architecture is a variant of the service-oriented architecture structural style. It is an architectural pattern that arranges an application as a collection of loosely coupled, fine-grained services, communicating through lightweight protocols. One of its goals is that teams can develop and deploy their services independently of others. This is achieved by the reduction of several dependencies in the code base, allowing developers to evolve their services with limited restrictions from users, and for additional complexity to be hidden from users.
Service-oriented architecture
In software engineering, service-oriented architecture (SOA) is an architectural style that focuses on discrete services instead of a monolithic design. By consequence, it is also applied in the field of software design where services are provided to the other components by application components, through a communication protocol over a network. A service is a discrete unit of functionality that can be accessed remotely and acted upon and updated independently, such as retrieving a credit card statement online.
Remote procedure call
In distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in a different address space (commonly on another computer on a shared network), which is written as if it were a normal (local) procedure call, without the programmer explicitly writing the details for the remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote.
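The idea that a remote call reads like a local one can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Calculator` service, the `Stub` class, the `handle_request` function, and the JSON wire format are hypothetical, and the "network" is just an in-process function call. It is not the RPC layer studied in the thesis, only a toy showing where marshalling and dispatch sit.

```python
import json


class Calculator:
    """Hypothetical service; in a real deployment it would run in another process."""

    def add(self, a, b):
        return a + b


def handle_request(service, payload: bytes) -> bytes:
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(payload.decode("utf-8"))
    method = getattr(service, request["method"])
    result = method(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class Stub:
    """Client side: turns ordinary-looking method calls into marshalled requests."""

    def __init__(self, transport):
        # transport is any callable mapping request bytes to reply bytes
        # (assumed here; a real stub would send the bytes over the network).
        self._transport = transport

    def __getattr__(self, name):
        def call(*args):
            payload = json.dumps({"method": name, "args": list(args)}).encode("utf-8")
            reply = self._transport(payload)
            return json.loads(reply.decode("utf-8"))["result"]

        return call


local = Calculator()
# The lambda stands in for the network: bytes go out, bytes come back.
remote = Stub(lambda payload: handle_request(local, payload))

local_result = local.add(2, 3)    # plain local call
remote_result = remote.add(2, 3)  # same call syntax, routed through the stub
```

The caller writes `remote.add(2, 3)` exactly as it would write `local.add(2, 3)`; the serialization and dispatch work hidden inside `Stub` and `handle_request` is the kind of per-call overhead the abstract refers to as the RPC layer.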
Related publications (36)

Building Chips Faster: Hardware-Compiler Co-Design for Accelerated RTL Simulation

Sahand Kashani

The demise of Moore's Law and Dennard scaling has resulted in diminishing performance gains for general-purpose processors, and so has prompted a surge in academic and commercial interest for hardware accelerators. Specialized hardware has already redefined ...
EPFL, 2023

Hardware and Software Support for RPC-Centric Server Architecture

Mark Johnathon Sutherland

Online services have become ubiquitous in technological society, the global demand for which has driven enterprises to construct gigantic datacenters that run their software. Such facilities have also recently become a substrate for third-party organizatio ...
EPFL, 2022

Optimus Prime: Accelerating Data Transformation in Servers

Babak Falsafi, Christoph Koch, Siddharth Gupta, Mario Paulo Drumond Lages De Oliveira, Mark Johnathon Sutherland, Arash Pourhabibi Zarandi, Zilu Tian, Hussein Kassir

Modern online services are shifting away from monolithic applications to loosely-coupled microservices because of their improved scalability, reliability, programmability and development velocity. Microservices communicating over the datacenter network req ...
ACM, 2020
Related MOOCs (6)
IoT Systems and Industrial Applications with Design Thinking
The first MOOC to provide a comprehensive introduction to Internet of Things (IoT) including the fundamental business aspects needed to define IoT related products.