Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
The scale and pervasiveness of the Internet make it a pillar of planetary communication, industry and economy, as well as a fundamental medium for public discourse and democratic engagement. In stark contrast with the Internet's decentralized infrastructure, however, mostly all hyper-scale online services adopt a centralized architecture, wherein millions or billions of clients entrust the same service provider with sensitive data to store, process and transmit, resulting in poor security and a dangerous asymmetry of power between users and system maintainer. Byzantine distributed computing has long held the promise to advance the security and transparency of our information infrastructure, replicating services across several, reciprocally mistrusting processes so as to uphold security and transparency even if a fraction of the processes deviate arbitrarily from the algorithm they are assigned. Despite extensive research, however, real-world replicated systems still fall short of global adoption. This is at least in part due to the limited scalability of Byzantine algorithms: the cost of replication is still too high, preventing the arisal of Byzantine hyper-scalers.This thesis attempts to overcome this limitation, contributing planetary-scale implementations for two classes of distributed abstractions: Reliable Broadcast, which has correct processes agree on messages issued by individual sources, and Consensus, which provides agreement on values contributed by multiple independent sources. The two classes of abstraction differ in power and requirements: Consensus is more powerful, enabling universal distributed computation, but Reliable Broadcast can be implemented without any assumption on network timeliness, a better fit for the public Internet's large latency spikes.First, we scale the number of servers that can take part in Reliable Broadcast. We present Contagion, the first probabilistic Reliable Broadcast protocol to achieve logarithmic per-process computation and communication complexity, enabling scalability to a practically unlimited number of servers. At the core of Contagion are samples, a probabilistic alternative to Byzantine quorums foregoing intersection guarantees for statistical representativeness. Contagion's security evaluation is among the main technical contributions of this thesis, providing the first formal analysis of a probabilistic protocol in the Byzantine setting.Second, we scale the number of clients that can concurrently submit messages to a Reliable Broadcast system. We present Draft, the first implementation of Reliable Broadcast to asymptotically amortize, in the good case, all signature and communication overhead resulting from Byzantine fault tolerance, matching the complexity of a trusted, centralized solution. Draft's efficiency is enabled by brokers, a novel layer of untrusted processes we introduce between clients and servers to enhance system performance.Finally, we scale the number of clients a Consensus-class system can handle. We do so with Chop Chop, the first system to introduce brokers to the Atomic Broadcast abstraction. When geo-deployed on 64 medium-sized servers, Chop Chop processes 43,600,000 messages per second, two orders of magnitude more than state-of-the-art alternatives. Even at maximum load, Chop Chop's performance is close to line-rate, putting 92% of server bandwidth towards transmitting application messages, negating nearly all cost resulting from Byzantine resilience.
Rachid Guerraoui, Jovan Komatovic, Pierre Philippe Civit, Manuel José Ribeiro Vidigueira, Vincent Gramoli, Seth Gilbert
Rachid Guerraoui, Gauthier Jérôme Timothée Voron, Vincent Gramoli, Mihail Igor Zablotchi, Karolos Antoniadis, Antoine Philippe Matthieu Desjardins