Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
The consensus problem is a fundamental paradigm for fault-tolerant distributed computing. It abstracts a family of problems known as agreement problems, e.g., atomic broadcast, atomic commitment, group membership, and leader election. Any solution to Consensus can serve as a basic building block for solving such problems. It is well known that Consensus cannot be solved deterministically in an asynchronous system that is subject to even a single process crash. This impossibility result stems from the lack of reliable failure detection in such a system. To circumvent such impossibility results, Chandra and Toueg introduced the concept of unreliable failure detectors, and show how to use them to solve Consensus. Instead of giving specific implementations, they characterized failure detectors in terms of abstract properties. This thesis focuses on the design of algorithms implementing failure detectors and its use to solve Consensus. We consider a partially synchronous system in which Consensus has been proven to be solvable. We first show that no one of the failure detector classes satisfying perpetual accuracy can be implemented in such a system. We then propose efficient algorithms implementing all the failure detector classes satisfying eventual accuracy. We also propose an optimal algorithm implementing diamond S, the weakest failure detector class for solving Consensus. We also propose a new class of failure detectors, the Eventually Consistent class, denoted diamond C. This class adds an eventual leader election capability to the failure detection capability of other classes. We show that diamond C is equivalent to diamond S. We also show that diamond C can be implemented as efficiently as diamond S. We propose a Consensus algorithm that successfully exploits the leader election capability of diamond C and performs better that all the previously proposed algorithms, which shows the potential of this new class of failure detectors.
Jiri Vanicek, Alan Scheidegger, Nikolay Golubev