Publication

The Overhead of Consensus Failure Recovery

Rachid Guerraoui, Partha Dutta
2004
Rapport ou document de travail

Résumé

Many reliable distributed systems are consensus-based and typically operate under two modes: a fast normal mode in failure-free periods, and a slower recovery mode following failures. A lot of work has been devoted to optimizing the normal mode, but little has been devoted to optimizing the recovery mode. This work seeks to understand whether the recovery mode is inherently slower than the normal mode. We focus on consensus algorithms that tolerate arbitrary periods of asynchrony as well as the crash of a minority of processes, and study the time-complexity of their recovery mode in periods of synchrony. We show that recovery induces an inherent overhead of one communication round. More precisely, we show that consensus algorithms that can tolerate arbitrary periods of asynchrony, require three rounds even in synchronous runs where all crashes are initial. This is in contrast to the well-known tight bound of two rounds on failure-free synchronous runs of such algorithms. We also give for the first time a consensus algorithm that is optimal in both the normal mode and the recovery mode of synchronous runs. Moreover, our algorithm recovers from arbitrary periods of asynchrony significantly faster than any other consensus algorithm we are familiar with. The algorithm uses the weakest failure detector for consensus, Omega.

Source officielle

https://infoscience.epfl.ch/record/52648?ln=fr

À propos de ce résultat

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés (25)

Publications associées (32)

MOOCs associés (23)

Graph Chatbot

Chattez avec Graph Search

Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.

AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.

Connectez-vous pour utiliser Chat avec Graph Search

The Overhead of Consensus Failure Recovery

Graph Chatbot

Chattez avec Graph Search

A ride time-oriented scheduling algorithm for dial-a-ride problems

Energy Management of Price-Maker Community Energy Storage by Stochastic Dynamic Programming

Planetary-Scale Byzantine Fault Tolerance

A ride time-oriented scheduling algorithm for dial-a-ride problems

Energy Management of Price-Maker Community Energy Storage by Stochastic Dynamic Programming

Planetary-Scale Byzantine Fault Tolerance