Evaluating the performance of distributed agreement algorithms
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Since their invention more than half a century ago, computers have gone from being just an handful of expensive machines each filling an entire room, to being an integral part of almost every aspect of modern life. Nowadays computers are everywhere: in our ...
Replication has recently gained attention in the context of fault tolerance for large scale MPI HPC applications. Existing implementations try to cover all MPI codes and to be independent from the underlying library. In this paper, we evaluate the advantag ...
State-machine replication (SMR) is a software technique for tolerating failures and for providing high availability in large-scale systems, through the use of commodity hardware. A replicated state-machine comprises a number of replicas, each of which runs ...
Consensus is one of the key problems in fault-tolerant distributed computing. Although the solvability of consensus is now a well-understood problem, comparing different algorithms in terms of efficiency is still an open problem. In this paper, we address ...
Fault-tolerant computing is the art and science of building computer systems that continue to operate normally in the presence of faults. The fault tolerance field covers a wide spectrum of research area ranging from computer hardware to computer software. ...
Many atomic broadcast algorithms have been published in the last 20 years. Token-based algorithms represent a large class of these algorithms. Interestingly, all the token-based atomic broadcast algorithms rely on a group membership service and none of the ...
Consensus is at the heart of fault-tolerant distributed computing systems. Much research has been devoted to developing algorithms for this particular problem. This paper presents a semi-automatic verification approach for asynchronous consensus algorithms ...
In this paper we present a technique for automatically assessing the amount of damage a small number of participant nodes can inflict on the overall performance of a large distributed system. We propose a feedback-driven tool that synthesizes malicious node ...
It is notoriously difficult to develop reliable, high-performance distributed systems that run over asynchronous networks. Even if a distributed system is based on a well-understood distributed algorithm, its implementation can contain errors arising from ...
A failure detector is a fundamental abstraction in distributed computing. This paper surveys this abstraction through two dimensions. First we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. In particul ...