Recovery in Distributed Systems Using Optimistic Message Logging and Checkpointing
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Message-based communication offers potential benefits of providing stronger specification and cleaner separation between components. Compared with shared-memory interactions, message passing has the potential disadvantages of being more expensive (no direc ...
Since the introduction of the concept of failure detectors, several consensus and atomic broadcast algorithms based on these detectors have been published. The performance of these algorithms is often affected by a trade-off between the number of communica ...
Within only a couple of generations, the so-called digital revolution has taken the world by storm: today, almost all human beings interact, directly or indirectly, at some point in their life, with a computer system. Computers are present on our desks, co ...
Group communication is a programming abstraction that allows a distributed group of processes to provide a reliable service in spite of the possibility of failures within the group. The goal of the project was to improve the state of the art of group commu ...
The last three decades have seen computers invading our society: computers are now present at work to improve productivity and at home to enlarge the scope of our hobbies and to communicate. Furthermore, computers have been involved in many critical system ...
Improving the dependability of computer systems is a critical and essential task. In this context, the paper surveys techniques that allow to achieve fault tolerance in distributed systems by replication. The main replication techniques are first explained ...
Embedded sensors and actuators are revolutionizing the way we perceive and interact with the physical world. Current research on such Cyber-Physical Systems (CPSs) has focused mostly on distributed sensing and data gathering. The next step is to move from ...
It is notoriously hard to develop dependable distributed systems. This is partly due to the difficulties in foreseeing various corner cases and failure scenarios while implementing a system that will be deployed over an asynchronous network. In contrast, r ...
Problems in fault-tolerant distributed computing have been studied in a variety of models. These models are structured around two central ideas: Degree of synchrony and failure model are two independent parameters that determine a particular type of system ...
Nowadays, computers are the indispensable part of our life. They evolve rapidly and are more and more versatile. Computer networks made the remote corners of the world just a click away. But unavoidably, any software and hardware component is subject to fa ...