Gossip, also known as epidemic dissemination, is an increasingly popular technique in distributed systems. Yet the robustness of such protocols has remained a partially open question. We consider a natural extension of the random phone-call model (introduced by Karp et al. [KarpFOCS-2000]) and analyze two notions of robustness: the ability to tolerate adaptive failures and the ability to tolerate oblivious failures. For adaptive failures, we present a new gossip protocol, TrickleGossip, which achieves near-optimal message complexity. To the best of our knowledge, this is the first epidemic-style protocol that can tolerate adaptive failures. We also show a direct relation between resilience and message complexity: gossip protocols that tolerate a large number of adaptive failures must, with high probability, use a super-linear number of messages. For oblivious failures, we present a new gossip protocol, CoordinatedGossip, which achieves optimal message complexity. This protocol makes novel use of the universe reduction technique to limit its message complexity.
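As a point of reference for the setting above, the following is a minimal sketch of plain push gossip in a random phone-call style model with oblivious crash failures. It is not TrickleGossip or CoordinatedGossip, whose details are not given here. The function name `push_gossip`, its parameters, and the choice to let a node call any node uniformly at random (including itself or a failed node) are assumptions made for illustration only.

```python
import random


def push_gossip(n, failed=frozenset(), seed=0, max_rounds=10_000):
    """Simulate baseline push gossip among n nodes (illustrative sketch).

    `failed` is fixed before the run and never depends on the random
    choices, which is what makes the failures oblivious. Returns
    (rounds, messages, informed), where `informed` is the set of
    non-failed nodes that hold the rumor at the end.
    """
    rng = random.Random(seed)
    alive = [v for v in range(n) if v not in failed]
    if not alive:
        return 0, 0, set()

    informed = {alive[0]}  # the first non-failed node starts with the rumor
    rounds = 0
    messages = 0

    while len(informed) < len(alive) and rounds < max_rounds:
        rounds += 1
        newly_informed = set()
        for _ in informed:
            # Each informed node calls one node chosen uniformly at random
            # (assumption: self-calls and calls to failed nodes are allowed).
            callee = rng.randrange(n)
            messages += 1
            if callee not in failed:
                newly_informed.add(callee)
        informed |= newly_informed

    return rounds, messages, informed
```

Counting `messages` alongside `rounds` makes it easy to see how message cost grows with `n` and with the size of `failed` in this baseline; an adaptive adversary, which picks its victims after observing the execution, would need a different simulation.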
Rachid Guerraoui, Alexandre David Olivier Maurer
Boi Faltings, Sujit Prakash Gujar, Dimitrios Chatzopoulos, Anurag Jain
Rachid Guerraoui, Jovan Komatovic, Dragos-Adrian Seredinschi, Andrei Tonkikh