Impact of Redundancy on Resilience in Distributed Optimization and Learning

Nirupam Gupta, Shuo Liu
2023
Conference paper

Abstract

This paper considers the problem of resilient distributed optimization and stochastic learning in a server-based architecture. The system comprises a server and multiple agents, where each agent has its own local cost function. The agents collaborate with the server to find a minimum of the aggregate of the local cost functions. In the context of stochastic learning, the local cost of an agent is the loss function computed over the data at that agent. In this paper, we consider this problem in a system wherein some of the agents may be Byzantine faulty and some of the agents may be slow (also called stragglers). In this setting, we investigate the conditions under which it is possible to obtain an "approximate" solution to the above problem. In particular, we introduce the notion of (f, r; epsilon)-resilience to characterize how well the true solution is approximated in the presence of up to f Byzantine faulty agents, and up to r slow agents (or stragglers) - smaller epsilon represents a better approximation. We also introduce a measure named (f, r; epsilon)-redundancy to characterize the redundancy in the cost functions of the agents. Greater redundancy allows for a better approximation when solving the problem of aggregate cost minimization.|In this paper, we constructively show (both theoretically and empirically) that (f, r; O(epsilon))-resilience can indeed be achieved in practice, given that the local cost functions are sufficiently redundant. Our empirical evaluation considers a distributed gradient descent (DGD)-based solution; for distributed learning in the presence of Byzantine and asynchronous agents, we also evaluate a distributed stochastic gradient descent (D-SGD)-based algorithm.

Official source

https://infoscience.epfl.ch/record/308228?ln=en

About this result

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Impact of Redundancy on Resilience in Distributed Optimization and Learning

Graph Chatbot

Chat with Graph Search

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Learning From Heterogeneous Data Based on Social Interactions Over Graphs

Multi-agent reinforcement learning with graph convolutional neural networks for optimal bidding strategies of generation units in electricity markets

Multi-agent reinforcement learning with graph convolutional neural networks for optimal bidding strategies of generation units in electricity markets

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Learning From Heterogeneous Data Based on Social Interactions Over Graphs