The Hidden Vulnerability of Distributed Learning in Byzantium

Rachid Guerraoui, El Mahdi El Mhamdi, Sébastien Louis Alexandre Rouault
2018
Conference paper

Abstract

While machine learning is going through an era of celebrated success, concerns have been raised about the vulnerability of its backbone: stochastic gradient descent (SGD). Recent approaches have been proposed to ensure the robustness of distributed SGD against adversarial (Byzantine) workers sending \emph{poisoned} gradients during the training phase. Some of these approaches have been proven \emph{Byzantine--resilient}: they ensure the \emph{convergence} of SGD despite the presence of a minority of adversarial workers. We show in this paper that \emph{convergence is not enough}. In high dimension $d \gg 1$ , an adver-sary can build on the loss function's non--convexity to make SGD converge to \emph{ineffective} models. More precisely, we bring to light that existing Byzantine--resilient schemes leave a \emph{margin of poisoning} of $\bigOmega\left(f(d)\right)$ , where $f(d)$ increases at least like $\sqrt{d}$ . Based on this \emph{leeway}, we build a simple attack, and experimentally show its strong to utmost effectivity on CIFAR--10 and MNIST. We introduce \emph{Bulyan}, and prove it significantly reduces the attacker's leeway to a narrow $\bigO\,( \sfrac{1}{\sqrt{d~}})$ bound. We empirically show that Bulyan does not suffer the fragility of existing aggregation rules and, at a reasonable cost in terms of required batch size, achieves convergence \emph{as if} only non--Byzantine gradients had been used to update the model.

Official source

https://infoscience.epfl.ch/record/256124?ln=en

About this result

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

The Hidden Vulnerability of Distributed Learning in Byzantium

Graph Chatbot

Chat with Graph Search

Random matrix methods for high-dimensional machine learning models

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Understanding generalization and robustness in modern deep learning

Random matrix methods for high-dimensional machine learning models

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Understanding generalization and robustness in modern deep learning