Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Recent work has shown that state-of-the-art classifiers are quite brittle, in the sense that a small adversarial change of an originally with high confidence correctly classified input leads to a wrong classification again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their usage in safety-critical systems. We show in this paper for the first time formal guarantees on the robustness of a classifier by giving instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision. Based on this analysis we propose the Cross-Lipschitz regularization functional. We show that using this form of regularization in kernel methods resp. neural networks improves the robustness of the classifier with no or small loss in prediction performance.

Chattez avec Graph Search

Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.

AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.

Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Graph Chatbot

Chattez avec Graph Search

InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts

Estimating and Improving the Robustness of Attributions in Text

Privacy-preserving federated neural network training and inference

InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts

Estimating and Improving the Robustness of Attributions in Text

Privacy-preserving federated neural network training and inference