Explanation methods highlight the importance of the input features for a model's predictive decision, and represent a way to increase the transparency and trustworthiness of machine learning models and deep neural networks (DNNs). However, explanation methods can be easily manipulated to generate misleading explanations, particularly under visually imperceptible adversarial perturbations. Recent work has identified the geometry of the decision surface of DNNs as the main cause of this phenomenon. To make explanation methods more robust against adversarially crafted perturbations, recent research has proposed several smoothing approaches, which smooth either the explanation map or the decision surface.

In this work, we initiate a thorough evaluation of the quality and robustness of the explanations produced by smoothing approaches, assessing several of their properties. We present settings in which the smoothed explanations are both better, and worse, than the explanations derived by the commonly used (non-smoothed) Gradient explanation method. By drawing a connection to the literature on adversarial attacks, we show that smoothed explanations are robust primarily against additive attacks. However, a combination of additive and non-additive attacks can still manipulate these explanations, revealing important shortcomings in their robustness properties.
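To make the contrast concrete, the following is a minimal sketch of the two objects the abstract compares: the plain Gradient explanation (feature-wise sensitivity of the model output) and a smoothed explanation map obtained by averaging gradients over noisy copies of the input, in the style of SmoothGrad. The toy `model` function and all parameter values are illustrative assumptions, not the models or settings used in the paper; gradients are estimated by central finite differences to keep the sketch dependency-free.

```python
import numpy as np

def model(x):
    # Toy differentiable "model": a scalar score over an input vector.
    return np.tanh(x).sum()

def gradient_explanation(f, x, eps=1e-5):
    # Plain Gradient explanation: sensitivity of f to each input feature,
    # estimated here with central finite differences.
    grad = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        grad[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return grad

def smoothed_explanation(f, x, sigma=0.1, n=50, seed=0):
    # Smooth the explanation map: average Gradient explanations over
    # n Gaussian-perturbed copies of the input.
    rng = np.random.default_rng(seed)
    maps = [gradient_explanation(f, x + rng.normal(0.0, sigma, x.shape))
            for _ in range(n)]
    return np.mean(maps, axis=0)

x = np.array([0.5, -1.0, 2.0])
g = gradient_explanation(model, x)       # plain Gradient map
sg = smoothed_explanation(model, x)      # smoothed map, same shape
```

An additive attack in this setting perturbs `x` by a small vector chosen to distort `g`; averaging over noise dampens such perturbations, which is the intuition behind the robustness claim the paper then probes with combined additive and non-additive attacks.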
Pascal Frossard, Seyed Mohsen Moosavi Dezfooli, Michail Vlachos, Ahmad Ajalloeian