Together with my team at the Understandable Machine Intelligence Lab, we therefore started to brainstorm ideas that could enhance existing explanation methods. To this end, a simple method called SmoothGrad has established itself among both researchers and practitioners. In a nutshell, it works by creating multiple noisy versions of an input (typically with Gaussian noise) and then averaging the explanations of those inputs to produce a single explanation. Although it is very simple, it has been reported to make gradient-based attribution methods less visually diffuse and more robust against adversarial attacks. Given this, we came to ask:
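To make the idea concrete, here is a minimal sketch of SmoothGrad. The toy model, its gradient function and all parameter values below are illustrative assumptions, not taken from the SmoothGrad paper:

```python
import numpy as np

def smoothgrad(model_grad, x, sigma=0.1, n=50, seed=0):
    """SmoothGrad: average the gradient explanation over n copies of x
    perturbed with additive Gaussian noise of standard deviation sigma."""
    rng = np.random.default_rng(seed)
    grads = [model_grad(x + rng.normal(0.0, sigma, size=x.shape))
             for _ in range(n)]
    return np.mean(grads, axis=0)

# Hypothetical toy model f(x) = sum(x**2), whose gradient is 2x.
grad_f = lambda x: 2.0 * x
x = np.array([1.0, -2.0, 0.5])
print(smoothgrad(grad_f, x, sigma=0.1, n=500))  # ≈ [2., -4., 1.]
```

For this toy gradient the noise averages out, so the smoothed explanation converges to the plain gradient; on a real network the averaging instead suppresses the high-frequency fluctuations that make single-gradient saliency maps look noisy.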
In the same fashion that SmoothGrad enhances explanations by exploring the neighbourhood of a given input, can we further improve our explanations by exploring the neighbourhood of a given model?
Similar to how ensemble learning improves the generalisation performance of machine learning models (by averaging predictions from multiple classifiers), we wanted to create a similar “wisdom of the crowd” scenario where our explanations are based on the collective opinion of several models (many experts), rather than relying on a single model (one expert opinion). This further motivated us to explore another way of using stochasticity when forming an explanation: instead of adding noise to the input data, we add noise to the weights of the neural network.
Methods. In light of this, we came up with two methods: NoiseGrad and NoiseGrad++. These are stochastic, method-agnostic explanation-enhancing methods that add multiplicative Gaussian noise to the model weights instead of (only) to the input data.
Formally, NoiseGrad++ computes the enhanced explanation as

E_NG++(x) = (1/(N·M)) Σ_{i=1..N} Σ_{j=1..M} E(f(·, W_i), x + ξ_j),

where E is an explanation function that attributes relevances to the features of the input x with respect to the neural network f(·, W_i), ξ_j represents noise added to the input x, and N and M are the numbers of noisy models and noisy input samples, respectively. Here, W_i represents one specific network sample drawn from the Bayesian posterior.
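A minimal sketch of this double average, using a toy linear model and a plain-gradient explanation function. All names, noise levels and sample counts below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def noisegrad_pp(explain, weights, x, sigma_w=0.2, sigma_x=0.1,
                 n_models=20, m_inputs=20, seed=0):
    """NoiseGrad++: average the explanation over N weight samples
    (multiplicative Gaussian noise on the weights, W_i = W * (1 + eps))
    and M noisy inputs (additive Gaussian noise, x + xi_j)."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x)
    for _ in range(n_models):
        w_i = weights * (1.0 + rng.normal(0.0, sigma_w, size=weights.shape))
        for _ in range(m_inputs):
            xi_j = rng.normal(0.0, sigma_x, size=x.shape)
            total += explain(w_i, x + xi_j)
    return total / (n_models * m_inputs)

# Toy linear model f(x; w) = w . x, whose gradient explanation is just w,
# so the noisy double average should recover w as N and M grow.
explain = lambda w, x: w
w = np.array([0.5, -1.0, 2.0])
x = np.ones(3)
print(noisegrad_pp(explain, w, x, n_models=200, m_inputs=5))
```

Setting m_inputs=1 and sigma_x=0 recovers plain NoiseGrad (weight noise only), while n_models=1 and sigma_w=0 reduces the scheme back to SmoothGrad.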
Since approximating the posterior distribution of a neural network is computationally expensive (most methods require fully retraining the network), we mainly focus, in the spirit of MC dropout, on approximating the posterior with multiplicative Gaussian noise. Theoretically, NoiseGrad can be seen as performing a rather crude Laplace approximation. Yet, despite the simplicity of adding the same amount of multiplicative noise to all weights in the network, we observed in our experiments that this still yields a sufficiently accurate approximation to capture the model's uncertainty and thereby enhance explanations.
Results. Early on, we could observe that adding noise to a neural network’s parameters had a positive effect on explanations.
Although evaluation of XAI is still a largely unsolved problem, in our experiments we observed that explanations become more faithful, robust and localised to the object of interest when our methods are used.
Hyperparameter tuning. While applying our proposed explanation-enhancing methods, a question may pop up: exactly how much noise should be added to the model weights? Do we need to adjust the noise level depending on the given model architecture or the dataset?
In our paper, to help the user set an advantageous level of noise (more noise is not always better!), we put forward a simple hypothesis: since we need signals from models whose decision boundaries lie close to the test samples, we can choose the noise level σ such that we observe a certain drop in accuracy. From this, we developed a simple heuristic for tuning the noise level of NoiseGrad and NoiseGrad++ that works for various DNN architectures: add noise to the model weights until you observe an accuracy drop of about 5% compared to the original test accuracy.
From our experimental results, we found that setting the relative accuracy drop AD(σ) = 1 − (ACC(σ) − ACC(∞))/(ACC(0) − ACC(∞)) to around 5% works across the tested DNN architectures, including LeNet, VGG and ResNet variants. Here, ACC(σ) denotes the classification accuracy at noise level σ; ACC(0) and ACC(∞) correspond to the original accuracy and the chance level, respectively.
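The heuristic amounts to a small grid search over σ. In the sketch below, the exponentially decaying accuracy curve is a made-up stand-in for a real model's measured accuracies, used only to make the snippet self-contained:

```python
import numpy as np

def relative_accuracy_drop(acc_sigma, acc_0, acc_inf):
    """AD(sigma) = 1 - (ACC(sigma) - ACC(inf)) / (ACC(0) - ACC(inf))."""
    return 1.0 - (acc_sigma - acc_inf) / (acc_0 - acc_inf)

def tune_sigma(acc_fn, acc_0, acc_inf, sigmas, target=0.05):
    """Pick the sigma whose relative accuracy drop is closest to the target
    (the ~5% heuristic from the text)."""
    drops = np.array([relative_accuracy_drop(acc_fn(s), acc_0, acc_inf)
                      for s in sigmas])
    return sigmas[int(np.argmin(np.abs(drops - target)))]

# Hypothetical accuracy curve: decays from 0.95 toward chance level 0.10
# as the weight-noise level sigma increases.
acc_fn = lambda s: 0.10 + (0.95 - 0.10) * np.exp(-s)
sigmas = np.linspace(0.0, 1.0, 101)
print(tune_sigma(acc_fn, acc_0=0.95, acc_inf=0.10, sigmas=sigmas))
```

In practice, acc_fn would evaluate the noisy model on a held-out set at each candidate σ, so a coarse grid keeps the tuning cost to a handful of evaluation passes.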
Final thoughts. It is important to remember that any explanation will only be as good (or robust) as the model it is trying to explain; needless to say, an XAI method can never be disentangled from its model (nor from the model's deficiencies). Before you decide to put trust in an XAI method for decision-making, proper model testing is crucial.
That being said, it is fascinating how such simple techniques as NoiseGrad and NoiseGrad++ can bring about more faithful, robust and localised explanations. Of course, it remains to further investigate the performance of our proposed methods on tasks other than image classification, such as time-series prediction or NLP. In the future, we are also interested in quantitatively validating to what extent localisation, as an attribution quality criterion, is useful on natural datasets (we used semi-natural datasets in our paper to control the placement of attributional evidence for a “ground truth”). With these final words, one last illustration:
Code and examples
Read our arXiv pre-print