Benchmarking Perturbation-Based Saliency Maps for Explaining Atari Agents

Tobias Huber et al. Front Artif Intell. 2022 Jul 13;5:903875. doi: 10.3389/frai.2022.903875. eCollection 2022.

Abstract

One of the most prominent methods for explaining the behavior of Deep Reinforcement Learning (DRL) agents is the generation of saliency maps that show how much each pixel contributed to the agent's decision. However, no prior work computationally evaluates and compares the fidelity of different perturbation-based saliency map approaches specifically for DRL agents. Computationally evaluating saliency maps for DRL agents is particularly challenging since their decisions are part of an overarching policy that includes long-term decision making. For instance, the output neurons of value-based DRL algorithms encode both the value of the current state and the expected future reward after taking each action in this state. This ambiguity should be considered when evaluating saliency maps for such agents. In this paper, we compare five popular perturbation-based approaches to create saliency maps for DRL agents trained on four different Atari 2600 games. The approaches are compared using two computational metrics: dependence on the learned parameters of the agents' underlying deep Q-network (sanity checks) and fidelity to the agents' reasoning (input degradation). During the sanity checks, we found that a popular noise-based saliency map approach for DRL agents shows little dependence on the parameters of the output layer. We demonstrate that this can be fixed by tweaking the algorithm so that it focuses on specific actions instead of the general entropy within the output values. For fidelity, we identify two main factors that influence which saliency map approach should be chosen in which situation. Particular to value-based DRL agents, we show that analyzing the agents' choice of action requires different saliency map approaches than analyzing the agents' state value estimation.
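As an illustration of the targeted perturbation saliency discussed above, the following minimal Python sketch scores each image patch by how much occluding it changes the Q-value of the chosen action. The function name, the q_network(state) callable returning a vector of Q-values, the patch size, and the zero-valued occlusion are illustrative assumptions, not the exact formulation of any of the five benchmarked approaches.

    import numpy as np

    def occlusion_saliency(q_network, state, action, patch=4):
        """Targeted perturbation saliency: each patch's relevance is the drop
        in the chosen action's Q-value when that patch is occluded."""
        base_q = q_network(state)[action]          # Q-value for the unperturbed frame
        height, width = state.shape[:2]
        saliency = np.zeros((height, width))
        for y in range(0, height, patch):
            for x in range(0, width, patch):
                perturbed = state.copy()
                perturbed[y:y + patch, x:x + patch] = 0.0   # occlude with a constant patch
                saliency[y:y + patch, x:x + patch] = base_q - q_network(perturbed)[action]
        return saliency

Replacing the targeted difference with the change in entropy over all output values would give the untargeted variant that, as noted above, shows little dependence on the parameters of the output layer.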

Keywords: deep reinforcement learning; explainable artificial intelligence (XAI); explainable reinforcement learning; feature attribution; interpretable machine learning; saliency maps.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
An example of the different types of perturbation used by the saliency map approaches in our work. The parameters are chosen so that the idea of each perturbation is easy to identify. For Occlusion and Noise, the disturbed area is marked with a red circle.
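As a minimal sketch of such perturbation types on a single grayscale frame, the following functions apply a uniform occlusion patch, additive Gaussian noise, and local blurring to a rectangular region; the region coordinates, sigma values, and grayscale assumption are ours for illustration, not the paper's exact parameters.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def occlude(state, y, x, size):
        out = state.copy()
        out[y:y + size, x:x + size] = state.mean()   # replace the region with a uniform patch
        return out

    def add_noise(state, y, x, size, sigma=0.3, seed=0):
        out = state.astype(float)                    # astype returns a fresh copy
        region = out[y:y + size, x:x + size]         # a view into the copy
        region += np.random.default_rng(seed).normal(0.0, sigma, region.shape)
        return out

    def blur(state, y, x, size, sigma=3.0):
        out = state.astype(float)
        blurred = gaussian_filter(out, sigma)        # blurred copy of the whole frame
        out[y:y + size, x:x + size] = blurred[y:y + size, x:x + size]
        return out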
Figure 2
A schematic representation of the insertion metric curve.
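A minimal sketch of the insertion metric that produces this curve: starting from a fully perturbed baseline frame (e.g., a blurred copy of the state), pixels are revealed in order of decreasing saliency, the model's confidence is recorded after each step, and the area under the resulting curve is the insertion score. The model_conf callable (e.g., the softmax probability of the chosen action) and the number of steps are illustrative assumptions.

    import numpy as np

    def insertion_score(model_conf, state, saliency, baseline, steps=50):
        order = np.argsort(saliency.ravel())[::-1]   # pixel indices, most salient first
        current = baseline.copy()
        confidences = [model_conf(current)]
        chunk = max(1, order.size // steps)
        for i in range(0, order.size, chunk):
            ys, xs = np.unravel_index(order[i:i + chunk], saliency.shape)
            current[ys, xs] = state[ys, xs]          # reveal the next batch of pixels
            confidences.append(model_conf(current))
        # Normalized area under the confidence curve; higher indicates better fidelity
        return np.trapz(confidences, dx=1.0 / (len(confidences) - 1))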
Figure 3
Example saliency maps for the games we tested. From top to bottom: Pac-Man, Breakout, Space Invaders, and Frostbite. For better visibility, the saliency maps are displayed in green over a simplified version of the states. The higher the intensity of the green, the higher the relevance of the corresponding pixel for the agent's decision.
Figure 4
Example saliency maps for the parameter randomization sanity check. From top to bottom, each row after the first is generated for agents with cascadingly randomized layers, starting with the output layer.
Figure 5
Results of the sanity checks for the different saliency map approaches (NS = Noise Sensitivity), measured for 1,000 states in each of the four tested games. Starting from the left, each mark represents an additional randomized layer, beginning with the output layer. The y-axis shows the average similarity values (Spearman rank correlation, SSIM, Pearson correlation of the HOGs). High values indicate low parameter dependence. The translucent error bands show the 99% CI but are barely visible due to the low variance in the results.
Figure 6
Results of the sanity checks for each individual game for the different saliency map approaches (NS = Noise Sensitivity), measured for 1,000 states in each of the four tested games. Starting from the left, each mark represents an additional randomized layer, beginning with the output layer. The y-axis shows the average similarity values (Spearman rank correlation, SSIM, Pearson correlation of the HOGs). High values indicate low parameter dependence. The translucent error bands show the 99% CI.
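The cascading parameter randomization behind Figures 4-6 can be sketched as follows: layers are re-initialized one at a time, starting from the output layer, and after each step the new saliency map is compared with the map of the unmodified agent. The reinitialize hook is hypothetical and stands in for whatever weight re-sampling the underlying framework provides; only the Spearman rank correlation is shown here, while the figures additionally report SSIM and the Pearson correlation of HOG features.

    from scipy.stats import spearmanr

    def cascading_randomization(compute_saliency, model, layers_out_to_in, state):
        """Returns one similarity value per additionally randomized layer.
        High similarity to the original map indicates low parameter dependence."""
        reference = compute_saliency(model, state).ravel()
        similarities = []
        for layer in layers_out_to_in:
            layer.reinitialize()                     # hypothetical: re-sample this layer's weights
            randomized = compute_saliency(model, state).ravel()
            rho, _ = spearmanr(reference, randomized)
            similarities.append(rho)
        return similarities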
