How to tell if artificial intelligence is working the way we want it to
WHAT IS IT?
Feature-attribution methods, a class of interpretability techniques, show which parts of the input an AI model pays the most attention to when it makes a prediction.
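The article does not prescribe a specific technique, but a small sketch can make the idea concrete. Below is a minimal, hypothetical example of one common feature-attribution approach, occlusion: each input feature is replaced with a baseline value in turn, and the resulting change in the model's prediction is taken as that feature's importance. The black_box_model, its weights, and the input values are illustrative placeholders, not any particular system's API.

```python
import numpy as np

def black_box_model(x):
    """Stand-in for an opaque model: a fixed scoring function.
    (Hypothetical; any model's predict function could be used instead.)"""
    weights = np.array([0.8, -0.3, 0.05, 0.6])
    return float(1 / (1 + np.exp(-weights @ x)))  # sigmoid score in [0, 1]

def occlusion_attribution(predict, x, baseline=0.0):
    """Score each feature by how much the prediction changes
    when that feature is replaced with a baseline value."""
    original = predict(x)
    scores = np.empty(len(x))
    for i in range(len(x)):
        perturbed = x.copy()
        perturbed[i] = baseline            # occlude one feature
        scores[i] = original - predict(perturbed)
    return scores

x = np.array([1.2, 0.7, 3.0, -0.5])
for feature, score in enumerate(occlusion_attribution(black_box_model, x)):
    print(f"feature {feature}: attribution {score:+.3f}")
```

Features whose removal changes the score the most receive the largest attributions; gradient-based methods such as saliency maps or integrated gradients follow the same idea but use derivatives instead of perturbations.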
WHY IS IT IMPORTANT?
Modern AI models are so complex that even the researchers who design them don’t fully understand how they work, which makes it hard to know whether they are behaving correctly. The uncertainty surrounding these so-called “black-box” models has given rise to a new and rapidly growing area of study in which researchers develop and test explanation methods (also called interpretability methods) that seek to shed light on how black-box machine-learning models make predictions.