Setting the record straight on explainable AI: (1st out of N) Which explanation?

Reza Khorshidi
4 min read · May 22, 2020

Recent years have seen a surge of work at the interface of AI, or more specifically machine learning (ML), and society/humanities. This is a great development, and one that has rightly been welcomed by the ML community, its application domains, and the general public. Such work has the potential to produce a new generation of multilingual ML scientists and engineers who can speak ML, the humanities (e.g., philosophy and ethics), and other disciplines, and who can pave the way for the broader adoption, success, and appropriate use of this technology. Terms such as fair, transparent, interpretable, explainable, and accountable AI, to name just a few, are keywords from these lines of research.

In this post, I want to talk specifically about explainable AI (XAI), and from a very specific angle: which explanation is XAI concerned with (and which should it be)? As the question implies, multiple explanations come to mind when one considers an AI model and the system it models; some assume XAI is equally relevant to all of them. Of course, there are different definitions of XAI, which I hope to cover in upcoming posts; the discussion here stays relevant across a range of common XAI definitions. Given the importance of preventing XAI from becoming a catch-all phrase, I hope, at the very least, to help clarify what XAI is not.

What are ML models? They are models of some system/phenomenon/process, trained on data collected from various aspects of that system so that they can optimally make predictions or decisions. Many such systems can be viewed at different levels of granularity; that is, there are many upstream mechanisms whose interactions and aggregation give rise to mechanisms at new levels of granularity. For instance, let’s consider some supervised-learning problems within a medical context. One can view medicine/health starting at the cellular level and going all the way to the population-health level and beyond (i.e., zooming in and out). The data, therefore, can at times come from one or multiple such levels of granularity (what the field refers to as “multi-omics” data). For instance, one can use genetic data to predict the onset of a cancer, or use the social and environmental determinants of health to predict the risk of diabetes. In such cases, a curious mind might desire multiple explanations, which can probably be boiled down to two representative questions:

  • Q1 (re the system’s mechanism): Can the model explain the (upstream) mechanisms underlying the system (e.g., why some people get certain diseases)?
  • Q2 (re the model’s explanation): Can the model’s input/output (I/O) process be explained (e.g., how the inputs representing a particular patient led to a high/low risk prediction, and which input variables played what roles in that prediction)? A minimal sketch of such an explanation follows this list.
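
To make Q2 concrete, here is a minimal sketch in Python (scikit-learn on purely synthetic data; the features, model choice, and “risk” outcome are all hypothetical) of one common flavour of Q2-style explanation: asking which input variables played what roles in a trained model’s predictions.

```python
# A hypothetical, minimal illustration of a Q2-style explanation:
# which inputs mattered for a trained model's predictions?
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic "patients": a handful of numeric features and a binary outcome
# standing in for a high/low-risk label (no real data involved).
X, y = make_classification(n_samples=2000, n_features=5, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does shuffling each input degrade the
# model's test performance? This explains the model's I/O behaviour (Q2);
# by itself it says nothing about the system's mechanisms (Q1).
result = permutation_importance(model, X_test, y_test, n_repeats=20,
                                random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature_{i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```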

I believe that XAI is concerned with Q2, not Q1: an ML model can be said to be explainable if it can explain the relationships it has found among the variables it deals with. Of course, ML models trained on the data generated by a system have the potential to help generate new hypotheses about that system’s underlying mechanisms, and/or to provide evidence for or against existing hypotheses about those mechanisms. However, we cannot say an ML model is not explainable simply because it cannot explain the mechanisms underlying the system from which its training data came. Furthermore, this does not mean that we should stop asking questions like Q1; it rather means that an ML model’s inability to uncover the underlying mechanisms of a system should not be the reason to rule it out as a viable solution for problems related to that system.

Explainable AI should be concerned with explaining the AI model itself (i.e., how a particular combination of input variables leads to a certain prediction) — not necessarily explaining the mechanisms underlying the systems whose data the AI has been trained on.
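
To illustrate this distinction with a toy, entirely simulated example (all variable names are hypothetical), the sketch below trains a model on a variable that merely correlates with the true driver of the outcome. The resulting coefficients are a perfectly faithful explanation of the model’s I/O behaviour (Q2), yet they say nothing reliable about the system’s mechanism (Q1).

```python
# A simulated example of why answering Q2 is not the same as answering Q1.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
cause = rng.normal(size=n)                 # the true driver of the outcome
proxy = cause + 0.1 * rng.normal(size=n)   # correlated with the driver, not causal itself
noise = rng.normal(size=n)                 # an irrelevant variable
y = (cause + 0.5 * rng.normal(size=n) > 0).astype(int)

# The model never observes `cause`; it only sees the proxy and the noise.
X = np.column_stack([proxy, noise])
model = LogisticRegression().fit(X, y)

# The coefficients are a faithful Q2 explanation of the model: almost all
# the predictive weight sits on `proxy` ...
print(dict(zip(["proxy", "noise"], model.coef_[0].round(2))))
# ... but that does not make `proxy` part of the system's mechanism (Q1);
# here, by construction, it is not.
```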

Lastly, you might have already heard the famous expression “all models are wrong, but some are useful”, a common aphorism in statistics (often attributed to George Box) that makes a broader point about scientific methods and the process of scientific discovery. It is particularly true in the early days of a line of research. Of course, ML models are no exception; hence, when deciding whether to adopt them, the distinction between ‘being right or wrong about a mechanism’ and ‘being useful for solving a problem’ is an extremely important one.

This was the first post in a series I have decided to write on XAI and a range of related topics. Given the vagueness of many definitions and the extreme heterogeneity of viewpoints on this topic, I hope to borrow from the wisdom of the crowd (through your comments, disagreements, and discussions on these posts) and eventually write a more comprehensive paper that captures a broader set of perspectives. Please comment, share, and move the conversation forward. Of course, these are personal opinions and do not necessarily reflect the viewpoints of the institutions I am affiliated with.

Reza Khorshidi

Chief Scientist at AIG, and PI at University of Oxford’s Deep Medicine Program; interested in Machine Learning in Biomedicine and FinTech