Setting the record straight on explainable AI (3rd out of N): Is linear regression really “explainable”?

Reza Khorshidi
6 min read · Jul 21, 2021

According to the Oxford Dictionary, explain /ɪkˈspleɪn/ is a verb meaning “make an idea or situation clear to someone by describing it in more detail or revealing relevant facts”. Assuming that explanation in XAI (explainable AI) means the same thing, I think two key elements of this definition (i.e., “someone” and “relevant facts/details”) are usually overlooked in XAI debates. Furthermore, the criteria for success of such explanations are not discussed as objectively (or as often) as needed; this has led to many different (and at times misaligned) views among researchers in various subfields of AI and the related social sciences (as well as among stakeholders). Therefore, in this post, I would like to discuss the common narrative that goes something like: “Modern ML/DL models are black boxes and the field needs to do more to make them explainable; linear regression models, on the other hand, are explainable (glass box).”

The “relevant facts/details”

The facts that are relevant to explaining ML models can range from the data they have been trained on and their performance, to their inner workings and the ways users interact with them. Given that some of these relevant details fall under other terms (e.g., transparency, UX design, and so on), in this article I will focus on explaining ML models’ learning/training (i.e., how the final model, f(x), is arrived at) and prediction/inference (i.e., how the final model, f(x), maps input data, x, to predictions, y).

Two key components of ML pipelines are training/learning (of the models) and inference/prediction (using the models); more on this can be found here. In the training phase, a model uses the examples it is shown (i.e., data) to learn the mapping from an input to an output (by fine-tuning itself to make fewer/smaller errors as it is shown more and more examples); in the inference phase (which can be an actual live/production setting), the model uses that learning to predict the output for a new example/input.
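As a minimal sketch of these two phases (assuming a scikit-learn-style estimator; the variable names and data below are illustrative and not from this post), training corresponds to calling fit and inference to calling predict:

```python
# Minimal sketch of the two phases; the data are randomly generated for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))                     # 100 examples, 3 input features
y_train = X_train @ [1.0, -2.0, 0.5] + rng.normal(scale=0.1, size=100)

model = LinearRegression()
model.fit(X_train, y_train)                             # training/learning phase

x_new = rng.normal(size=(1, 3))                         # a new, unseen example
y_pred = model.predict(x_new)                           # inference/prediction phase
```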

The need for explanation

Cooperation between models and humans depends on trust; explainability and interpretability are commonly thought of as post hoc solutions to our inability to incorporate trust into ML’s learning and optimisation (a common challenge when dealing with a computationally irreducible context). Explainability is expected to provide a range of stakeholders with the ability to: carry out model due diligence; learn the explanation and repeat it to others (i.e., memorising/internalising/understanding how the model works); collaborate with models (à la human-in-the-loop approaches, for instance) and decide when to trust them and when to discount them; implement, reproduce, and replicate the models; generate upstream hypotheses; and so on.

Linear regressions

Linear regression models the relationship between a scalar response and one or more explanatory variables by mapping a group of numbers (i.e., x) to a single number (i.e., y) in a linear fashion (that is, y = f(x) [+ noise]). For the rest of this article, let’s consider ordinary least squares (OLS) regression (probably the simplest of such models) in a hypothetical medical task: predicting the change in blood pressure (Y) due to a certain treatment, using total cholesterol (X1), BMI (X2), and resting heart rate (X3).
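Written out (with a0 denoting the intercept, a1–a3 the coefficients, and ε the noise term), the hypothetical model is simply:

$$Y = a_0 + a_1 X_1 + a_2 X_2 + a_3 X_3 + \varepsilon$$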

OLS is one of the simplest models the ML community can use when mapping input vectors to scalar outputs. In this hypothetical example, we have data from multiple individuals’ blood pressure treatments; for each person, we know their total cholesterol, BMI, and resting heart rate, as well as the outcome of the blood pressure treatment (i.e., a dataset consisting of four vectors, each with length equal to the number of people in our study). Training an OLS model means learning the best a values (i.e., [a0, a1, a2, a3]); inference in OLS means using the learned a values to map any 3-number input vector from one person (i.e., [x1, x2, x3]) to their predicted treatment effect (i.e., y).
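Here is a minimal sketch of both phases on this hypothetical task; the data are simulated and the “true” a values are made up purely for illustration:

```python
# Minimal sketch: training (learning [a0, a1, a2, a3]) and inference for the
# hypothetical blood-pressure task. All numbers are simulated/illustrative.
import numpy as np

rng = np.random.default_rng(42)
n = 200                                      # number of people in our study

X = np.column_stack([
    rng.normal(5.2, 1.0, n),                 # X1: total cholesterol
    rng.normal(27.0, 4.0, n),                # X2: BMI
    rng.normal(70.0, 10.0, n),               # X3: resting heart rate
])
true_a = np.array([3.0, -1.5, -0.4, -0.1])   # made-up [a0, a1, a2, a3]
y = true_a[0] + X @ true_a[1:] + rng.normal(0.0, 2.0, n)  # Y: change in blood pressure

# Training: find the a values that minimise the sum of squared residuals
X_design = np.column_stack([np.ones(n), X])            # add an intercept column
a_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Inference: predict the treatment effect for one new person [1, x1, x2, x3]
x_new = np.array([1.0, 6.1, 31.0, 82.0])
y_hat = x_new @ a_hat
print(a_hat, y_hat)
```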

Explaining the learning/training

Explaining the training matters, as it leads to trust in the final model; one will find it difficult to trust the final model if one cannot trust the training process (the maths, the assumptions, and other training details). Furthermore, building such models without understanding their learning process is likely to violate the conditions under which they are supposed to operate/can be used.

When one hears “we trained an OLS model” for mapping X’s to Y, it implies statements such as: “we assumed the relationship between the variables is linear”; “we assumed that the residuals are iid and normally distributed”; “there is homoscedasticity and no autocorrelation”; “the parameters were learned so that they minimise the sum of squared residuals”; “we used the projection technique to arrive at the parameters”; “maximum likelihood was used to estimate the parameters”; “the generalised method of moments was employed”; “after the training, the diagnostic plots showed that …”; and more. You see where this is going? As far as a scientist is concerned, these are the types of sentences that will lead to the explanation, which in turn will help one trust, carry out due diligence on, and recreate the model. In other words, a meaningful explanation of the learning process will be a technical one; otherwise, it will be difficult for it to serve any purpose.
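To make the “minimise the sum of squared residuals” and “projection” statements above concrete, here is the textbook OLS formulation in the notation of our example (standard material, not anything specific to this post):

$$\hat{a} = \arg\min_{a}\ \sum_{i=1}^{n}\big(y_i - a_0 - a_1 x_{i1} - a_2 x_{i2} - a_3 x_{i3}\big)^2 = (X^\top X)^{-1} X^\top y$$

where X is the design matrix (a column of ones followed by the three input columns) and y is the vector of observed outcomes; under the iid Gaussian residual assumption, this least-squares/projection solution coincides with the maximum-likelihood estimate.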

The same holds for all of statistical ML (including decision trees, which are another group of very simple learning models) and for deep learning (DL) models. While we can try to provide some clues (and partial explanations) to everyone about the learning process (which, to me, is more about UX and sales), explaining anything specific (i.e., anything serving the aforementioned objectives) will inevitably enter the world of technical language.

Explaining the inference/prediction

Explaining the prediction logic is about explaining f(x), or the input-output mapping (as shown in the figure above). Explaining the function itself is probably the simplest of ML explanations (i.e., just a number of simple mathematical operations, such as sums and multiplications, applied to the input). However, the audience’s ability to remember/internalise such explanations will drop as the dimensionality of the input grows (i.e., the explanation will entail too many summations, multiplications, and so on).
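For instance, in three dimensions the entire prediction logic can be verbalised in a single line; reusing the made-up coefficients and input from the sketch above:

$$\hat{y} = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 = 3.0 + (-1.5)(6.1) + (-0.4)(31.0) + (-0.1)(82.0) = -26.75$$

With hundreds or thousands of inputs, that single sentence turns into hundreds or thousands of terms, which is exactly where internalising it breaks down.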

In another set of scenarios, the audience aims to use such explanations for model due diligence, trust, and understanding how to collaborate with the model. A common approach here is based on the interpretation of the coefficients. For instance, in our linear regression example, one can say “if X1 goes up one unit, Y goes up a1 units, provided X2 and X3 don’t change”; that is, one’s cholesterol going up without any change in BMI or resting heart rate: possible, but extremely rare. I think this is where a common explanation/interpretation mistake happens: forgetting the “if X2 and X3 don’t change” part. In other words, the common covariance among input variables challenges the utility of such interpretation approaches, and can lead the audience to wrong conclusions and eventual surprises.
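A small simulation makes the point (the correlation structure and coefficients below are invented for illustration and are not from this post): when X1 and X2 move together, the coefficient on X1 no longer matches what one actually observes in the data when X1 goes up.

```python
# Illustration: with strongly correlated inputs, the "X1 up one unit, all else
# fixed" reading of a coefficient can mislead. All numbers are made up.
import numpy as np

rng = np.random.default_rng(7)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)     # X2 moves almost in lockstep with X1
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

def ols_coefs(X, y):
    """Least-squares coefficients (with an intercept) via lstsq."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

both = ols_coefs(np.column_stack([x1, x2]), y)
only_x1 = ols_coefs(x1.reshape(-1, 1), y)

print("coefficient of X1 with X2 in the model:", round(both[1], 2))     # ~1.0
print("coefficient of X1 with X2 left out:   ", round(only_x1[1], 2))   # ~1.9
```

The fitted coefficient of X1 is about 1 when X2 is included, yet in this data a one-unit rise in X1 typically comes with a rise in X2 as well, so the outcome moves by roughly 1.9 units; quoting “X1 up one unit, Y up a1 units” without the ceteris paribus clause would set the wrong expectation.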

Conclusion

To recap:

  • The expectation from the ML community to explain their models is perfectly understandable.
  • The useful explanations (for due diligence, replication, internalising, communicating to others, upstream hypothesis generation, and so on) are very difficult (and in some cases, impossible) to communicate to those without the necessary technical knowledge. I think it is important to separate such explanations from the types that are communicated in sales and high-level stakeholder conversations, for instance.
  • Explaining the learning makes the community face a trade-off between oversimplification (which is likely to make the explanation less useful for what it is meant to serve, e.g., due diligence) and the correct/actual explanation (which is hard for many to understand).
  • Explaining the prediction logic, for the purpose of understanding the mapping of inputs to outputs, is probably the easiest of such explanations. However, as the dimensionality of the input grows, such explanations become difficult to internalise/memorise and hence lose their value for the audience (for distillation, interpretation, trust, …). Holistic/summary interpretations (such as coefficients in OLS regression) run the risk of being misunderstood, despite being so simple to verbalise.
  • Despite such difficulties, however, the ultimate objective here (i.e., an accurate predictive model that can be trusted) should be everyone’s aspiration. A potential solution can come from other domains such as medicine, which solved this trust problem through a combination of trust in people, institutions, and processes carried out by specialists to assess products and their underlying technologies. Another solution can emerge from up-skilling stakeholders and other interested parties.

TL;DR: I think OLS’s prediction logic is easy to explain (hence “explainable”) when operating in low-dimensional spaces; otherwise, neither training nor inference for OLS qualifies as “explainable” to a nontechnical audience. Nevertheless, this should not stop researchers from sharing insights (e.g., summary explanations regarding the coefficients) about their models to gain users’/stakeholders’ trust, as long as they are aware of, and minimise/manage, the risks associated with misunderstanding and wrong interpretations by their users.

This is the 7th article in a series of posts that I wrote about the development of AI-first products (and AI-first digital transformation) — challenges, opportunities, and more. I would love to hear your thoughts, and any learning and experiences that you and your company might have had in this space. Get in touch! Of course, this is a personal opinion and does not necessarily reflect the viewpoints of AIG or the University of Oxford.


Reza Khorshidi

Chief Scientist at AIG, and PI at University of Oxford’s Deep Medicine Program; interested in Machine Learning in Biomedicine and FinTech