causal interpretation of empowerment (TEST 2)
Published:
Edited: [2025-04-29 Tue 10:16]
Note: this is not a real post (site under construction), but the content comes from my notes nonetheless
1. Introduction
Empowerment is a measure of the potential causal influence that an actuator has on a sensor for a particular agent. A prerequisite for computing empowerment is to have identified the causal channel of interest, i.e., the one for which there is a causal information flow. In other words, if one is given the joint probability distribution \(p(X, Y)\) but does not know which of the two variables is an actuator and which one is a sensor, then empowerment cannot be computed because the relevant (and actual) causal channel is unknown.
Building on the work of [1], the authors of [2] seem to suggest that, to identify the causal channel of interest, one has to probe the observational distribution \(p(X, Y)\) by intervening on \(X\) (see [2, p. 10]). It is not entirely clear what they mean here:
- One interpretation is that to identify the actuator-sensor causal path one has to perform an intervention on one of the two variables and assess whether there is a causal effect on the other. This could be understood as a rough way of performing causal discovery in the bivariate case (see [3]).
- Another interpretation is that, assuming that \(X\) stands for the actuator, one has to perform an atomic intervention on it to make sure that the causal information flow from the actuator to the sensor is isolated from all the other information flows that could be present, e.g., via the parents of \(X\) in a hypothetical causal graphical model representing the perception-action loop of an agent.
The two interpretations correspond to two distinct problems, both of which must be solved in order to compute empowerment. The second seems the more likely reading of what the authors had in mind. In fact, they go on to define the causal information flow as the mutual information between \(X\) and \(Y\) with respect to the interventional probability distribution \(p(Y|\text{do}(X = x))p(\text{do}(X = x))\) (the factorized joint under the atomic intervention on \(X\)). Using the interventional distribution in the computation of the mutual information is supposed to capture all the causal information flow from \(X\) to \(Y\), and only that.
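To make the difference between the observational and the interventional mutual information concrete, here is a minimal sketch on a toy model of my own (it is not taken from [1] or [2]). I assume a binary confounder \(Z\) with edges \(Z \to X\) and \(Z \to Y\) and no edge \(X \to Y\), and I assume that \(Z\) is the only parent of \(X\), so that \(p(y|\text{do}(x)) = \sum_z p(z)\,p(y|x,z)\). All the numbers, variable names, and the uniform choice of \(p(\text{do}(X = x))\) are hypothetical.

```python
import math
from itertools import product


def mutual_information(joint):
    """Mutual information in bits for a joint given as {(x, y): probability}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Hypothetical toy model: Z -> X and Z -> Y, with no edge X -> Y.
p_z = {0: 0.5, 1: 0.5}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}  # indexed as [z][x]


def p_y_given_xz(y, x, z):
    # Y copies Z with probability 0.9 and ignores x entirely.
    return 0.9 if y == z else 0.1


# Observational joint: p(x, y) = sum_z p(z) p(x|z) p(y|x,z).
observational = {}
for x, y, z in product((0, 1), repeat=3):
    observational[(x, y)] = observational.get((x, y), 0.0) + (
        p_z[z] * p_x_given_z[z][x] * p_y_given_xz(y, x, z)
    )

# Interventional joint: p(do(x)) * sum_z p(z) p(y|x,z).
# The uniform p(do(x)) is an arbitrary choice made for this illustration.
p_do_x = {0: 0.5, 1: 0.5}
interventional = {}
for x, y, z in product((0, 1), repeat=3):
    interventional[(x, y)] = interventional.get((x, y), 0.0) + (
        p_do_x[x] * p_z[z] * p_y_given_xz(y, x, z)
    )

observational_mi = mutual_information(observational)
interventional_mi = mutual_information(interventional)
print(f"observational  I(X;Y) = {observational_mi:.4f} bits")
print(f"interventional I(X;Y) = {interventional_mi:.4f} bits")
```

In this toy model the observational mutual information is positive only because of the confounder, while the interventional one vanishes (up to floating-point error), which is the sense in which the intervention isolates the flow from \(X\) to \(Y\).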
2. The Main Bit
Crucially, the definition of empowerment requires something more, namely, that we pick an intervention such that the causal information flow from \(X\) to \(Y\) is maximal. In other words, empowerment is the maximal potential information flow that could be induced by an appropriate intervention on \(X\) (equivalently, by an appropriate choice of interventional distribution). This quantity is nothing other than the capacity of the channel from \(X\) to \(Y\).
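For discrete variables, a channel capacity of this kind can be computed numerically with the Blahut-Arimoto algorithm. The sketch below is my own illustration and not the procedure used in [2]. It assumes that the interventional channel \(p(y|\text{do}(x))\) is already available as a row-stochastic table; the function name `blahut_arimoto`, the table layout `channel[x][y]`, and the tolerance are all hypothetical choices.

```python
import math


def blahut_arimoto(channel, tol=1e-9, max_iter=10_000):
    """Estimate channel capacity with the Blahut-Arimoto iteration.

    channel[x][y] is assumed to hold p(y | do(x)); every row must sum to 1.
    Returns (capacity estimate in bits, input distribution p(do(x))).
    The estimate is the lower bound from the last iteration performed.
    """
    n_x, n_y = len(channel), len(channel[0])
    p_x = [1.0 / n_x] * n_x  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output marginal induced by the current input distribution.
        q_y = [sum(p_x[x] * channel[x][y] for x in range(n_x)) for y in range(n_y)]
        # KL divergence (in nats) between each channel row and the marginal.
        d = [
            sum(
                channel[x][y] * math.log(channel[x][y] / q_y[y])
                for y in range(n_y)
                if channel[x][y] > 0
            )
            for x in range(n_x)
        ]
        lower = sum(p * dx for p, dx in zip(p_x, d))  # mutual information at p_x
        upper = max(d)  # standard upper bound on the capacity
        if upper - lower < tol:
            break
        # Multiplicative update: p(x) <- p(x) * exp(d[x]), then renormalize.
        weights = [p * math.exp(dx) for p, dx in zip(p_x, d)]
        total = sum(weights)
        p_x = [w / total for w in weights]
    return lower / math.log(2), p_x


# Binary symmetric channel with flip probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
bsc_capacity, bsc_input = blahut_arimoto(bsc)
print(f"BSC(0.1) capacity = {bsc_capacity:.4f} bits, p(do(x)) = {bsc_input}")
```

Under the reading above, the returned capacity would play the role of the empowerment, and the returned input distribution is the choice of \(p(\text{do}(X = x))\) that attains the maximum.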
However, the treatment in [2] once again becomes confusing because it does not distinguish between atomic and stochastic interventions. When the authors discuss the individuation of the causal information channel, they seem to invoke the concept of atomic intervention. Afterwards, when they discuss empowerment, the intervention notation disappears. However, since they take a maximum over probability distributions of the action variable, they seem to assume that these are interventional distributions:
\[ C(X \rightarrow Y) = \max_{p(\text{do}(X = x))} I(Y; X). \]
Once the correct causal information flow has been identified, the most likely interpretation is…