ketan agrawal

causal inference
Last modified on May 09, 2022

observation

Links to “observation”

machine learning

Supervised learning is an example of observational inference – we’re just looking for associations between variables \(X\) and \(Y\). In other words, we’re just learning \(P(Y \mid X)\).
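To make that concrete, here’s a minimal numpy sketch (my own illustration, with made-up coefficients): when a confounder \(Z\) drives both \(X\) and \(Y\), the observational \(P(Y \mid X)\) that supervised learning estimates disagrees with the interventional \(P(Y \mid do(X))\).

```python
# Toy sketch: observational vs. interventional inference under confounding.
# All coefficients are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Data-generating process: Z -> X, Z -> Y, and X -> Y with true effect 1.0
Z = rng.normal(size=n)
X = 2.0 * Z + rng.normal(size=n)
Y = 1.0 * X + 3.0 * Z + rng.normal(size=n)

# Observational inference (what supervised learning sees): regress Y on X
obs_slope = np.cov(X, Y)[0, 1] / np.var(X)

# Intervention: set X exogenously -- do(X) -- severing the Z -> X edge
X_do = rng.normal(size=n)
Y_do = 1.0 * X_do + 3.0 * Z + rng.normal(size=n)
int_slope = np.cov(X_do, Y_do)[0, 1] / np.var(X_do)

print(f"observational slope: {obs_slope:.2f}")   # ~2.2 -- biased by Z
print(f"interventional slope: {int_slope:.2f}")  # ~1.0 -- the true causal effect
```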

I feel like this thread captures a really interesting divide in philosophies of machine learning research.

My goal now is to deeply understand the issues at hand in this thread. I found his mention of factor graphs in the shift to reasoning-and-planning AI thought-provoking. I feel that causality, factor graphs, and Bayesian methods are all very important; I just don’t know quite enough yet to put the pieces together.

intervention

Links to “intervention”

ablation studies

Ablation studies effectively use interventions (removing parts of your system) to reveal its underlying causal structure. François Chollet (creator of Keras) writes about this being useful in a machine learning context.
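As a toy sketch of the idea (my own illustration, not Chollet’s code): ablate one piece of a system at a time – here, input features of a small classifier – and see how much performance drops.

```python
# Toy ablation study: knock out one input feature at a time and
# measure the performance drop relative to the full model.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
features = ["sepal_len", "sepal_wid", "petal_len", "petal_wid"]

def score(cols):
    model = LogisticRegression(max_iter=1000)
    return cross_val_score(model, X[:, cols], y, cv=5).mean()

baseline = score(list(range(X.shape[1])))
print(f"full model accuracy: {baseline:.3f}")

for i, name in enumerate(features):
    kept = [j for j in range(X.shape[1]) if j != i]
    drop = baseline - score(kept)
    # A big drop suggests the feature plays a real causal role
    # in the classifier's performance.
    print(f"ablating {name}: drop = {drop:+.3f}")
```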

Nancy Kanwisher (A roadmap for research > Causal role?)

fMRI is nice…but what’s the causal role of these regions? We don’t just want correlations between brain activation and behavior.

We need experiments where we “poke” part of the system – intervention. E.g., Schalk et al.:

If he’s looking at a face, the face changes. If he’s looking at something else, the stimulation adds a face to that object.

“Poking the face area” results in weird, weird face stuff happening to the patient.

Stimulating color regions – he saw a rainbow (wtfff)

Towards Causal Representation Learning (Independent mechanisms)

Hypothesis: We can explain the world by the composition of informationally independent pieces/modules/mechanisms. (Note: not statistically independent, but independent s.t. any causal intervention would affect just one such mechanism.)
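A toy Python sketch of this idea (my own example, not from the paper): the world model factors into separate mechanism functions, and an intervention replaces exactly one of them while the others are reused unchanged.

```python
# Independent mechanisms as composable functions: intervening on one
# variable swaps out its mechanism; the other mechanisms are untouched.
import numpy as np

rng = np.random.default_rng(0)

def f_altitude():                 # mechanism 1: P(altitude)
    return rng.uniform(0, 3000)

def f_temperature(altitude):      # mechanism 2: P(temperature | altitude)
    return 15.0 - 0.0065 * altitude + rng.normal(0, 2)

def f_snow(temperature):          # mechanism 3: P(snow | temperature)
    return temperature < 0

def sample(do_temperature=None):
    altitude = f_altitude()
    temperature = (f_temperature(altitude) if do_temperature is None
                   else do_temperature)  # intervention replaces one mechanism
    return altitude, temperature, f_snow(temperature)

print(sample())                     # observational sample
print(sample(do_temperature=-5.0))  # interventional: do(temperature = -5)
```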

Towards Causal Representation Learning (Causal induction from interventional data)

How do we handle an unknown intervention? Infer it.
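A hedged toy sketch of what “infer it” could mean (my own heuristic, not the paper’s actual method): fit each mechanism on observational data, then, given a batch from an unknown intervention, flag the conditional whose fit degrades the most.

```python
# Toy heuristic for inferring an unknown intervention target in Z -> X -> Y:
# the intervened mechanism is the one whose conditional fit shifts most.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

def simulate(intervene_x=False):
    z = rng.normal(size=n)
    # If intervened, X is set exogenously, ignoring its parent Z
    x = rng.normal(3.0, 1.0, n) if intervene_x else 0.8 * z + rng.normal(0, 0.5, n)
    y = 1.5 * x + rng.normal(0, 0.5, n)
    return z, x, y

def residual_var(target, parent):
    """Residual variance of a linear fit of target on its parent."""
    slope = np.cov(parent, target)[0, 1] / np.var(parent)
    return np.var(target - slope * parent)

# Fit the mechanisms X|Z and Y|X on observational data
z0, x0, y0 = simulate()
fit_x, fit_y = residual_var(x0, z0), residual_var(y0, x0)

# New batch from an unknown intervention (here, secretly on X)
z1, x1, y1 = simulate(intervene_x=True)
shift_x = abs(residual_var(x1, z1) - fit_x)
shift_y = abs(residual_var(y1, x1) - fit_y)
print("inferred intervention target:", "X|Z" if shift_x > shift_y else "Y|X")
```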

counterfactual

Links to “counterfactual”

Counterfactual Generative Networks

Neural networks like to “cheat” by using simple correlations that fail to generalize. E.g., image classifiers can learn spurious correlations with texture in the background, rather than the actual object’s shape; a classifier might learn that “green grass background” => “cow classification.”

This work decomposes the image generation process into three independent causal mechanisms – shape, texture, and background. Thus, one can generate “counterfactual images” to improve OOD robustness, e.g. by placing a cow on a swimming pool background. Related: generative models, counterfactuals
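As a toy sketch of the composition step (a drastic simplification of the paper, with flat colors standing in for learned generators): an image is assembled from a shape mask, an object texture, and a background, so swapping just the background yields a counterfactual image.

```python
# CGN-style composition, toy version:
#   image = mask * texture + (1 - mask) * background
# Swapping one mechanism (the background) produces a counterfactual image.
import numpy as np

H = W = 64

def shape_mask():  # toy "object" silhouette: a filled circle
    yy, xx = np.mgrid[0:H, 0:W]
    circle = (yy - H / 2) ** 2 + (xx - W / 2) ** 2 < (H / 4) ** 2
    return circle.astype(float)[..., None]

def flat_rgb(color):  # stand-in for a learned texture/background generator
    return np.broadcast_to(np.array(color, dtype=float), (H, W, 3))

def compose(mask, texture, background):
    return mask * texture + (1 - mask) * background

mask, cow_texture = shape_mask(), flat_rgb([0.4, 0.3, 0.2])
factual = compose(mask, cow_texture, flat_rgb([0.1, 0.7, 0.1]))         # cow on grass
counterfactual = compose(mask, cow_texture, flat_rgb([0.2, 0.5, 0.9]))  # cow on pool
print(factual.shape, counterfactual.shape)  # (64, 64, 3) each
```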
