Skip to content

Conversation

@NathanielF
Copy link
Contributor

@NathanielF NathanielF commented Oct 23, 2025

Adding a notebook to the knowledge base on structural causal modelling with Bayesian models.

I think it will be useful to highlight what is unique and special about bayesian causal modelling and lean into the structural causal modelling perspective. This would distinguish CausalPy more as a Bayesian package with a unique selling point. In the notebook i demonstrate joint causal models of outcome and treatment and will discuss links and contrast with the potential outcome framework.

image

Along the way i demonstrate the usefulness of spike and slab priors for improving identification with a species of joint causal model for both continuous and binary treatments. These could be added as distinct models or just as options for priors in existing models. I'm not entirely sure where to take this modelling and would like to discuss it when the draft is more complete and ready for review.

Questions for discussion.

  • Structural causal Modelling is present in CausalPy via the IV modelling implementation. But i wonder if we want to make the implementation more robust following some the details in this notebook e.g. variable selection priors or the explicit modelling of rho
  • CATE inference with BART treatment models are shown here
  • General thoughts about SCM and the link between potential outcomes and the way we gloss the relationship between these topics in CausalPy.

📚 Documentation preview 📚: https://causalpy--537.org.readthedocs.build/en/537/

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@codecov
Copy link

codecov bot commented Oct 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.59%. Comparing base (9a88b01) to head (eb05301).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #537      +/-   ##
==========================================
+ Coverage   95.48%   95.59%   +0.11%     
==========================================
  Files          29       29              
  Lines        2615     2681      +66     
==========================================
+ Hits         2497     2563      +66     
  Misses        118      118              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF NathanielF marked this pull request as ready for review October 28, 2025 11:01
Copy link
Collaborator

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will review the knowledge base article now. But can we just check correctness of pymc-bart (with hypen) to environment.yml and the import being import pymc_bart as pmb.

I'm actually thinking that if this is only used in docs, and because the docs are not built from scratch remotely, maybe we don't need this update to environment.yml?

Copy link
Collaborator

@drbenvincent drbenvincent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First-pass review. I like it. Here's a mix of high and low level thoughts at this point...

I'm wondering if we want to use the language of 'model selection' a bit more, especially in the introduction. Would it also be useful and instructive to discuss Savage-Dickey where we have a 'super model' and certain parameters (e.g. being zero) structurally alter the model so that we are effectively exploring different models via the parameter values.

I'd maybe add a legend under the diggity image and take some time to relate that back to the data simulation code. Basically use it as a breather/pause to allow the reader to keep up with the details. [Not sure if we can do a legend, so feel free to ignore the formatting point, but I think some more explanation would help readers.]

We can maybe add the hide-input cell tag on cells that are exclusively for plot generation? That's just a personal preference for docs.

Pedantic and python stylistic point, but fit_models could be made cleaner with a dictionary comprehension.

Can we get references in there for each of the prior types. Academic papers and/or useful blogposts. Maybe a brief description of each for those not familiar.

Not sure "reduced-form" is defined anywhere?

Can you be more specific with "schizphrenic posterior distribution"

Sorry if this is annoying, but can you add a glossary term (in glossary.rst) for "probabilistic programming"

Something iffy going on with formatting here: "Y_bin ~ T_bin + feature_0 + feature_1 + feature_2 + feature_3 + feature_4 + feature_5 + feature_6 + feature_7 +feature_8",

If there are any book chapters in Bayesian stats books which touch on this topic, it would be good to add them in as references.

Link or reference for the credibility revolution?

I think there's scope for a summary directed at practitioners. Are you advocating for the general practice of using these kinds of priors for sparsifying model structure? If so, are there concrete situations where that can be used, and how could a skeptic be convinced of it's utility above and beyond this being a form of basic feature selection when you throw all your variables into the causal soup?

Typos

  • empircal

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Copy link
Contributor Author

Thanks @drbenvincent all fair points. Will review each.

@NathanielF
Copy link
Contributor Author

On the BART stuff, yeah if the PR is just this knowledge base article. Maybe not needed in environment... but if we wanted to add CATE functionality it might be a good default for flexible approximation with a bayesian flavour

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Copy link
Contributor Author

NathanielF commented Oct 29, 2025

I'm wondering if we want to use the language of 'model selection' a bit more, especially in the introduction. Would it also be useful and instructive to discuss Savage-Dickey where we have a 'super model' and certain parameters (e.g. being zero) structurally alter the model so that we are effectively exploring different models via the parameter values.

I've added model comparison assessments via the LOO metric, but on the performance scores the models aren't really distinguished much. I prefer the focus on the model comparison as a sensitivity check so i've stated that here too. But i've added Savage Dickey plots to the NHEFS example comparing against the null of the OLS estimate.

I'd maybe add a legend under the diggity image and take some time to relate that back to the data simulation code. Basically use it as a breather/pause to allow the reader to keep up with the details. [Not sure if we can do a legend, so feel free to ignore the formatting point, but I think some more explanation would help readers.]

I've added an explanation.

We can maybe add the hide-input cell tag on cells that are exclusively for plot generation? That's just a personal preference for docs.

Hidden most if not all plotting logic.

Not sure "reduced-form" is defined anywhere?

Can you be more specific with "schizphrenic posterior distribution"

Added clarification on both points

Sorry if this is annoying, but can you add a glossary term (in glossary.rst) for "probabilistic programming"

done

Something iffy going on with formatting here: "Y_bin ~ T_bin + feature_0 + feature_1 + feature_2 + feature_3 + feature_4 + feature_5 + feature_6 + feature_7 +feature_8",

Fixed.

Added a bunch of references.

@NathanielF
Copy link
Contributor Author

I think there's scope for a summary directed at practitioners. Are you advocating for the general practice of using these kinds of priors for sparsifying model structure? If so, are there concrete situations where that can be used, and how could a skeptic be convinced of it's utility above and beyond this being a form of basic feature selection when you throw all your variables into the causal soup?

I was primarily thinking of these as a kind of sensitivity analysis tool for cases with lots of covariates. Not as a magical salve to replace theory driven instrument selection. Re: summary directed at practitioners i can add one at the end before the conclusion, but i'll keep it brief.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Copy link
Contributor Author

For point of discussion I think after this PR i'd propose to add a few issues around

  1. The possible inclusion of a small utility function or class to add variable selection priors to causalpy that can be used across various model classes

  2. Strategise about how to incorporate CATE estimation routines. Here I've used a very simple interaction model. But we should have a more general approach....

  3. Potential plan to augment the existing IV model with more explicit parameterisation of e.g. covariance structure and functionality for CATE and variable selection priors.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants