Add ability to analyse Regression Kink Analysis Designs #264

Merged on Nov 9, 2023 (29 commits)

Collaborator

@drbenvincent drbenvincent commented Oct 20, 2023

This is my attempt to implement analysis methods for the Regression Kink Design. Happy to get any feedback on the work so far.

  • First draft of notebook/docs
  • Implement RegressionKink class
    • First attempt
    • Add docstring for the RegressionKink class
    • Add tests
    • Refactor. At the moment this whole class is copied and pasted from the RegressionDiscontinuity class, so there is a lot of duplication. The main substantive difference is that we evaluate the fitted function to estimate the change in slope either side of the kink. Will create an issue about this because it gets into wider refactoring issues.
    • Fix summary method
  • Add entry to Glossary + add reference back to glossary

Closes #223

NOTE: The example notebook does not try to act as an intro/explainer for regression kink designs. It assumes the reader has some background knowledge about what they are and when you'd use them. Happy to get input on whether it should have more introductory info, but there's Discussion #262 on whether we should add a specific explainer section to the docs, which would allow the notebooks to remain concise.
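For readers skimming the thread, the "change in slopes either side of the kink" from the task list can be sketched numerically. This is a minimal illustration and not the code in this PR: `reg_kink_linear`, `gradient_change`, the `beta` values, and the use of a small step called `epsilon` are all assumptions made for the example.

```python
def reg_kink_linear(x, beta, kink):
    """Hypothetical piecewise-linear function whose slope changes by beta[2] at `kink`."""
    treated = 1.0 if x >= kink else 0.0
    return beta[0] + beta[1] * x + beta[2] * (x - kink) * treated


def gradient_change(f, kink, epsilon=0.001):
    """Slope just right of the kink minus slope just left of it (finite differences)."""
    y_left = f(kink - epsilon)
    y_at = f(kink)
    y_right = f(kink + epsilon)
    slope_left = (y_at - y_left) / epsilon
    slope_right = (y_right - y_at) / epsilon
    return slope_right - slope_left


# Illustrative values only: beta[2] is the true change in slope at the kink.
beta = [0.0, -1.0, 2.0]
kink = 0.5
estimate = gradient_change(lambda x: reg_kink_linear(x, beta, kink), kink)
print(f"estimated change in gradient: {estimate:.3f}")
```

For a piecewise-linear function the finite-difference estimate matches `beta[2]` up to floating-point error. In a Bayesian workflow the same calculation would presumably be applied to each posterior draw of the fitted function.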

codecov bot commented Oct 20, 2023

Codecov Report

Merging #264 (4386f6b) into main (7ebab10) will increase coverage by 1.50%.
The diff coverage is 85.98%.

@@            Coverage Diff             @@
##             main     #264      +/-   ##
==========================================
+ Coverage   74.41%   75.91%   +1.50%     
==========================================
  Files          20       20              
  Lines        1157     1308     +151     
==========================================
+ Hits          861      993     +132     
- Misses        296      315      +19     
| Files | Coverage Δ |
|---|---|
| causalpy/tests/test_input_validation.py | 100.00% <100.00%> (ø) |
| causalpy/tests/test_integration_pymc_examples.py | 100.00% <100.00%> (ø) |
| causalpy/tests/test_pymc_experiments.py | 100.00% <100.00%> (ø) |
| causalpy/pymc_experiments.py | 69.11% <73.80%> (+1.32%) ⬆️ |

@drbenvincent drbenvincent changed the title [WIP] Add Regression Kink Analysis Add Regression Kink Analysis Nov 2, 2023
@drbenvincent drbenvincent changed the title Add Regression Kink Analysis Add ability to analyse Regression Kink Analysis Designs Nov 2, 2023
@drbenvincent drbenvincent added the enhancement New feature or request label Nov 2, 2023

review-notebook-app bot commented Nov 5, 2023

View / edit / reply to this conversation on ReviewNB

NathanielF commented on 2023-11-05T10:32:48Z
----------------------------------------------------------------

This is nice. Good demonstration of the idea.


NathanielF commented on 2023-11-05T10:32:49Z
----------------------------------------------------------------

Add a comment above the code where you specify the betas to indicate that this is where 2 is set as the gradient at the kink point.


drbenvincent commented on 2023-11-06T10:25:37Z
----------------------------------------------------------------

Done

NathanielF commented on 2023-11-05T10:32:49Z
----------------------------------------------------------------

I like this! Didn't realise patsy was so powerful.


NathanielF commented on 2023-11-05T10:32:50Z
----------------------------------------------------------------

Nice!


NathanielF commented on 2023-11-05T10:32:51Z
----------------------------------------------------------------

Same as above: add a comment to the code block specifying where you fix the gradient to be recovered.


drbenvincent commented on 2023-11-06T10:52:41Z
----------------------------------------------------------------

As far as I can tell, when we use a quadratic, none of the individual parameters corresponds to the change in gradient at the kink point, so I've relied on calculating that numerically. But I have added a comment about how the betas translate into different equations on the left and right of the kink point.
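To illustrate why a quadratic may have no single coefficient equal to the change in gradient, here is a small sketch. It assumes a parameterisation that is not taken from the notebook: a separate quadratic on each side of the kink, joined so the curve is continuous. The names `quadratic`, `analytic_gradient_change`, and `numerical_gradient_change`, plus all coefficient values, are hypothetical.

```python
def quadratic(coefs, x):
    """Evaluate coefs[0] + coefs[1] * x + coefs[2] * x**2."""
    return coefs[0] + coefs[1] * x + coefs[2] * x**2


def analytic_gradient_change(left, right, kink):
    """Right-hand derivative minus left-hand derivative at the kink."""
    slope_left = left[1] + 2 * left[2] * kink
    slope_right = right[1] + 2 * right[2] * kink
    return slope_right - slope_left


def numerical_gradient_change(left, right, kink, epsilon=1e-4):
    """One-sided finite differences either side of the kink."""
    slope_left = (quadratic(left, kink) - quadratic(left, kink - epsilon)) / epsilon
    slope_right = (quadratic(right, kink + epsilon) - quadratic(right, kink)) / epsilon
    return slope_right - slope_left


kink = 0.5
left = [1.0, -1.0, 0.5]  # illustrative coefficients left of the kink
right_linear, right_quadratic = 1.0, -2.0  # illustrative coefficients right of the kink
# Choose the right-hand intercept so that both curves meet at the kink.
right_intercept = quadratic(left, kink) - (right_linear * kink + right_quadratic * kink**2)
right = [right_intercept, right_linear, right_quadratic]

analytic = analytic_gradient_change(left, right, kink)
numeric = numerical_gradient_change(left, right, kink)
print(f"analytic: {analytic:.4f}, numerical: {numeric:.4f}")
```

Under this assumed parameterisation the change in gradient combines four coefficients and the kink location, which is one reason to compute it numerically from the fitted curve. The one-sided differences carry an error proportional to `epsilon`, so the two values agree only approximately.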

NathanielF commented on 2023-11-05T10:32:51Z
----------------------------------------------------------------

Is the forest plot redundant if you're doing the posterior plot too?


drbenvincent commented on 2023-11-06T10:55:55Z
----------------------------------------------------------------

Yep, I've removed the forest plot

NathanielF commented on 2023-11-05T10:32:52Z
----------------------------------------------------------------

Maybe be a bit more explicit that this is not a new data set but that you're just showing basis splines applied to the last example and intending to recover the same parameters again.


drbenvincent commented on 2023-11-06T10:59:04Z
----------------------------------------------------------------

done

NathanielF commented on 2023-11-05T10:32:53Z
----------------------------------------------------------------

Can you round the -1.14 to -1.1 to show the same value is being recovered in the posterior plot and your more customised timeseries one...?

Just a nit.


drbenvincent commented on 2023-11-06T11:07:33Z
----------------------------------------------------------------

Sure. What I've actually done is to show 2 decimal places in the ArviZ plot. At the moment, rounding to 1 decimal place in the plot method would require doing that for everything. So I'm just about to add an issue to add a round_to kwarg to the plot methods in CausalPy.
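As a rough idea of what a `round_to` kwarg could look like, here is a sketch of a label formatter. It is not CausalPy code: `format_estimate` is a hypothetical helper, and treating `round_to` as a number of decimal places is an assumption.

```python
def format_estimate(label, value, round_to=None):
    """Format a point estimate for a plot title.

    `round_to` is assumed to mean decimal places; None leaves the value as-is.
    """
    if round_to is None:
        return f"{label} = {value}"
    return f"{label} = {value:.{round_to}f}"


one_decimal = format_estimate("Change in gradient", -1.14, round_to=1)
two_decimals = format_estimate("Change in gradient", -1.14, round_to=2)
print(one_decimal)
print(two_decimals)
```

Passing the precision per call, rather than fixing it inside the plot method, would let one figure use 1 decimal place without changing every other number that the method prints.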

@NathanielF
Contributor

Notebook and experiment design look great. Added a few comments. Mostly nits.

Contributor

@NathanielF NathanielF left a comment

Approved. Feel free to change/address the comments above as you like. Mostly they were small points of clarification. Major work looks great. Cool addition.

{self.running_variable_name: xi, "treated": self._is_treated(xi)}
)
# self.x_pred = pd.DataFrame({self.running_variable_name: xi})
(new_x,) = build_design_matrices([self._x_design_info], self.x_pred)
Contributor

I'm sure there is a reason, but why is the output wrapped in braces and then immediately converted into an array?

Collaborator Author

build_design_matrices returns both the x and y design matrices as a tuple, so we are just grabbing the x matrix here. And by default these are as dataframes. So you can either provide the return_type='matrix' argument to build_design_matrices, or manually do it yourself with np.asarray as I did here.
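The `(new_x,) = ...` line in the snippet under review is ordinary Python unpacking of a one-element sequence. The sketch below shows the idiom with a stand-in function. It does not call patsy: `build_matrices` is hypothetical and simply returns one nested list per design info it is given, on the assumption that passing a single design info yields a single result.

```python
def build_matrices(design_infos, data):
    """Hypothetical stand-in: build one list-of-rows 'matrix' per design info."""
    matrices = []
    for column_names in design_infos:
        matrix = [[row[name] for name in column_names] for row in data]
        matrices.append(matrix)
    return matrices


data = [{"x": 0.1, "treated": 0}, {"x": 0.9, "treated": 1}]
x_info = ["x", "treated"]

# Unpacking form: succeeds only if exactly one matrix comes back.
(new_x,) = build_matrices([x_info], data)

# Indexing form: same result, but silently ignores any extra matrices.
same_x = build_matrices([x_info], data)[0]

# With two design infos the unpacking form fails loudly.
unpack_failed = False
try:
    (only,) = build_matrices([x_info, ["x"]], data)
except ValueError:
    unpack_failed = True
```

The unpacking form doubles as a cheap sanity check, because it raises `ValueError` as soon as the call returns anything other than exactly one item.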

y = reg_kink_function(x, beta, kink) + rng.normal(0, sigma, N)
df = pd.DataFrame({"x": x, "y": y, "treated": x >= kink})
# run experiment
result = cp.pymc_experiments.RegressionKink(
Contributor

Kind of neat that the kink model calls the linear model!



@pytest.mark.integration
def test_rkink_bandwidth():
Contributor

I don't think you have an example in the notebook with the bandwidth parameter? Maybe worth adding/explaining

Collaborator Author

Good idea

A PyMC model
:param running_variable_name:
The name of the predictor variable that the kink_point is based upon
:param epsilon:
Contributor

I know you suggested not having the notebook as a place for explaining the theory of kink designs, but I feel like the effect of tweaking at least one of the epsilon/bandwidth parameters could be mentioned or shown.

It's fine if tweaking them isn't needed for your example, but it'd be good to hint at why they are there.

Collaborator Author

I've added an example to demonstrate use of the bandwidth parameter. And I've added an admonition box to explain what epsilon does.
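For context, a bandwidth parameter in this kind of design usually restricts the analysis to observations near the kink. The sketch below shows that idea only. It is an assumption about the behaviour rather than the implementation in this PR, and `filter_by_bandwidth` and the sample rows are hypothetical.

```python
import math


def filter_by_bandwidth(rows, kink_point, bandwidth=math.inf, running_variable="x"):
    """Keep rows whose running variable lies within `bandwidth` of the kink point."""
    lower = kink_point - bandwidth
    upper = kink_point + bandwidth
    return [row for row in rows if lower <= row[running_variable] <= upper]


rows = [{"x": x_value} for x_value in (0.0, 0.3, 0.5, 0.7, 1.0)]

all_rows = filter_by_bandwidth(rows, kink_point=0.5)
near_kink = filter_by_bandwidth(rows, kink_point=0.5, bandwidth=0.25)
print([row["x"] for row in near_kink])
```

A narrower bandwidth trades sample size for a fit that is less influenced by observations far from the kink, which is the sort of trade-off a notebook example could make visible.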

@drbenvincent
Collaborator Author

Thanks for the feedback @NathanielF. I think I've dealt with all the suggestions. I've also added the regression kink designs to the README.md and index.rst (for readthedocs) which I forgot to do before.

Collaborator

@juanitorduz juanitorduz left a comment

Some minor comments on the core component code style to make the computations more readable (and testable).

@drbenvincent
Collaborator Author

Thanks for the review @juanitorduz. I've made a stab at modularising the change in gradient calculation code and added some tests. Though I'm not sure if it hits what you were thinking of.

Collaborator

@juanitorduz juanitorduz left a comment

Looks much nicer! Thanks for the refactor. Left a single comment :)

Collaborator

@juanitorduz juanitorduz left a comment

Nice addition! Thank you @drbenvincent 🙌

@drbenvincent drbenvincent merged commit 24ad46e into main Nov 9, 2023
10 checks passed
@drbenvincent drbenvincent deleted the kink branch November 9, 2023 10:26
Development

Successfully merging this pull request may close these issues.

Add ability to analyse Regression Kink Designs
3 participants