Increase docstring coverage and add doctests #232

jpreszler · 2023-08-24T00:06:14Z

This addresses issue #129:

All methods in the api documentation have docstrings with working examples. Documentation was checked locally.
Interrogate has 97% coverage - only things missing are file level docstrings on some test files.
Interrogate failure threshold moved up to 85%

Potential future work:

some docstrings are still minimal, so they could be expanded - such as examples of data simulation functions
various plot methods don't have output examples in the API reference, links to static images could be added

Main items for review:

wording of documentation
consistency
docstring style

jpreszler · 2023-08-24T00:09:46Z

causalpy/pymc_experiments.py

@@ -70,6 +175,8 @@ def __init__(
        self._input_validation(data, treatment_time)

        self.treatment_time = treatment_time
+        # set experiment type - usually done in subclasses
+        self.expt_type = "Pre-Post Fit"


This is the only code change. The experiment type had to be added here in order for the summary method to be run on an instance of the PrePostFit class (rather than the SyntheticControl or InterruptedTimeSeries classes).
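A minimal sketch of why a default is needed, using hypothetical stand-in classes (`BaseWithoutType`, `BaseWithType`, and the body of `summary` are assumptions for illustration, not the CausalPy implementation):

```python
class BaseWithoutType:
    """Hypothetical base class that leaves expt_type to its subclasses."""

    def summary(self) -> str:
        # Relies on an attribute that only subclasses would define.
        return f"Experiment type: {self.expt_type}"


class BaseWithType(BaseWithoutType):
    """Same summary method, but a default is set in __init__."""

    def __init__(self) -> None:
        self.expt_type = "Pre-Post Fit"


def try_summary(obj) -> str:
    """Return the summary text, or the name of the error it raised."""
    try:
        return obj.summary()
    except AttributeError as err:
        return type(err).__name__


print(try_summary(BaseWithoutType()))  # AttributeError
print(try_summary(BaseWithType()))     # Experiment type: Pre-Post Fit
```

Without the default, calling `summary()` directly on the base class fails before any output is produced, which is what a docstring example on that class would hit.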

drbenvincent

Firstly, thanks so much for this @jpreszler. This is amazing work. Timing-wise, I wanted to do a relatively quick pass to keep the ball rolling. I've not yet done a thorough read of, for example, all the parameter explanations in the docstrings, so I will almost certainly have a few minor points about that before we merge.

One thought I had about the docstring examples concerns maintainability. I've not used doctest yet, but my feeling is that we will probably avoid major headaches in the future if we test these docstring examples with it (e.g. https://realpython.com/python-doctest/; https://docs.python.org/3/library/doctest.html). That may also require changes to the GitHub Actions workflow so that these tests are run both locally and remotely. If we run into trouble there, then @lucianopaz may be able to give you pointers.
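For readers new to doctest, here is a minimal sketch of the idea using only the standard library. The `add` function and the `run_examples` helper are hypothetical and not part of CausalPy; the helper simply parses a docstring and runs whatever examples it finds:

```python
import doctest


def add(a: int, b: int) -> int:
    """Add two integers.

    Example
    -------
    >>> add(2, 3)
    5
    """
    return a + b


def run_examples(func) -> doctest.TestResults:
    """Run the examples in ``func``'s docstring and return the counts."""
    parser = doctest.DocTestParser()
    # globs gives the example code access to the function under test.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_examples(add)
print(results.failed, results.attempted)  # 0 1
```

If the function's behaviour drifts away from the documented output, `failed` becomes non-zero, so stale examples are caught by the test run rather than by a reader.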

There's a comment in there about when/if we include text output in the docstring examples. I've noted down my first thought, but I'd definitely appreciate input from @juanitorduz and @lucianopaz on this. As I say, I've not used doctest, so it might be that we do need to include (some) text output.

PS: We're likely to merge #213 very soon, so there will be a couple of new classes added there.

I've rendered the docs locally with your changes and they look really great. This is an excellent contribution.

causalpy/skl_experiments.py

causalpy/pymc_models.py

codecov · 2023-08-24T10:27:11Z

Codecov Report

Merging #232 (c80d78e) into main (d7a12cb) will increase coverage by 0.02%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##             main     #232      +/-   ##
==========================================
+ Coverage   73.86%   73.89%   +0.02%     
==========================================
  Files          19       19              
  Lines        1148     1149       +1     
==========================================
+ Hits          848      849       +1     
  Misses        300      300

Files Changed	Coverage Δ
causalpy/data/datasets.py	`92.30% <ø> (ø)`
causalpy/data/simulate_data.py	`0.00% <ø> (ø)`
causalpy/plot_utils.py	`60.00% <ø> (ø)`
causalpy/pymc_models.py	`100.00% <ø> (ø)`
causalpy/skl_experiments.py	`66.86% <ø> (ø)`
causalpy/skl_models.py	`100.00% <ø> (ø)`
causalpy/tests/conftest.py	`100.00% <ø> (ø)`
causalpy/tests/test_data_loading.py	`100.00% <ø> (ø)`
causalpy/tests/test_input_validation.py	`100.00% <ø> (ø)`
causalpy/tests/test_integration_pymc_examples.py	`100.00% <ø> (ø)`
... and 7 more

twiecki · 2023-08-24T11:11:48Z

Wow, this is amazing -- much appreciated @jpreszler!

jpreszler · 2023-09-04T20:32:18Z

causalpy/pymc_experiments.py

        percentiles = self.causal_impact.quantile([0.03, 1 - 0.03]).values
-        ci = r"$CI_{94\%}$" + f"[{percentiles[0]:.2f}, {percentiles[1]:.2f}]"
+        ci = "$CI_{94%}$" + f"[{percentiles[0]:.2f}, {percentiles[1]:.2f}]"


I had to remove this \ for linting. With it, the docstring output fails flake8 and doctest issues a deprecation warning for an invalid escape sequence. Is this needed for another reason, such as being able to put these strings into LaTeX documents?
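A small sketch of the warning in question, assuming a recent CPython (the helper name `escape_warnings` is hypothetical). It compiles two one-line sources: one with `\%` in an ordinary string literal and one with the same text in a raw string.

```python
import warnings


def escape_warnings(source: str) -> list[str]:
    """Compile ``source`` and return the names of any warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


plain = 'ci = "$CI_{94\\%}$"'   # source text: ci = "$CI_{94\%}$"
raw = 'ci = r"$CI_{94\\%}$"'    # source text: ci = r"$CI_{94\%}$"

print(escape_warnings(plain))  # a warning about the invalid escape "\%"
print(escape_warnings(raw))    # []
```

The warning category depends on the Python version (a `DeprecationWarning` in older releases, a `SyntaxWarning` in newer ones). When the backslash appears inside a docstring, as with pasted summary output, marking the docstring itself as raw (`r"""..."""`) is one possible alternative to dropping the backslash.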

jpreszler · 2023-09-04T20:32:59Z

causalpy/pymc_experiments.py

        percentiles = self.causal_impact.quantile([0.03, 1 - 0.03]).values
-        ci = r"$CI_{94\%}$" + f"[{percentiles[0]:.2f}, {percentiles[1]:.2f}]"
+        ci = r"$CI_{94%}$" + f"[{percentiles[0]:.2f}, {percentiles[1]:.2f}]"


Same as in the DiD experiment summary above.

jpreszler · 2023-09-04T23:21:05Z

I've cleaned up the example output, added doctest to the CI workflow, and all examples pass locally. I've also added examples for the new IV model and experiment added in #213.

This is ready for a new review @drbenvincent when you have a chance.

drbenvincent · 2023-09-05T13:25:41Z

Hi @jpreszler. Thanks for the updates. I'll try to carve out some time (I now have an 11-day-old son!) to review properly. But in the meantime I triggered the remote checks and it looks like we've got a failure. The test output looks like it could be just a missing import of statsmodels.

jpreszler · 2023-09-05T18:28:03Z

@drbenvincent, it looks like statsmodels wasn't being installed into the remote environment, so I added it to the dependencies. That should fix the test failure.

Congratulations on the baby boy!

jpreszler · 2023-09-05T18:39:14Z

Looks like there's a little instability in the summary output that I'm looking into fixing.

jpreszler · 2023-09-06T20:08:53Z

I should have all the instability addressed, the doctests passed in the remote environment on my fork as well as locally.
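One general way to make numeric doctest output less brittle is to show fewer digits. The sketch below is an illustration only: `total_effect` is a hypothetical function, and ordinary floating-point rounding stands in for sampler noise.

```python
import doctest


def total_effect(effects):
    """Sum a list of per-period effects.

    The full float repr can differ in its last digits, so the example
    formats the result to two decimals instead of printing it raw.

    >>> f"{total_effect([0.1, 0.1, 0.1]):.2f}"
    '0.30'
    """
    return sum(effects)


test = doctest.DocTestParser().get_doctest(
    total_effect.__doc__, {"total_effect": total_effect}, "total_effect", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 1
```

Printing the raw value here would require the docstring to match `0.30000000000000004` exactly; the formatted version only pins down the digits that matter.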

drbenvincent

Great stuff. I did a quick pass and left a few more change requests. It's really coming along.

causalpy/pymc_models.py

causalpy/tests/test_integration_pymc_examples.py

jpreszler · 2023-09-10T20:30:59Z

@drbenvincent Thanks for the helpful comments. I've made the improvements so this is ready for another look when you have time.

drbenvincent

This is really getting there! Not far to go I think.

Sorry for the iterative nature of this, but having had time to stare at this for a while, I've had some thoughts.

I think the docstring examples are great. But perhaps we've overdone it. I feel that one example for each class makes a lot of sense, but examples for accessing properties (like result.idata) and some/all methods (e.g. result.plot()) are overkill. My feeling is that these are relatively self-explanatory and we can perhaps leave it to the example notebooks for users to see more complete worked examples in their full context.

I'm open to counter-arguments, but if you agree then let's just keep one core example per class.

causalpy/data/simulate_data.py

causalpy/skl_experiments.py

causalpy/pymc_models.py

jpreszler · 2023-09-12T18:06:34Z

I think there's definitely a bit of redundancy in the examples, and with doctest that adds a lot of time to running tests.

I've reduced the redundancy and moved all meaningful examples (like summary() calls) to a single example for each class and removed plot and a few other low value examples.

drbenvincent

Great stuff. I just noticed 2 minor issues in pymc_models.py:

Under the LinearRegression, there seems to be something going wrong with the note admonition where it says "Generally, the `.fit()` method should be used rather than calling `.build_model()` directly". This could just be an issue with my local build of the docs, but if it's an issue for you also, let's see if we can fix it.
Under the InstrumentalVariableRegression class, the :code:`priors = {"mus":...` has a line break so it's separate from the `:param priors:` block.

I'm sure there are a bunch of other small errors that we might find or slight improvements, but I'm happy to merge after these minor updates :)

jpreszler · 2023-09-13T23:27:33Z

Those issues were not just local to you. I've fixed them and looked for other problems, but didn't spot much besides the same issue in WeightedSumFitter as in LinearRegression.

The tests have also passed on my fork after some small adjustments.

drbenvincent · 2023-09-14T18:21:56Z

Quick question as I've never used doctest before...

As far as I can tell, doctests are run with pytest --doctest-modules causalpy/. At the moment I can only see these being triggered with the remote tests. Do you think it makes sense to add brief instructions to CONTRIBUTING.md, either under "Pull request checklist" or "Building the documentation locally", to tell people that they should run the doctests locally (and how to do that) before a PR?

jpreszler · 2023-09-14T21:26:11Z

Good call. The same command can run all doctests locally, but I also added a make doctest command to the makefile. I've also added some details to the contributing doc about running all doctests or individual ones. This might be too much for the PR checklist.

This is my first venture into doctests, but the system is pretty straightforward. A good thing to note is that the +NUMBER option that is used by a number of the tests is a pytest extension of doctest. If you use doctest directly (not through pytest) these tests will fail.
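A short sketch of that last point, using only the standard library (the helper name `plain_doctest_accepts` is hypothetical). The standard-library parser rejects directive names it does not know about, so an example tagged with an option that only pytest registers cannot even be parsed by plain doctest:

```python
import doctest


def plain_doctest_accepts(option: str) -> bool:
    """Return True if the standard-library parser accepts ``+option``."""
    example = f">>> 1 / 3  # doctest: +{option}\n0.33\n"
    try:
        doctest.DocTestParser().get_doctest(example, {}, "example", None, 0)
    except ValueError:
        # Raised for option names the doctest module does not know.
        return False
    return True


print(plain_doctest_accepts("ELLIPSIS"))  # True: built into doctest
print(plain_doctest_accepts("NUMBER"))    # False unless something has registered it
```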

drbenvincent · 2023-09-14T21:41:08Z

This is great. When running the doctests I just noticed that it produces new files, ancova_data.csv and regression_discontinuity.csv. We should ideally either not produce those files, or have some clean-up.

jpreszler · 2023-09-14T21:47:47Z

That's from a few of the doctests for the simulated data. I've skipped the lines that write out the CSVs, but left in the example of how to do so.
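A minimal sketch of skipping a file-writing line with the standard `+SKIP` directive. The `write_rows` function and the file name are hypothetical, chosen only to illustrate the pattern:

```python
import doctest
import os


def write_rows(path: str) -> None:
    """Write a tiny CSV file.

    The call is shown for readers but skipped when doctests run, so no
    file is left behind.

    >>> write_rows("example_output.csv")  # doctest: +SKIP
    """
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("x,y\n1,2\n")


test = doctest.DocTestParser().get_doctest(
    write_rows.__doc__, {"write_rows": write_rows}, "write_rows", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed)                         # 0
print(os.path.exists("example_output.csv"))   # False, assuming no such file existed before
```

The example still renders in the documentation, so readers see how to write the file, but the test run has no side effects to clean up.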

drbenvincent

Approving. Thanks for the contribution! The first of many perhaps :)

jpreszler commented Aug 24, 2023

drbenvincent requested changes Aug 24, 2023

drbenvincent requested review from juanitorduz and lucianopaz August 24, 2023 10:24

drbenvincent added the documentation label Aug 24, 2023

jpreszler commented Sep 4, 2023

jpreszler force-pushed the issue_129_docstring_additions branch from 91ccbd9 to ef10948 Compare September 4, 2023 23:13

jpreszler requested a review from drbenvincent September 4, 2023 23:15

Jason Preszler and others added 12 commits September 6, 2023 10:00

Issue 129: increase docstring coverage, now at 86%

6bee429

Examples for pymc experiments and most models

809277e

increased interrogate threshold, just need examples in skl experiements and some pymc models

fd92613

pymc models and experiments done and docs checked

8a9800f

Finished docstrings and checked documentation results

110505a

small fixes related to pr comments

adb94c5

dcotest on pymc_models good except rng caused score differences

5c2870e

doctest good on pymc experiments except for summaries

2220ed0

doctest clean and added to github actions

5d6c8eb

Added IV examples after rebasing

e392335

add statsmodels to dependencies

76d970a

increase draws and decrease precision on summaries

b5e6dc6

jpreszler force-pushed the issue_129_docstring_additions branch from 1f798f9 to b5e6dc6 Compare September 6, 2023 17:01

jpreszler added 2 commits September 6, 2023 10:43

more precision reduction

2359721

remove unneeded ellipsis options

7ea2c11

drbenvincent requested changes Sep 8, 2023

jpreszler added 2 commits September 9, 2023 09:57

fix formula rendering

11aa925

fixed link and removed awkward round call

53f0fd0

jpreszler requested a review from drbenvincent September 10, 2023 20:31

fix minor formatting error

df5ae62

drbenvincent requested changes Sep 11, 2023

remove redundencies, clean up some wording

8cc283e

jpreszler requested a review from drbenvincent September 12, 2023 18:06

drbenvincent requested changes Sep 13, 2023

jpreszler added 2 commits September 13, 2023 15:42

fix test failure and doc anomalies

8a28509

reduce precision on reg. discont.

e2760bc

jpreszler requested a review from drbenvincent September 13, 2023 23:27

add make doctest and instructions to contributing

34d795d

turn down precision on PrePostFit doctest

f37d3b3

skip doctests that write simulated data to csv files

c80d78e

drbenvincent approved these changes Sep 15, 2023

drbenvincent changed the title ~~Issue 129: increase docstring coverage~~ Increase docstring coverage and add doctests Sep 15, 2023

drbenvincent merged commit 234a0cd into pymc-labs:main Sep 15, 2023

This was referenced Sep 15, 2023

Improve math rendering of models in the docs #238

Closed

Improvements to API documentation #62

Closed

drbenvincent mentioned this pull request Oct 18, 2023

increase docstring coverage #129

Closed

2 tasks
