
Prediction error for non-fe model #573

Open
greenguy33 wants to merge 48 commits into master

Conversation

greenguy33 (Contributor)
This pull request references issue #566

This is not a finished product, but I took a first pass at implementing the prediction standard error (stdp) within the predict function. As discussed, it is only implemented for models without fixed effects for now. I'm creating this pull request to solicit feedback on the approach. I confirmed that the results match those of statsmodels' get_prediction() function on a test dataset.

Note that CIs are not implemented, but the prediction error is returned. To implement CIs, we need a z-score corresponding to the desired coverage level and then some simple arithmetic (see the sketch below). To keep things simple and avoid adding too many new arguments to the predict() function, I left that out for now.
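For reference, a minimal sketch of the arithmetic involved (illustrative names, not the PR's internals): the standard error of the mean prediction for a row x is sqrt(x' V x), where V is the coefficient variance-covariance matrix, and a CI follows by scaling stdp with the z-score for the desired coverage.

```python
# Minimal sketch, assuming a linear model without fixed effects.
# X_new, vcov, y_hat are illustrative names, not the PR's internals.
import numpy as np
from scipy.stats import norm


def prediction_se(X_new: np.ndarray, vcov: np.ndarray) -> np.ndarray:
    """Standard error of the mean prediction: sqrt(diag(X V X'))."""
    return np.sqrt(np.einsum("ij,jk,ik->i", X_new, vcov, X_new))


def prediction_ci(y_hat: np.ndarray, stdp: np.ndarray, level: float = 0.95):
    """CI = y_hat +/- z * stdp, with z the normal quantile for the coverage level."""
    z = norm.ppf(0.5 + level / 2)
    return y_hat - z * stdp, y_hat + z * stdp
```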

greenguy33 and others added 25 commits June 28, 2024 16:56

rename `self._vcov` to `self.vcov` in the `lpdid` class
the vcov attribute is always available =)
replacing .vcov with ._vcov
.vcov to ._vcov
`uhat` as dense 1D array
s3alfisc (Member) commented Aug 1, 2024

Looks very good so far! 👍😄

I guess at some point we might want to clean up the different concepts of predicted values, i.e. self._Y_hat, self._Y_untransformed, and I think there is also self._Y_hat_link?

I suppose the next steps would be to implement some basic tests (e.g. against statsmodels), to add an example, and maybe update the docs? I am not sure the predict method is even mentioned in quickstart.ipynb 🤔
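A hypothetical sketch of what such a test could check, using statsmodels' get_prediction() (its se_mean attribute) as the reference; the data, model, and names below are made up for illustration, and the pyfixest side is left as a comment because the return shape of predict(..., compute_stdp=True) is still under discussion below.

```python
# Hypothetical test sketch: stdp from this PR should match statsmodels'
# standard error of the mean prediction on the same data and model.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({"X1": rng.normal(size=200), "X2": rng.normal(size=200)})
df["Y"] = 1 + 2 * df["X1"] - 0.5 * df["X2"] + rng.normal(size=200)

X = sm.add_constant(df[["X1", "X2"]])
fit_sm = sm.OLS(df["Y"], X).fit()

# Reference values: statsmodels' standard error of the mean prediction.
se_reference = fit_sm.get_prediction(X).se_mean

# Sanity check against the textbook formula sqrt(diag(X V X')).
V = fit_sm.cov_params().to_numpy()
se_manual = np.sqrt(np.einsum("ij,jk,ik->i", X.to_numpy(), V, X.to_numpy()))
np.testing.assert_allclose(se_reference, se_manual)

# The actual test would then compare se_reference against the stdp values
# returned by pyfixest's predict(..., compute_stdp=True), once the return
# shape (np.ndarray vs. pd.DataFrame) is settled; see the review thread below.
```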

```diff
@@ -1436,6 +1436,7 @@ def predict(
     atol: float = 1e-6,
     btol: float = 1e-6,
     type: str = "link",
+    compute_stdp: bool = False
 ) -> np.ndarray:
```
s3alfisc (Member)

Btw, I think a few unit tests fail because the code assumes that predict returns an np.ndarray; if we change to a pd.DataFrame (which I think makes the most sense), we need to change these.

Alternatively, we could change the return type to Union[np.ndarray, pd.DataFrame] and simply return the numpy array if compute_stdp=False.
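A minimal sketch of how that return-type handling could look, assuming the DataFrame bundles predictions and standard errors (the helper name and column names are hypothetical, not the PR's actual code):

```python
from typing import Optional, Union

import numpy as np
import pandas as pd


def _package_predictions(
    y_hat: np.ndarray,
    stdp: Optional[np.ndarray],
    compute_stdp: bool,
) -> Union[np.ndarray, pd.DataFrame]:
    """Return the bare prediction array unless stdp was requested,
    in which case bundle predictions and standard errors in a DataFrame."""
    if not compute_stdp:
        return y_hat
    return pd.DataFrame({"predictions": y_hat, "stdp": stdp})
```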

greenguy33 (Contributor, Author)

Yep, I didn't touch the tests or docs yet, but happy to do that as a next step!
I don't think predict is mentioned in the docs; when I started using the library I had to go digging for it :) I'll make sure to add that when I have a chance to make the next update.
As a GitHub-related side note, do you know if there is a way to squash commits on my end when I make a pull request from my fork? It feels very dirty to submit a pull request with 25 commits when only 3-4 of them are relevant to the current project.

s3alfisc (Member) commented Aug 1, 2024

> As a GitHub-related side note, do you know if there is a way to squash commits on my end when I make a pull request from my fork? It feels very dirty to submit a pull request with 25 commits when only 3-4 of them are relevant to the current project.

Yes, I can do a squash commit on my end when merging, so many, many commits are no problem at all; I have this all the time :D

> Yep, I didn't touch the tests or docs yet, but happy to do that as a next step!
> I don't think predict is mentioned in the docs; when I started using the library I had to go digging for it :) I'll make sure to add that when I have a chance to make the next update.

Sounds great!

codecov bot commented Aug 31, 2024

Codecov Report

Attention: Patch coverage is 81.13208% with 10 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pyfixest/estimation/feols_.py | 73.68% | 5 Missing ⚠️ |
| pyfixest/estimation/fepois_.py | 0.00% | 5 Missing ⚠️ |

| Files with missing lines | Coverage | Δ |
|---|---|---|
| pyfixest/did/did2s.py | 90.00% <100.00%> | (+1.11%) ⬆️ |
| tests/test_predict_resid_fixef.py | 92.80% <100.00%> | (+0.49%) ⬆️ |
| tests/test_stdp.py | 100.00% <100.00%> | (ø) |
| pyfixest/estimation/feols_.py | 86.95% <73.68%> | (ø) |
| pyfixest/estimation/fepois_.py | 84.81% <0.00%> | (ø) |

... and 24 files with indirect coverage changes


s3alfisc (Member)

pre-commit.ci autofix

s3alfisc (Member)

Hi Hayden (@greenguy33), I am finally getting around to picking this up - would it be OK if I took the newly added functions and opened a new PR?

s3alfisc mentioned this pull request on Dec 16, 2024.
greenguy33 (Contributor, Author)

> Hi Hayden (@greenguy33), I am finally getting around to picking this up - would it be OK if I took the newly added functions and opened a new PR?

@s3alfisc Absolutely, and sorry for dropping off the face of the earth after submitting this. I've been absolutely swamped with my grad work since then.

s3alfisc (Member)

Cool! And no worries at all, I've been there myself... hope all your papers will progress smoothly!
