Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MinTrace's protection to Schafer-Strimmer covariance, eliminated stat… #97

Merged
merged 7 commits into from
Oct 28, 2022

Conversation

kdgutier
Copy link
Collaborator

@kdgutier kdgutier commented Oct 28, 2022

Solved eigenvalue indetermination in the presence of small percentage of zero predictions.

  • Added protections for single series with constant values / std=0.
  • Added protections to standardize residuals when std=0.
  • Final epsilon diagonal for improved numerical stability.
  • Zero residuals in MinTrace warning.
  • Added stress test unit test for percentage of zero residuals scenarios in core.ipynb
  • Added single insample value MinTrace-shr covariance stress test in methods.ipynb (ridge diagonal solved this problem).

I took out the statsmodels, the only function we used from it was cov2corr that is particularly simple and does not merit such a big dependency.

@kdgutier kdgutier requested a review from cchallu October 28, 2022 01:18
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@kdgutier kdgutier linked an issue Oct 28, 2022 that may be closed by this pull request
Copy link
Member

@AzulGarza AzulGarza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left some comments. As we are not testing the nbs/examples files automatically, please just check that we are able to recover the previous outputs/replications. I'll work on an automatic test to help future development.

hierarchicalforecast/core.py Outdated Show resolved Hide resolved
hierarchicalforecast/methods.py Show resolved Hide resolved
hierarchicalforecast/methods.py Outdated Show resolved Hide resolved
environment.yml Show resolved Hide resolved
@AzulGarza AzulGarza self-requested a review October 28, 2022 02:31
Copy link
Member

@AzulGarza AzulGarza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if this PR is going close #92, please let's add a test where it is verified that before the changes, the same error arises, and now with these changes, it is solved.

See, for example, this PR.

A user had an error using HierarchicalEvaluation when h=1, then a test was included showing that it works. In particular, these lines,

image

This particular case is a little bit tricky because we don't have the actual data. I'd start trying to reproduce the error with a small example. Just to be sure that these changes are actually solving the problem.

@AzulGarza
Copy link
Member

AzulGarza commented Oct 28, 2022

Also, if this PR is going close #92, please let's add a test where it is verified that before the changes, the same error arises, and now with these changes, it is solved.

See, for example, this PR.

A user had an error using HierarchicalEvaluation when h=1, then a test was included showing that it works. In particular, these lines,

image

This particular case is a little bit tricky because we don't have the actual data. I'd start trying to reproduce the error with a small example. Just to be sure that these changes are actually solving the problem.

Just some thoughts on the test:

An element of the diagonal of the variance-covariance matrix would be zero if a time series has only one value, so to reproduce the error, we could use the objects used for testing purposes in nbs/methods.ipynb as follows,

diff_len_y_insample = S @ y_bottom
diff_len_y_hat_insample = S @ y_hat_bottom_insample
diff_len_y_insample[-1, :-1] = np.nan
diff_len_y_hat_insample[-1, :-1] = np.nan
print(diff_len_y_insample)
cls_min_trace = MinTrace(method='mint_shrink')
cls_min_trace(
    S=S, 
    y_hat=S @ y_hat_bottom, 
    y_insample=diff_len_y_insample,
    y_hat_insample=diff_len_y_hat_insample,
    idx_bottom=idx_bottom
)

The output of the previous code is,

image

This small example recovers the #92 bug. Observe that if we use ols, there is no problem (as expected),

image

The user's data has time series with a single value.

image

@kdgutier kdgutier removed the request for review from cchallu October 28, 2022 11:46
@kdgutier kdgutier merged commit 437e5b1 into main Oct 28, 2022
@kdgutier kdgutier deleted the debug_mint branch October 28, 2022 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Min Trace method="mint_shrink"
2 participants