Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make crs bigger for new pars file for jump unit tests #1356

Merged
merged 1 commit into from
Aug 12, 2024

Conversation

braingram
Copy link
Collaborator

@braingram braingram commented Aug 8, 2024

Make CRs bigger than the threshold used for the new pars files.

Fixes the unit test failures seen in:
#1355
due to the new pars file:
https://roman-crds.stsci.edu/browse/roman_wfi_pars-jumpstep_0001.asdf

Regression tests: https://github.com/spacetelescope/RegressionTests/actions/runs/10308257135

show only expected differences due to the new crds context:

E         {'values_changed': {"root['roman']['meta']['ref_file']['crds']['context_used']": {'new_value': 'roman_0063.pmap',
E                                                                                           'old_value': 'roman_0061.pmap'}}}

Checklist

  • added entry in CHANGES.rst under the corresponding subsection
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)
  • ran regression tests, post a link to the Jenkins job below. How to run regression tests on a PR

@braingram braingram changed the title make crs bigger for new pars file make crs bigger for new pars file for jump unit tests Aug 8, 2024
@braingram braingram marked this pull request as ready for review August 8, 2024 19:13
@braingram braingram requested a review from a team as a code owner August 8, 2024 19:13
Copy link

codecov bot commented Aug 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.57%. Comparing base (4086056) to head (0c13448).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1356   +/-   ##
=======================================
  Coverage   78.57%   78.57%           
=======================================
  Files         117      117           
  Lines        7861     7861           
=======================================
  Hits         6177     6177           
  Misses       1684     1684           
Flag Coverage Δ *Carryforward flag
nightly 62.26% <ø> (ø) Carriedforward from 4086056

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@schlafly
Copy link
Collaborator

schlafly commented Aug 8, 2024

Do we think that we if we just add CRDS_CONTEXT to the list of parameters that this will go away? I think we don't want every time a new context comes along for us to have to update all of the regression test files?

@braingram
Copy link
Collaborator Author

braingram commented Aug 8, 2024

Do we think that we if we just add CRDS_CONTEXT to the list of parameters that this will go away? I think we don't want every time a new context comes along for us to have to update all of the regression test files?

Do you mean define CRDS_CONTEXT to a specific context for the regression tests or add crds_context to the list of items ignored for compare_asdf?

For the latter, It does appear that jwst ignores the crds context for fitsdiff comparisons of generated to truth files:
https://github.com/spacetelescope/jwst/blob/4ecb40c57aaf1ae99fd6d8d9be66b7530277cf40/jwst/regtest/conftest.py#L236
That should allow the regtests to pass with this PR. Would you like me to add that to this PR and see (this PR seems like a good test)? EDIT: I added this change in edef4f6 so we can test it out

@braingram
Copy link
Collaborator Author

I have no clue at the moment why one regtest failed with the following 🤷

FAILED romancal/regtest/test_wfi_pipeline.py::test_level2_image_processing_pipeline - AssertionError: 
  Diff report for:
      result file: /runner/_work/_temp/pytest_basetemp/popen-gw3/test_level2_image_processing_p0/r0000101001001001001_01101_0001_WFI01_cal_repoint.asdf
          model type: ImageModel
      truth file: /runner/_work/_temp/pytest_basetemp/popen-gw3/test_level2_image_processing_p0/truth/r0000101001001001001_01101_0001_WFI01_cal_repoint.asdf
          model type: ImageModel
  
  {'values_changed': {"root['roman']['meta']['wcs_fit_results']['<rot>']": {'new_value': -5.9008763525942403e-05,
                                                                            'old_value': -7.897752747117105e-06},
                      "root['roman']['meta']['wcs_fit_results']['center'][0]": {'new_value': -249.45421319021025,
                                                                                'old_value': -250.89413395689374},
                      "root['roman']['meta']['wcs_fit_results']['center'][1]": {'new_value': 109.61122052869052,
                                                                                'old_value': 108.42239876672924},
                      "root['roman']['meta']['wcs_fit_results']['mae']": {'new_value': 0.002684014626406904,
                                                                          'old_value': 0.0028079485287068238},
                      "root['roman']['meta']['wcs_fit_results']['matrix'][0][1]": {'new_value': -1.0298972110582511e-06,
                                                                                   'old_value': -1.3784178894562014e-07},
                      "root['roman']['meta']['wcs_fit_results']['matrix'][1][0]": {'new_value': 1.0298972110582511e-06,
                                                                                   'old_value': 1.3784178894562014e-07},
                      "root['roman']['meta']['wcs_fit_results']['nmatches']": {'new_value': 109,
                                                                               'old_value': 110},
                      "root['roman']['meta']['wcs_fit_results']['proper_rot']": {'new_value': -5.9008763525942403e-05,
                                                                                 'old_value': -7.897752747117105e-06},
                      "root['roman']['meta']['wcs_fit_results']['rmse']": {'new_value': 0.003440309699826702,
                                                                           'old_value': 0.0037193506957389423},
                      "root['roman']['meta']['wcs_fit_results']['rot'][0]": {'new_value': -5.9008763525942403e-05,
                                                                             'old_value': -7.897752747117105e-06},
                      "root['roman']['meta']['wcs_fit_results']['rot'][1]": {'new_value': -5.9008763525942403e-05,
                                                                             'old_value': -7.897752747117105e-06},
                      "root['roman']['meta']['wcs_fit_results']['shift'][0]": {'new_value': -0.008499128453755975,
                                                                               'old_value': -0.00867141855839393},
                      "root['roman']['meta']['wcs_fit_results']['shift'][1]": {'new_value': 0.014101158752967921,
                                                                               'old_value': 0.014000953961758062}}}

@schlafly
Copy link
Collaborator

schlafly commented Aug 9, 2024

This is the one where I need the second okify. The change to tweakwcs defaults led a different star to be included in the match, making for directly different fit results. Sorry, I can okay that tomorrow.

@schlafly
Copy link
Collaborator

schlafly commented Aug 9, 2024

I okified the wcs_fit_results issue. I was suggesting adding CRDS_CONTEXT=roman_0061.pmap, as in this run: https://github.com/spacetelescope/RegressionTests/actions/runs/10319437845

I am just skeptical in general that for the regression tests we can use whatever context is the latest; it seems to me like that guarantees that regtests will fail if they use the new context, and if they don't use the new context then having the latest context isn't buying us anything?

@braingram
Copy link
Collaborator Author

braingram commented Aug 9, 2024

Thanks! I reran the regtests and they all pass with this PR:
https://github.com/spacetelescope/RegressionTests/actions/runs/10308257135
It sounds like the open questions are:

  • should we ignore roman.meta.ref_file.crds.context_used in the compare_asdf diffs? This PR currently ignores it.
  • should the romcancal regtests pin the CRDS_CONTEXT? I think this is worth some discussion. If we keep the ignore for roman.meta.ref_file.crds.context_used we don't need to come to a conclusion on this. Even if we don't ignore it we could re-okify the regtests with the new context.

What do you think is a good forum for discussing pinning the CRDS_CONTEXT for the regtests? I'm happy to put stuff in this PR, a github issue, a jira ticket, etc.

I am just skeptical in general that for the regression tests we can use whatever context is the latest; it seems to me like that guarantees that regtests will fail if they use the new context, and if they don't use the new context then having the latest context isn't buying us anything?

Once things have been established (reference file formats are finalized, etc) I would expect that a new context should only produce expected differences in some data arrays (so if a new dark is added and put in use the regtests that compare the output from the dark currant step will fail until the new results are oked). I believe these are expected differences that benefit users since presumably the new results are more accurate. This does make the regression tests serve at least 2 purposes (and can lead to confusion):

  • does a code change impact the resulting data
  • does a crds context change impact the resulting data

And the second one sort of randomly pops up whenever a new context happens to be published. Is there any other place where a new context is tested (by running the romancal regtests) before it's made "live"? As @nden mentioned this morning, some of this will change when there is a crds ops server for roman but that sounds like it's a ways off.

@braingram braingram changed the title make crs bigger for new pars file for jump unit tests make crs bigger for new pars file for jump unit tests, add roman.meta.ref_file.crds.context_used to list of ignored paths for compare_asdf Aug 9, 2024
@schlafly
Copy link
Collaborator

schlafly commented Aug 9, 2024

I agree with you that the "2 purposes" are the problem. It's good to know that we are sensitive to changes to reference files, but in practice, at least I'm pretty removed from those updates and I'm not sure I want to be sensitive to every time a new reference file goes in. Were we to move in that direction we'd want to coordinate better with CRDS about the deliveries.

I worry that at present it's going to be more like "tests are failing again, let's just update them, maybe CRDS changed," which would not be good.

I could imagine another set of tests (like the Mac tests, or something) where we use the latest context and observe failures but which doesn't drive development. But those would have to live in a separate artifactory directory if we ever wanted to okify them.

@ddavis-stsci , do you have a preference here? I think my sense is that as a temporary expedient let's go ahead and ignore context_used as you do here, but longer term I think I would lean towards our old model (fixing a context for the regression tests) and then changing from that later once we work out a new approach. We can take this up again at next week's romancal tag up.

@ddavis-stsci
Copy link
Collaborator

I would not ignore the CRDS context. We tie each release to a specific context and should test against that.
I assume that by "latest" you mean the operational context? We should always be testing again the reference files that have been deemed official.
If we don't check the crds_context I worry that it is too easy to run tests using one context and the code published but does
not work with the operational context or has unexpected results.

Do the regression tests have a variety of purposes - yes. Should you okify tests because you "think" you have a crds change, No.

If the regression tests fail we need to understand why. If the code or the reference files have changed and the new results are more accurate the running okify is a convenient way to update the files. If we do not understand why the results do not match the truth files the truth files should not be updated just the make them match.

@schlafly
Copy link
Collaborator

schlafly commented Aug 9, 2024

Okay, I take your message as saying that we should find a way to force the regression tests to use a specific context. Right now they are using the latest context, which brings in context changes that lead to regression test failures. I don't know how to force GA to do that in the short term, so I'd propose using this PR of Brett's to ignore the context in the short term, while planning to revert that change after we get the GA to always use a particular context.

@ddavis-stsci
Copy link
Collaborator

I'd suggest that we keep the crs changes and punt on the crds context issue. I think the crds context check needs more discussion.

@braingram braingram changed the title make crs bigger for new pars file for jump unit tests, add roman.meta.ref_file.crds.context_used to list of ignored paths for compare_asdf make crs bigger for new pars file for jump unit tests Aug 12, 2024
@braingram
Copy link
Collaborator Author

braingram commented Aug 12, 2024

I'd suggest that we keep the crs changes and punt on the crds context issue. I think the crds context check needs more discussion.

I dropped the crds context ignore. The regression tests will continue to fail due to the new crds context.

Copy link
Collaborator

@ddavis-stsci ddavis-stsci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ddavis-stsci ddavis-stsci merged commit 5e7e4fd into spacetelescope:main Aug 12, 2024
30 checks passed
@braingram braingram deleted the bigger_crs branch August 12, 2024 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants