Skip checksum handling if no checksum check is possible #565

Merged
jordanpadams merged 2 commits into main from handle_checksum
Dec 2, 2022

Conversation

jordanpadams
Member

If we know there is neither a checksum manifest nor a checksum in the label, we don't need to handle any checksums.

Previously, execution would continue through this method and generate a checksum anyway, significantly slowing down execution (especially for very large files).

With debugging enabled, the output includes the generation of a checksum for the file:

[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240
[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240
[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240
[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240
[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240
[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240
[main] DEBUG gov.nasa.pds.tools.util.MD5Checksum - createChecksum:url,bytesRead file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img,10240

followed by a note that we don't do anything with it:

[main] DEBUG gov.nasa.pds.tools.validate.rule.pds4.FileReferenceValidationRule - handleChecksum:No checksum to compare against in the product label for 'file:/Users/jpadams/test/geo_issue_20221201/s_07171001_sim.img'

The new functionality skips this needless generation of a checksum for the file.
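
For readers who want to see the shape of the change, here is a minimal, self-contained sketch of the early exit described above. It is not the code from this PR: the class `ChecksumSkipSketch`, the method signatures, the `digestCalls` counter, and the use of a `Map` keyed by file path as the manifest are all hypothetical names chosen for illustration. The 10240-byte buffer simply mirrors the `bytesRead` value in the debug output.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not the validate tool's actual classes.
class ChecksumSkipSketch {

    // Counts digest computations so the effect of the early exit is observable.
    static int digestCalls = 0;

    // Reads the whole file in 10240-byte chunks; cost grows with file size.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        digestCalls++;
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[10240];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns Optional.empty() when there is nothing to compare against,
    // otherwise whether the generated checksum matches every supplied value.
    // Assumption: a manifest with no entry for this file counts as "no manifest".
    static Optional<Boolean> handleChecksum(Path file,
                                            Map<String, String> manifest,
                                            String labelChecksum)
            throws IOException, NoSuchAlgorithmException {
        String manifestChecksum = (manifest == null) ? null : manifest.get(file.toString());

        // Early exit: no manifest entry and no label checksum, so skip the digest.
        if (manifestChecksum == null && labelChecksum == null) {
            return Optional.empty();
        }

        String actual = md5Hex(file);
        boolean matches =
                (manifestChecksum == null || manifestChecksum.equalsIgnoreCase(actual))
                        && (labelChecksum == null || labelChecksum.equalsIgnoreCase(actual));
        return Optional.of(matches);
    }
}
```

In this sketch the guard sits before any file I/O, so a product with no checksum in the label and no manifest entry never pays the cost of reading the data file.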

If we know there is no checksum manifest or a checksum in the label, we don't need to handle
any checksums.

Previous functionality would continue through this method and generate a checksum, significantly slowing
down execution (especially for very large files).
conditional matches statement in message
Contributor

@al-niessner al-niessner left a comment


My only question: is the checksum required in the PDS4 product? If so, this should fail noisily.

Member

@nutjob4life nutjob4life left a comment


👍

@jordanpadams
Member Author

> My only question: is the checksum required in the PDS4 product? If so, this should fail noisily.

@al-niessner It is not required in the label. If it were, schema validation would catch its absence and raise an error.

Eventually, we plan to manage checksums via the registry and use that for integrity checking, rather than putting this information in the metadata labels.

@jordanpadams jordanpadams merged commit f49feeb into main Dec 2, 2022
@jordanpadams jordanpadams deleted the handle_checksum branch December 2, 2022 17:53
3 participants