Information package validation fails with unconventional file names. #301

Open
carlwilson opened this issue Nov 21, 2024 · 1 comment
@carlwilson

Some of the test files we have encountered have names that cause commons-ip to return a non-zero exit code. Here's a demonstration using the same file under two different names:

❯ sha256sum /home/cfw/Downloads/https+==doi,org=10,5281=zenodo,3736.zip /home/cfw/Downloads/goodname.zip
b6cd5ef9bd1ca9d3afb6604c0e6e9222b8c8fc2e2dcb8b297ac650a0c6650a79  /home/cfw/Downloads/https+==doi,org=10,5281=zenodo,3736.zip
b6cd5ef9bd1ca9d3afb6604c0e6e9222b8c8fc2e2dcb8b297ac650a0c6650a79  /home/cfw/Downloads/goodname.zip

❯ java -jar commons-ip/commons-ip2-cli-2.8.0.jar validate -i /home/cfw/Downloads/https+==doi,org=10,5281=zenodo,3736.zip
E-ARK SIP validation report at '/home/cfw/Projects/eArchiving/validation/eark-rest-services/https+==doi_validation-report_2024-11-21.json'
Cannot invoke "picocli.CommandLine$ParseResult.originalArgs()" because the return value of "picocli.CommandLine.getParseResult()" is null
Try 'commons-ip validate --help' for more information.
If you think you've found a bug, please report the log file located in /home/cfw/Projects/eArchiving/validation/eark-rest-services/commons-ip.log.txt to https://github.com/keeps/commons-ip/issues
❯ echo $?
1

❯ java -jar commons-ip/commons-ip2-cli-2.8.0.jar validate -i /home/cfw/Downloads/goodname.zip
E-ARK SIP validation report at '/home/cfw/Projects/eArchiving/validation/eark-rest-services/goodname.zip_validation-report_2024-11-21.json'
❯ echo $?
0

In both cases the same JSON report is produced (apart from the file name details), but the badly named file causes a non-zero exit status. This in turn breaks the automated task that depends on the exit status.

@carlwilson
Author

I took a closer look, and the general issue is file names containing the comma (,) character. This line splits the -i parameter value using the comma as a delimiter, so sipPaths ends up holding multiple bogus file entries. Here's a quick debug shot:
image.
The first entry gets processed properly, which is why the report gets created, but processing the erroneous org=10 entry causes the crash. Given there's already an assignee, I'll leave the fix to you.
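
For anyone who wants to reproduce the splitting behaviour without a debugger, here is a minimal sketch. It is not the commons-ip code: the class CommaSplitDemo and both method names are hypothetical, and naiveSplit only assumes that the -i value is split on every comma, as described above. The second method illustrates one possible workaround (treat the value as a single path when it exists as given), with the existence check passed in as a predicate so the sketch does not touch the file system. It is not a proposal for the actual fix.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical demo class, not part of commons-ip.
class CommaSplitDemo {

    // Assumed behaviour: the -i value is split on every comma.
    static List<String> naiveSplit(String value) {
        return Arrays.asList(value.split(","));
    }

    // One possible workaround (an assumption, not the maintainers' fix):
    // if the whole value names an existing file, keep it as a single path.
    static List<String> splitUnlessWholePathExists(String value, Predicate<String> exists) {
        if (exists.test(value)) {
            return Collections.singletonList(value);
        }
        return naiveSplit(value);
    }

    public static void main(String[] args) {
        String path = "/home/cfw/Downloads/https+==doi,org=10,5281=zenodo,3736.zip";

        // Prints four fragments, one per line; the second one is "org=10".
        for (String part : naiveSplit(path)) {
            System.out.println(part);
        }

        // Here the predicate pretends the full path exists, so one entry is kept.
        System.out.println(splitUnlessWholePathExists(path, p -> p.equals(path)).size());
    }
}
```

With the file name from this report, the naive split yields /home/cfw/Downloads/https+==doi, org=10, 5281=zenodo and 3736.zip, which matches both the truncated report name (https+==doi_validation-report_...) and the org=10 entry seen in the debugger. In a real CLI the predicate could be backed by something like java.nio.file.Files.exists, though a file name check alone would not help callers who genuinely pass several comma-separated paths.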
