Stray metadata files in zip #21

Open
tfmorris opened this issue Jan 19, 2016 · 2 comments

Comments

@tfmorris
Collaborator

I'm not sure what's going on here, but the zip for 000000037 contains the metadata file 000000218_metadata.xml in addition to the correct 000000037_metadata.xml.

If this error was introduced at the BL, it's something that we'll need to watch out for when processing.
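
For anyone who wants to check other zips for the same problem, here is a minimal sketch. It assumes metadata files are always named `<volume id>_metadata.xml`, as in the example above; the function names `metadata_ids` and `stray_metadata_ids` are made up for illustration and are not part of this repo.

```python
import re
import zipfile

# Assumption: metadata files are named "<digits>_metadata.xml",
# at the top level or inside a subdirectory of the zip.
METADATA_RE = re.compile(r"(?:^|/)(\d+)_metadata\.xml$")


def metadata_ids(zip_file):
    """Return the set of volume IDs that have a metadata file in the zip.

    `zip_file` may be a path or a binary file-like object.
    """
    ids = set()
    with zipfile.ZipFile(zip_file) as zf:
        for name in zf.namelist():
            match = METADATA_RE.search(name)
            if match:
                ids.add(match.group(1))
    return ids


def stray_metadata_ids(zip_file, expected_id):
    """Return metadata IDs found in the zip other than the expected one."""
    return metadata_ids(zip_file) - {expected_id}
```

A non-empty result from `stray_metadata_ids(path, "000000037")` would flag a zip like the one described here.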

@JonathanReeve
Member

Crazy. I'll look into it. While I'm at it, I'll get a bunch more samples and add them to this repo.

@tfmorris
Collaborator Author

Actually, it's not just the metadata file. I didn't notice before, but the ALTO directory has all the pages for volume 000000218 as well. It's basically two entire volumes merged into a single zip file.

We can code for it if it's something that happens regularly, but if there's a 000000218_ zip file that has the right content, the easiest thing would be to just ignore the stray files (which is what I currently do).
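
Ignoring the stray files could look roughly like the sketch below. This is not the code I actually run. It assumes every file belonging to a volume, including the ALTO pages, has a base name starting with `<volume id>_`, which is a guess about the page naming, and `own_members` is a hypothetical helper name.

```python
import posixpath
import zipfile


def own_members(zf, volume_id):
    """List zip members whose base name starts with '<volume_id>_'.

    Assumption: all files for a volume (metadata and ALTO pages) carry
    the volume ID as a filename prefix. Directory entries are skipped.
    """
    prefix = volume_id + "_"
    return [
        name
        for name in zf.namelist()
        if not name.endswith("/")
        and posixpath.basename(name).startswith(prefix)
    ]
```

Processing would then read only the names returned by `own_members(zf, "000000037")` and never touch the 000000218 files.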
