unicode normalization #559

Open
pwinckles opened this issue Aug 20, 2021 · 1 comment

pwinckles commented Aug 20, 2021

I have been thinking about Unicode normalization again. I know that the last time this was discussed (perhaps on Slack), normalization was considered outside the scope of the spec. However, I had a few additional thoughts after seeing that the BagIt spec spends time describing the normalization problem and then recommends that implementations tolerate differences in normalization and warn when files differ only by normalization form.

  1. Perhaps it would make sense for OCFL validators to produce warnings when files or object IDs differ only in how they are normalized?
  2. Should the spec make any similar recommendations, perhaps in the implementation notes, about tolerating differences in normalization forms? Or is this not desirable behavior?
  3. The spec states "Each version block in each prior inventory file MUST represent the same object state as the corresponding version block in the current inventory file." In the case of logical paths, is it up to the implementation to decide whether this is a byte-for-byte comparison or a normalized comparison? (Edit: noting that digest algorithm changes are supported between versions.)
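
To make items 1 and 3 concrete, here is a minimal Python sketch of what such a check might look like. It is only an illustration: the function names `normalization_collisions` and `same_logical_path` are hypothetical, the choice of NFC is an assumption, and nothing here is taken from the OCFL or BagIt specs or from any existing validator.

```python
import unicodedata
from collections import defaultdict


def normalization_collisions(paths, form="NFC"):
    """Return groups of paths that are distinct strings but equal after normalization.

    Hypothetical helper: `form` defaults to NFC purely as an assumption.
    """
    groups = defaultdict(set)
    for path in paths:
        groups[unicodedata.normalize(form, path)].add(path)
    # A group with more than one raw spelling differs by normalization form only.
    return [sorted(group) for group in groups.values() if len(group) > 1]


def same_logical_path(a, b, normalized=False, form="NFC"):
    """Compare two logical paths byte-for-byte, or after normalization if requested."""
    if normalized:
        return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
    return a.encode("utf-8") == b.encode("utf-8")


composed = "caf\u00e9.txt"     # "é" as a single code point
decomposed = "cafe\u0301.txt"  # "e" followed by a combining acute accent

for group in normalization_collisions([composed, decomposed, "readme.txt"]):
    # Print the UTF-8 bytes so the difference is visible in the warning.
    print("warning: paths differ only by normalization form:",
          [p.encode("utf-8") for p in group])
```

With the byte-for-byte comparison the two spellings above are different logical paths; with the normalized comparison they are the same path, which is exactly the ambiguity item 3 asks about.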

rosy1280 commented Feb 3, 2022

We think this issue is best addressed in the implementation notes, and we are deferring it to 2.0 because of the complications involved.

3 participants