unicode normalization #559

pwinckles · 2021-08-20T15:16:43Z

I was thinking about Unicode normalization again. I know that the last time this was discussed, perhaps on Slack, normalization was considered out of scope for the spec. However, I had a couple of additional thoughts after seeing that the BagIt spec spends time describing the normalization problem and then recommends that implementations tolerate differences in normalization and warn when files differ only by normalization form.

Perhaps it would make sense for OCFL validators to produce warnings when files or object IDs differ only in how they are normalized?
Should the spec make any similar recommendations, perhaps in the implementation notes, about tolerating differences in normalization forms? Or is this not desirable behavior?
The spec states "Each version block in each prior inventory file MUST represent the same object state as the corresponding version block in the current inventory file." In the case of logical paths, is it up to the implementation to decide whether this is a byte-for-byte comparison or a normalized comparison? (Edit: noting that digest algorithm changes are supported between versions.)
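
To make the questions above concrete, here is a minimal Python sketch of what such a check might look like. It is not taken from the OCFL spec or from any existing validator. The function names `normalization_collisions` and `same_logical_paths` are hypothetical, and the choice of NFC as the comparison form is an assumption (NFD would work equally well for detecting collisions).

```python
import unicodedata
from collections import defaultdict


def normalization_collisions(paths, form="NFC"):
    """Return groups of paths that are distinct strings but become
    identical once normalized to `form` (hypothetical helper)."""
    groups = defaultdict(set)
    for path in paths:
        groups[unicodedata.normalize(form, path)].add(path)
    # Only groups with more than one distinct original path are collisions.
    return [sorted(group) for group in groups.values() if len(group) > 1]


def same_logical_paths(a, b, normalized=False, form="NFC"):
    """Compare two collections of logical paths, either code point for
    code point or after normalization (hypothetical helper)."""
    if normalized:
        a = {unicodedata.normalize(form, p) for p in a}
        b = {unicodedata.normalize(form, p) for p in b}
    return set(a) == set(b)


# The same visible name, composed (NFC) and decomposed (NFD).
nfc_name = "caf\u00e9.txt"    # 'é' as a single code point
nfd_name = "cafe\u0301.txt"   # 'e' followed by a combining acute accent

for group in normalization_collisions([nfc_name, nfd_name, "plain.txt"]):
    print("warning: paths differ only by normalization form:",
          [p.encode("utf-8") for p in group])
```

With these two names, `same_logical_paths([nfc_name], [nfd_name])` is false while the `normalized=True` variant is true, which is exactly the ambiguity in the "same object state" wording: the two comparison strategies disagree about whether the version blocks match.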

rosy1280 · 2022-02-03T16:49:09Z

We think this issue would be best discussed in the implementation notes, and we are deferring it to 2.0 because of the complications involved.

zimeon added the Needs Discussion and OCFL Object labels Nov 3, 2021

pwinckles mentioned this issue Jan 21, 2022

clarify "same object state" of version block (E066) #571

Closed

rosy1280 added the Deferred to V2 label Feb 3, 2022

rosy1280 added this to the 2.0 milestone Feb 3, 2022

zimeon removed the Deferred to V2 label Sep 22, 2023

rosy1280 removed the OCFL Object label Sep 22, 2023
