Mapping Information Package Identifiers to File-System Safe Names #748

Open
shsdev opened this issue Oct 21, 2024 · 2 comments

@shsdev
Contributor

shsdev commented Oct 21, 2024

An issue was reported by stakeholders concerning requirement CSIPSTR2:

CSIPSTR2: The Information Package root folder SHOULD be named with the ID or name of the Information Package, that is the value of the package METS.xml's root `<mets>` element's `@OBJID` attribute.

Requiring the folder name to match the attribute value exactly may cause file system interoperability issues, because identifiers often contain characters (for example `:` or `/` in URN- or ARK-style identifiers) that are invalid or problematic in file names on some file systems.

The translation of a package's identifier into a file or folder name needs to conform to the constraints of commonly used file systems, such as NTFS or FAT32 on Windows, or Ext4 or XFS on Linux.

Our recommendation is to use section 3 of Kunze's Pairtree specification as the starting point:

https://www.ietf.org/archive/id/draft-kunze-pairtree-01.txt

As Kunze's Pairtree specification is outdated (expired May 29, 2009), we suggest taking over the relevant section 3, adapting it, and creating a new appendix in the CSIP named “Mapping Object Identifiers to File-System Safe Names for Interoperability”.

CSIPSTR2 would then reference the appendix:

CSIPSTR2: The Information Package root folder SHOULD be named using the ID or name of the Information Package, which is the value of the package `METS.xml`'s root `<mets>` element's `@OBJID` attribute. When creating the folder name, the 'Mapping Object Identifiers to File-System Safe Names for Interoperability' process SHOULD be applied to ensure compatibility with file system naming conventions.
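
To make the proposal concrete, here is a minimal Python sketch of the two-step identifier cleaning as I read section 3 of the Pairtree draft: first hex-escape problematic bytes with a `^` prefix, then swap `/`, `:` and `.` for `=`, `+` and `,`. The function names `id_to_safe_name` and `safe_name_to_id` are hypothetical, and an adapted CSIP appendix might well choose a different character set; this is an illustration, not a reference implementation.

```python
import re

# Step 1 (per the draft, as understood here): hex-escape every UTF-8 byte
# outside visible ASCII (0x21-0x7e) and the characters " * + , < = > ? \ ^ |
_ESCAPE = re.compile(rb'[^\x21-\x7e]|["*+,<=>?\\^|]')
_HEX = re.compile(rb"\^([0-9a-f]{2})")

# Step 2: single-character swaps for characters common in identifiers.
_SWAP = str.maketrans({"/": "=", ":": "+", ".": ","})
_UNSWAP = str.maketrans({"=": "/", "+": ":", ",": "."})


def id_to_safe_name(identifier: str) -> str:
    """Map an identifier (e.g. an @OBJID value) to a file-system safe name."""
    raw = identifier.encode("utf-8")
    escaped = _ESCAPE.sub(lambda m: b"^%02x" % m.group(0)[0], raw)
    return escaped.decode("ascii").translate(_SWAP)


def safe_name_to_id(name: str) -> str:
    """Reverse the mapping: undo the swaps, then decode the ^xx escapes."""
    raw = name.translate(_UNSWAP).encode("ascii")
    decoded = _HEX.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return decoded.decode("utf-8")
```

With this sketch, `id_to_safe_name("ark:/13030/xt12t3")` returns `ark+=13030=xt12t3`, and the mapping is reversible, so the original `@OBJID` value can be recovered from the folder name.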

Apart from this identifier-to-filename mapping, other file system issues could be addressed at the same time. For example, enforcing case-sensitive information package names may cause problems on file systems that are not case-sensitive, such as NTFS (case-preserving but not differentiating) or FAT32 (entirely case-insensitive) on Windows.
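
As a rough illustration of the case-sensitivity concern, the following Python sketch groups package folder names that would collide on a case-insensitive file system. The helper name `case_collisions` is hypothetical, and `str.casefold()` is only an approximation: real file systems apply their own case-mapping rules.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        # casefold() approximates case-insensitive comparison.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, `case_collisions(["IP_001", "ip_001", "IP_002"])` reports `IP_001` and `ip_001` as one colliding group, since both would resolve to the same folder on a case-insensitive file system.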

@karinbredenberg
Contributor

Note that the wording is SHOULD, not MUST.

@shsdev
Contributor Author

shsdev commented Nov 29, 2024

Due to special characters used in many types of identifiers, it may not always be possible to comply with this SHOULD requirement. Although the requirement is not mandatory, it would be reasonable to expect that conformity with the SHOULD requirements can be achieved under standard conditions, such as when using common identification schemes for digital objects.
