conflicting namespace prefixes during ListRecords #36

Open
bertsky opened this issue Dec 22, 2023 · 2 comments


bertsky commented Dec 22, 2023

If, during a harvest, the same namespace prefix is bound to different namespace URIs in different records, metha-sync jumbles the prefix: it appends a 1 to it but never declares the renamed prefix, so the resulting XML files are no longer well-formed.

For example, if I do

metha-sync -format mets -set 17th-century-prints http://digital.slub-dresden.de/oai/

then I end up with altered METS files that are no longer well-formed. The cause is that the namespace for our MODS extension slub was changed some time ago, so the prefix is now declared as http://www.slub-dresden.de/namespace in some records but as http://www.slub-dresden.de/ in others. For example, in oai:de:slub-dresden:db:id-1840307358, instead of…

               <mods:extension>
                  <slub:slub>
                     <slub:id type="digital">1840307358</slub:id>
                     <slub:id type="source">113051157X</slub:id>
                     <slub:id type="tsl-ats">Mercgeovg</slub:id>
                  </slub:slub>
               </mods:extension>
               <mods:recordInfo>
                  <mods:recordIdentifier source="http://digital.slub-dresden.de/oai/">oai:de:slub-dresden:db:id-1840307358</mods:recordIdentifier>
               </mods:recordInfo>

…(which is what you get for a single GetRecord request) I now see…

               <mods:extension>
                  <slub1:slub>
                     <slub1:id type="digital">1840307358</slub1:id>
                     <slub1:id type="source">113051157X</slub1:id>
                     <slub1:id type="tsl-ats">Mercgeovg</slub1:id>
                  </slub1:slub>
               </mods:extension>
               <mods:recordInfo>
                  <mods:recordIdentifier source="http://digital.slub-dresden.de/oai/">oai:de:slub-dresden:db:id-1840307358</mods:recordIdentifier>
               </mods:recordInfo>

…(which is invalid, because the prefix slub1 is never declared).

Owner

miku commented Dec 27, 2023

Thanks for the detailed bug report. That's certainly an interesting issue and I'll try to take a look at it shortly. It may also be an issue in the stdlib, as per golang/go#48641.


miku commented Apr 8, 2024

I'm afraid this is a Go stdlib XML issue first, cf. golang/go#13400.

But then, metha is mostly concerned with the envelope, and that should be much less problematic. Just as a heads-up: this will require some internal rewriting and may take a while before it is released.
