[iso19139:2007] add more generic char/anchor check #954

Merged: geographika merged 2 commits into geopython:master from pvgenuchten:evaluate_anchor_generic on Nov 28, 2024

pvgenuchten
Contributor

Anchor is used in quite a few places in the INSPIRE profile of iso19139:2007

Add xxx_url property to keep backwards compatibility

Adds tests based on a record, derived from nationaalgeoregister.nl

@coveralls

coveralls commented Oct 29, 2024

Coverage Status

coverage: 60.203% (+0.05%) from 60.156%
when pulling a0e2029 on pvgenuchten:evaluate_anchor_generic
into ae98c20 on geopython:master.

@geographika
Contributor

@pvgenuchten - do you have a link to the raw XML used as the basis for your test record?

It may also be good to just add the XML snippet that this pull request is solving (with and without a /gmx:Anchor).

The code looks good to me, but I've no experience with iso19139:2007. Will leave open for comment for a couple of weeks prior to merging.

@pvgenuchten
Contributor Author

pvgenuchten commented Oct 30, 2024

updated method text

raw xml is at
https://www.nationaalgeoregister.nl/geonetwork/srv/metadata/f44dac86-2228-412f-8355-e56446ca9933/formatters/xml

One thing I would like your view on: if a list consists of charstring, anchor, charstring, anchor, would you expect:

{
name: [foo,la,lo,li]
url: [http://la,http://li]
}

or

{
name: [foo,la,lo,li]
url: [None,http://la,None,http://li]
}

@pvgenuchten
Contributor Author

Yes, I can see the point of not shipping such real-life metadata as part of the software. On the other hand, it shows that these patterns are used in the wild…

@geographika
Contributor

I'd expect the name and URL lists to have matching lengths, although I'm not sure about the use of None. An empty string may be better in this case?
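
To illustrate why matching lengths are convenient, here is a minimal sketch using the placeholder values from the question (the variable names are hypothetical and this is not OWSLib code). With aligned lists each name can be paired with its URL by position, and because both None and an empty string are falsy, the same truth test works for either choice of filler.

```python
# Hypothetical values mirroring the charstring, anchor, charstring, anchor case.
names = ["foo", "la", "lo", "li"]
urls_with_none = [None, "http://la", None, "http://li"]

# Aligned lists can be zipped; entries without a URL are skipped.
links = {name: url for name, url in zip(names, urls_with_none) if url}

# Using '' instead of None as the filler gives the same result.
urls_with_empty = [url or "" for url in urls_with_none]
links_from_empty = {name: url for name, url in zip(names, urls_with_empty) if url}
```

With the shorter, unaligned URL list there would be no way to tell which name each URL belongs to.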

@pvgenuchten do you have an example of the above from the XML you linked to that I could test the above question with?
I've tested the branch and new properties with the following Python:

>>> from owslib.iso import *
>>>
>>> md = MD_Metadata(etree.parse('test.xml'))
>>> iden = md.identification[0]
>>> iden.contact[0].organization_url
'http://standaarden.overheid.nl/owms/terms/Ministerie_van_Defensie'
>>> iden.contact[0].name_url
''
>>> iden.contact[0].name
'Ministerie van Defensie, Koninklijke Marine, Dienst der Hydrografie'
>>> iden.otherconstraints
['Geen beperkingen', 'Er zijn geen condities voor toegang en gebruik', 'Geen beperkingen voor publieke toegang']

>>> dist = md.distribution
>>> dist.specification_url
'http://inspire.ec.europa.eu/id/document/tg/hy'
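
Code that has to run against OWSLib versions both with and without the new properties could read them defensively. The following is only a sketch: `url_of` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the contact objects returned by `owslib.iso`, so nothing here depends on the library itself.

```python
from types import SimpleNamespace

# Stand-ins for contact objects (hypothetical; real ones come from owslib.iso).
old_contact = SimpleNamespace(organization="Example organisation")
new_contact = SimpleNamespace(
    organization="Example organisation",
    organization_url="http://standaarden.overheid.nl/owms/terms/Ministerie_van_Defensie",
)


def url_of(obj, prop):
    """Return obj.<prop>_url, or '' when the attribute is missing or None."""
    return getattr(obj, f"{prop}_url", None) or ""


old_url = url_of(old_contact, "organization")  # attribute absent
new_url = url_of(new_contact, "organization")  # attribute present
```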

@pvgenuchten
Contributor Author

pvgenuchten commented Nov 7, 2024

Updated the PR so that the xx and xx_url lists have matching lengths.

combined use is available in

<gmd:otherConstraints>
<gco:CharacterString>Data beschikbaar voor hergebruik volgens de Modellicentie Gratis Hergebruik. Toelichting beschikbaar op https://www.dov.vlaanderen.be/page/gebruiksvoorwaarden-dov-services</gco:CharacterString>
</gmd:otherConstraints>
<gmd:otherConstraints>
<gmx:Anchor xlink:href="https://inspire.ec.europa.eu/metadata-codelist/ConditionsApplyingToAccessAndUse/noConditionsApply">Geen beperkingen</gmx:Anchor>
</gmd:otherConstraints>
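
As a rough illustration of the pattern, the sketch below reads mixed CharacterString and Anchor constraints into two lists of equal length. It uses the standard library's ElementTree rather than OWSLib, so it is not the implementation in this PR. The `text_and_url` helper, the `gmd:MD_LegalConstraints` wrapper element, the shortened free-text value, and the use of '' as the filler for a missing URL are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 records.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Shortened sample; the wrapper element is assumed for this sketch.
SNIPPET = """
<gmd:MD_LegalConstraints
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:gmx="http://www.isotc211.org/2005/gmx"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:otherConstraints>
    <gco:CharacterString>Free text constraint</gco:CharacterString>
  </gmd:otherConstraints>
  <gmd:otherConstraints>
    <gmx:Anchor xlink:href="https://inspire.ec.europa.eu/metadata-codelist/ConditionsApplyingToAccessAndUse/noConditionsApply">Geen beperkingen</gmx:Anchor>
  </gmd:otherConstraints>
</gmd:MD_LegalConstraints>
"""


def text_and_url(parent):
    """Return (text, url) for a CharacterString or Anchor child.

    The url is '' when the value is a plain CharacterString.
    """
    anchor = parent.find("gmx:Anchor", NS)
    if anchor is not None:
        href = anchor.get("{http://www.w3.org/1999/xlink}href", "")
        return (anchor.text or "").strip(), href
    charstring = parent.find("gco:CharacterString", NS)
    if charstring is not None:
        return (charstring.text or "").strip(), ""
    return "", ""


root = ET.fromstring(SNIPPET)
pairs = [text_and_url(el) for el in root.findall("gmd:otherConstraints", NS)]
texts = [text for text, _ in pairs]
urls = [url for _, url in pairs]
```

Both lists have one entry per `gmd:otherConstraints` element, so `texts[i]` and `urls[i]` always describe the same constraint.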

another sample in the wild is this one

https://www.nationaalgeoregister.nl/geonetwork/srv/api/records/1e9cf0ef-f905-46df-b4f8-2bab7a3808bb/formatters/xml

@geographika
Contributor

Tested this branch with the live URL.

from owslib.util import openURL
from owslib.iso import MD_Metadata
from lxml import etree
url = 'https://www.nationaalgeoregister.nl/geonetwork/srv/api/records/1e9cf0ef-f905-46df-b4f8-2bab7a3808bb/formatters/xml'

response = openURL(url)
xml_content = response.read()
xml_tree = etree.fromstring(xml_content)
md = MD_Metadata(xml_tree)


iden = md.identification[0]
print(iden.otherconstraints)
# ['Te gebruiken onder voorwaarden van ESRI licentie', 'Niet Commercieel, Geen Afgeleide Werken, Naamsvermelding verplicht, organisatienaam']

This picks up values from both CharacterString and Anchor.

<gmd:otherConstraints>
<gco:CharacterString>Te gebruiken onder voorwaarden van ESRI licentie</gco:CharacterString>
</gmd:otherConstraints>
...
<gmd:otherConstraints>
<gmx:Anchor xlink:href="http://creativecommons.org/licenses/by-nc-nd/4.0/deed.nl">
Niet Commercieel, Geen Afgeleide Werken, Naamsvermelding verplicht, organisatienaam
</gmx:Anchor>
</gmd:otherConstraints>

The failing CI checks are unrelated. Will merge shortly unless any issues are raised.

@geographika geographika merged commit d5e650d into geopython:master Nov 28, 2024
0 of 3 checks passed