-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dct identifier to publisher #301
Add dct identifier to publisher #301
Conversation
83a1ddf
to
aefa22c
Compare
ckanext/dcat/converters.py
Outdated
elif isinstance(dcat_publisher, dict) and dcat_publisher.get('name'): | ||
package_dict['extras'].append({'key': 'dcat_publisher_name', 'value': dcat_publisher.get('name')}) | ||
package_dict['extras'].append({'key': 'dcat_publisher_email', 'value': dcat_publisher.get('mbox')}) | ||
if dcat_publisher.get('name'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a double checl for `.get('name') which can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the second one
@@ -66,6 +66,11 @@ dataset_fields: | |||
|
|||
- field_name: type | |||
label: Type | |||
|
|||
- field_name: identifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add it to the dcat_ap_full.yaml file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do
@@ -123,7 +123,7 @@ def _parse_dataset_base(self, dataset_dict, dataset_ref): | |||
|
|||
# Publisher | |||
publisher = self._publisher(dataset_ref, DCT.publisher) | |||
for key in ("uri", "name", "email", "url", "type"): | |||
for key in ("uri", "name", "email", "url", "type", "identifier"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will take care of the DCAT -> CKAN mapping, but you also need to add the other way around, CKAN -> DCAT. This is done a bit further down in the file here and here.
Additionally, this will cover the legacy way of defining publisher fields based on namespaced publisher_*
extra fields, but going forward we want to support scheming objects ("publisher": {"id": xx, "name": yy, "url": zz}
). This is handled in the euro_dcat_ap_scheming
profile here (for CKAN -> DCAT, the opposite should already be handled by the base profile). Do you mind adding support for the new field there as well? And if you extend this test to cover the new field, even better.
@hcvdwerf Let me know if all this makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tnx for the comments. Verry helpful @amercader . Jut commit the changes
64ace0a
to
e0becf6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small fix for failing e2e test.
Co-authored-by: Mark <markusjanse@gmail.com>
Thanks @hcvdwerf ! |
The system previously lacked support for storing and serializing publisher identifiers, such as ROR IDs, ORCIDs, or other unique identifiers for publishers. While other metadata fields like publisher name, email, and URL were being correctly parsed and serialized, the identifier field (e.g., dct:identifier) was not being included in the publisher’s metadata. This omission resulted in incomplete RDF metadata and affected datasets that needed to capture a unique reference to the publisher.