Attribution does not cover most MARC relators and other roles #1521
Comments
@janvoskuil I do not see that prov:Attribution "implies by definition that the attributee was directly involved in the activity that generated the entity and bore some responsibility for that activity".
@makxdekkers In an attribution relation wasAttributedTo(id; e, ag, attrs), ag is "the identifier (ag) of the agent whom the entity is ascribed to, and therefore bears some responsibility for its existence" (PROV-DM). A dedicatee (MARC relator) cannot be said to be an agent to whom the entity was attributed. Same for "collector" and many, many other MARC relators. Or take 'stakeholder' (CI Role).
@janvoskuil I see your point, but there is some subjectivity there. For example, you state that "associated with -- or to as the definition says -- means direct involvement". I do not fully agree with that implication. The PROV definition keeps the activity at arm's length from the agent: there was some activity, but we may not know what it was, and it is associated with the agent in some way that we may not know or care about.
Some subjectivity is involved, true, but necessarily so, because PROV is not very much formalized. Wiggle room exists. We do know, though, that the implicit activity in an attribution was an activity that generated the entity and that was associated with (or to) the agent. The activity is, indeed, kept at arm's length because it is unknown or irrelevant: agreed. But the implication is clear: an attributee is a creator because they were involved as agents in (and hence responsible for) generating the entity. No example of attribution in the PROV documents or the PROV book involves a non-creator role.

A funder of a research paper or dataset is not responsible for the existence of the paper or the dataset: research is generally seen as being independent of funders. Collector is defined by MARC as in [1]: explicitly not a creator. That funders and collectors have clear roles and responsibilities is not the point: they are not involved in the activity that generated the entity. In all fairness, many MARC relators are neither attributions nor involvements: location, or judge, for instance.

However, many non-attribution roles are useful to include in metadata. DCAT resources include 'cataloged resources' other than datasets. In our case, we use DCAT resources (among other resources) to describe millions of documents from thousands of sources (per year). We need to create a concept scheme for roles. This will include roles like 'addressee', 'receiver', 'approver', 'commentator', 'stakeholder'.

A broader problem with attribution is that it implies agent involvement. The resource prov:Agent is a role: you are only a prov:Agent with respect to the activities that you are responsible for. Dedicatees, receivers and stakeholders are not agents with respect to an activity that is identical to, or directly or indirectly connected to, the generation activity. See the article referenced above for a more detailed analysis.
[1] "A curator who brings together items from various sources that are then arranged, described, and cataloged as a collection. A collector is neither the creator of the material nor a person to whom manuscripts in the collection may have been addressed" (url)
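The implicit generation activity discussed in this comment can be made concrete. The following is a minimal sketch, in Python with plain tuples rather than an RDF library, of the expansion that the PROV-DM definition of attribution describes (and that PROV-CONSTRAINTS states as its attribution inference): an attribution implies some unspecified activity that generated the entity and was associated with the agent. The function name `expand_attribution`, the placeholder label `_:a`, and the sample identifiers are illustrative assumptions, not part of PROV or DCAT.

```python
def expand_attribution(entity, agent, activity="_:a"):
    """Return the statements implied by wasAttributedTo(entity, agent).

    `activity` stands for the unspecified (existentially quantified)
    activity; the default label "_:a" is only a placeholder.
    """
    return [
        ("wasGeneratedBy", entity, activity),    # the activity generated the entity
        ("wasAssociatedWith", activity, agent),  # the agent took part in that activity
    ]


# Hypothetical identifiers, for illustration only.
for statement in expand_attribution("ex:dataset1", "ex:alice"):
    print(statement)
```

On this reading, attributing a resource to a dedicatee or addressee would also assert that they were associated with the activity that generated it, which is the point under dispute in this thread.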
@janvoskuil I really disagree with your statement: "That funders and collectors have clear roles and responsibilities is not the point: they are not involved in the activity that generated the entity". In my opinion, without the funders and collectors the dataset might not even exist. They may not be 'creators' of the material, but their role, the way I see it, is certainly important enough to get attribution. But again, it's quite subjective.
@lucmoreau or @pgroth can probably enlighten us. Is it abusing
PROV is a very high level model. It recognises the following upper classes
These have the following standard relationships between them
That's essentially it. There are some elaborations, mostly involving association classes, but those terms are intended to encompass all the potential detailed relationships involving those pairs of classes. In particular,
I think the most important point here is the distinction between agent, which PROV defines as a (non-rigid) role concept, and entity. Some roles in CI Roles, MARC and other role taxonomies qualify as agent roles (creator, collector, funder), whereas others qualify as usages (addressee, recipient, dedicatee). In some cases, it can go either way, depending on the specifics of each individual case (stakeholder).

I totally agree with Makx's point that people responsible for assigning metadata should not have to understand all these finer distinctions. They shouldn't have to choose, because in practice, different people will make different choices in similar situations, leading to a loss of generality.

Therefore, to accommodate the incorporation of said role taxonomies in a metadata scheme based on DCAT and PROV, it would be useful to be able to use a generic relation that expressly abstracts away from all these and related distinctions. This will be a relation between arbitrary entities and entities that can bear responsibility, i.e., instances of prov:Entity and of prov:Agent that are also instances of foaf:Agent. Or, simply put, a relation between instances of prov:Entity and instances of foaf:Agent. Making it necessary to choose different relations for different role concepts in MARC etc. introduces semantic redundancy and is a source of confusion.
@janvoskuil Would
Yes, I think so, but it would need to be part of an L2-relation because we have the entity, the foaf:Agent (prov:agent or 'volitional entity'), and the concept. dcat:Relationship (with dct:relation) could be generalized. There is a strong intuitive difference, though, between "translation of" and "version of" on the one hand (which are typical dcat:Relationships, between entities and 'non-volitional entities'), and relations like "recipient" and "stakeholder" on the other.

Perhaps we could use dct:relation in the context of dcat:Relationship to cover both cases, and let the concept do its work. Users could use different concept schemes or a concept hierarchy to indicate the difference more clearly.

Alternatively, or in addition, there could be two subproperties, say, 'relatedObject' for "translationOf" and 'relatedAgent' for "addressee" (with 'agent' in the foaf sense). This would perhaps be the most elegant solution. It would generalize the definition of dcat:Relationship from "An association class for attaching additional information to a relationship between DCAT Resources" to something like "... a relationship between a DCAT resource and another resource, such as a foaf:Agent or another DCAT resource". Also, the superclass would then be prov:Influence rather than "Entity influence".
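A minimal sketch of the two-subproperty idea, in Python with a plain data class rather than RDF. The property names `relatedObject` and `relatedAgent` are the tentative names from the comment above; `dcat:qualifiedRelation` and `dcat:hadRole` are existing DCAT terms. The class name `QualifiedRelation`, the `is_agent` flag, the blank-node labels and the sample identifiers are hypothetical and exist in neither DCAT nor PROV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedRelation:
    source: str     # the DCAT resource being described
    target: str     # another resource, or an agent in the foaf sense
    role: str       # concept from a role scheme, e.g. "translationOf"
    is_agent: bool  # True when the target is a foaf:Agent

    def property_name(self) -> str:
        # In the proposal both names would specialise dct:relation.
        return "relatedAgent" if self.is_agent else "relatedObject"

    def triples(self):
        node = f"_:rel_{self.role}"  # placeholder blank-node label
        return [
            (self.source, "dcat:qualifiedRelation", node),
            (node, self.property_name(), self.target),
            (node, "dcat:hadRole", self.role),
        ]


# Hypothetical examples: one agent relation, one resource relation.
letter = QualifiedRelation("ex:letter", "ex:carol", "addressee", True)
book = QualifiedRelation("ex:book-nl", "ex:book-en", "translationOf", False)
for triple in letter.triples() + book.triples():
    print(triple)
```

The metadata editor only supplies the role concept; whether the target is an agent or another resource decides which subproperty is used, so no judgement about responsibility is needed.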
@kcoyle As far as I understand RDF modelling, the moment you describe a person with RDF assertions, the person 'becomes' an RDF resource. So in my mind, you should be able to assert
I tried to follow the discussion here, but I somehow lost track. My apologies upfront for the extensive comment.

Note that dct:relation is already part of DCAT (https://w3c.github.io/dxwg/dcat/#Property:resource_relation) as a generic relationship between two Catalog Resources. That is the mapping choice that has been made. If we are going to apply dct:relation to the second case as well, then what is the difference between them? I would even drop dct:relation altogether, because it then becomes the universal property in DCAT that relates anything to anything. So what is the added value of adding this to DCAT?

Changing the semantics of the terms "qualifiedRelation" and "qualifiedAttribution" because the mapping to PROV-O is incorrect is not a good idea: in that case one should introduce a new property dcat:qualifiedAttribution.

The next paragraphs are a longer exposé trying to understand the issue in my own words.

Then on the argument that the reuse of PROV-O in the DCAT context is incorrect or inappropriate. I do, however, believe that in the main target usage area of DCAT, woodcutter is not a role people have in mind.

One remark in the whole thread also struck me: you make an explicit distinction between foaf:Agent and prov:Agent as disjoint classes. I never made that distinction, and never felt the need for it. Maybe that is also an indication of the issue that you are raising: in the core DCAT usage context most implementers would not make (or like to make) a distinction. What if DCAT assumed these are synonyms? What would be the impact?

An additional thought in this area: this discussion seems to forget that the Semantic Web underlies DCAT. This means that one can extend the specification at will, but that the extensions might not be understood by others. For me there is no difference between having marc:woodcutter as a subproperty of dct:relation or marc:woodcutter as a role in qualifiedAttribution. Neither is understood by DCAT processors.
The above actually means that qualifiedRelation and qualifiedAttribution, as a generic pattern, are built into the RDF Semantic Web language. My point is that the above PROV-O analysis might be a non-issue, as the usage of qualifiedRelation and qualifiedAttribution is not required to achieve the objectives of extensibility and semantic clarity of relationships between entities. The Semantic Web already offers that.
@makxdekkers are we discussing In my extensive comment I already mentioned it. The DCAT community has introduced for the semantic notion of relating 2 resources the property Any broader usage allowed by Note that this reasoning holds for every DCAT term that is mapped on a URI that is not within the DCAT domain. That is the nature of reuse. So while I agree that dct:relation could be used to relate a DCAT Resource with an Agent, it has been chosen to use But we have to be modest and realise that whatever choice is made here will not make the life of an implementer easier. It will be up to the implementers to make sense of it, as any modelling choice, even guided by the current DCAT, is very open. In this area, DCAT could also decide not to interfere and only provide the guideline: this is a profile matter. The usage note almost states it.
Let me recapitulate.
The problem
The original problem is that DCAT offers no way to express relations between resources and agents (in the sense of volitional things such as persons and organizations) when the agent bears no responsibility for the existence of the resource.
Independent problems in the background
In the background of this original problem there are many (indirectly) related problems that can and should be put aside to keep the discussion focussed. One problem is that Another problem in the background is where to draw the line exactly between being and not being responsible for the existence of an entity. Personally, I would include painter but not collector (of a painting), but then there are others who argue that inclusion of collectorship is totally in the spirit of
Proposed solution 1
The best solution seems to define a new L2 DCAT relation that covers all and only resource-agent relations, called Then, This solution is elegant because it does not change existing notions and has minimal impact. A clarifying note can explain very briefly and clearly why
Proposed solution 2
Another solution would be to generalize the notion of As in Proposed solution 1,
Option 1
Option 2
Given this complication, I think that I would prefer Proposed solution 1.
'Party' is sometimes used as a superclass of Person and Organization. However, it will be hard to get the nomenclature in PROV changed.
thanks for the summary.
I am not going to argue about the correctness of the above analysis, "Link to an Agent having some form of responsibility for the resource" to "Link to an Agent". Removing the additional restrictive scoping of responsibility. Then the DCAT community should answer the question: "what was the motivation to solely include responsibility roles?".
My assessment of the options:
option 1) I object to just changing the mapping for qualified attribution as suggested in option 1, as it would coincide with the semantics of https://w3c.github.io/dxwg/dcat/#Property:resource_relation
We should not introduce ambiguity in the RDF representation (two expectations for the same URI) in the specification. If with option 1 you mean
Then option 1 is possible and coherent for the DCAT spec, but it is a substantial rewrite of the specification.
option 2)
For me there are also the following options:
option 3) As I tried to explain: the semantic value of these relationships at the DCAT level is very, very limited.
option 4)
and the answer to your use case is: this is part of the profile you are building. DCAT could make this explicit in a usage note and state that DCAT only focuses on roles with responsibility aspects. option 5)
The definition of prov:Attribution is: "Attribution is the ascribing of an entity to an agent. When an entity e is attributed to agent ag, entity e was generated by some unspecified activity that in turn was associated to agent ag. Thus, this relation is useful when the activity is not known, or irrelevant." So if you very permissively read "generated by" and "unspecified activity"
conclusion
From the five options my preference goes to option 3), because it removes this abstract discussion from DCAT, as it is a profile matter for me anyhow, or option 5), a minimal-impact change.
Thanks @bertvannuffelen for your thoughts.
To be precise, what I propose is that DCAT includes vocabulary to express resource-agent relations in which the agent does not bear responsibility for (an implicit or explicit activity that led to) the existence of the resource. Concretely, this would include relations like dedicatee, addressee and recipient.
As for the question "what was the motivation to solely include responsibility roles?", I think the answer is oversight. The DCAT documentation advertises
I agree that application profiles remain important in all cases. However, this is not about adding detail and restrictions to generic DCAT vocabulary, it is about filling a gap. Moreover, it is a gap that can easily lead people to misuse
As for @dr-shorthair's suggestion to use the term party: I like this suggestion but I have second thoughts because 'party' has, in English, the exact same ambiguity as 'agent': it can be used as a relational term ("A and B are parties to this contract"). A term like 'volitional being' would be less prone to relational readings. I think that, in the end, it is best to stick to
Project/Milestone modified. Explanation: As DCAT v3 moves through review and hopefully ratification, we want to make sure that open issues and feedback that have yet to be completely addressed are properly recorded and tagged/assigned in GitHub, both to clarify their status and to help review and prioritise them as a source of improvements and new requirements in future DCAT versions.
Status:
Identifier:
Creator: janvoskuil
Deliverable(s): DCAT 2, DCAT 3 specification
Stakeholders
data publisher
Problem statement
Paragraph 13.1 of DCAT 2 and DCAT 3 states that MARC relators like 'funder', 'custodian' etc can be expressed using prov:Attribution. This is not correct, since prov:Attribution implies by definition that the attributee was directly involved in the activity that generated the entity and bore some responsibility for that activity.
Existing approaches
In one of my assignments, we used an alternative approach by defining x:Involvement, x:involved and x:qualifiedInvolvement. See the link for a detailed discussion.
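As a rough illustration of what such an approach could look like, the sketch below builds the triples for one involvement in Python, mirroring the PROV qualification pattern (a direct property plus a qualified node that carries the role). Only the names x:Involvement, x:involved and x:qualifiedInvolvement come from the text above; the direction of x:involved, the property `x:agent`, the reuse of `dcat:hadRole`, the blank-node label and the sample identifiers are assumptions made for this example and may differ from the linked article.

```python
def involvement_triples(resource, agent, role, node="_:inv"):
    """Build triples linking a resource to an agent that plays some role."""
    return [
        (resource, "x:involved", agent),             # direct, unqualified link (direction assumed)
        (resource, "x:qualifiedInvolvement", node),  # link to the qualifying node
        (node, "rdf:type", "x:Involvement"),
        (node, "x:agent", agent),                    # hypothetical property name
        (node, "dcat:hadRole", role),                # role concept, e.g. from a MARC-based scheme
    ]


def to_lines(triples):
    """Render triples one per line in a Turtle-like form (prefixes not declared)."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical identifiers, for illustration only.
print(to_lines(involvement_triples("ex:letter", "ex:carol", "ex:dedicatee")))
```

Unlike an attribution, nothing in this structure says that the agent took part in generating the resource.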
Links
https://www.linkedin.com/pulse/modelling-marc-relators-using-prov-how-fix-dcat-2-jan-voskuil
Requirements
Supply vocabulary to enable data publishers to incorporate MARC relators and other roles in metadata structures in a unified and simple way
Related use cases
Comments