representation and linking of Code lists #11
Comments
The best approach will vary. Take the ISO country codes as an example: they form a flat set with a small number of members, each a 2-letter code that a human can recognize relatively easily. Country codes are rarely added or removed; when that does happen it is the only major challenge for digital data exchange, and it is not directly related to how the codes are represented in a JSON-LD payload. So I can't immediately see any significant disadvantage of a simple 2-letter string representation.
For a different example, the UNECE Rec 21 codes for types of cargo seem to conflate multiple cargo properties (package shape, size, material, fragility) into one flat list of completely arbitrary 2-letter codes, which is hard for a human to understand and hard to process in application business logic. I believe it would be much easier to use if it were divided into a few code lists of distinct cargo package attributes, so that cargo data could be represented with JSON(-LD) like this:
{
  "edi3:package": {
    "@type": [ "rec21:BasePackage", "rec21:Flexibag" ],
    "edi3:material": [ "rec21:steel", "rec21:plastic" ],
    "edi3:fragilityClass": "rec21:FG0"
  }
}
The properties I used in the example above may not be the most appropriate; it is only meant to demonstrate that data structured like this is a lot easier to consume, comprehend and implement in application business logic.
Realizing that such an overhaul of the UNECE code lists will not happen soon, the existing ones could probably be helped by adding an HTTP URL for each code list member, e.g. http://unece.org/codelists/rec21#FE or maybe an even more human-readable one, like http://unece.org/codelists/rec21#Case_with_pallet_base_cardboard. Dereferencing this URL in a web browser should result in an HTML page describing this code list member. Dereferencing this URL with http header
{
  "@context": {
    "edi3": "https://edi3.org/vocab#",
    "rec21": "https://unece.org/codelists/rec21#"
  },
  "@graph": [
    {
      "@id": "rec21:Case_with_pallet_base_cardboard",
      "@type": "edi3:UNECERec21Code",
      "rdfs:comment": "Case, with pallet base, cardboard",
      "rdf:value": "EF"
    },
    ...
  ]
} |
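A minimal sketch of how a consumer might use a payload shaped like the one above: it expands compact IRIs with the `@context` prefixes and indexes code list members by their legacy `rdf:value` code. The helper names `expand` and `index_by_code` are hypothetical, and the hand-rolled prefix expansion only stands in for what a real JSON-LD processor would do.

```python
import json

# The example graph from the comment above, with the elided members left out.
DOC = json.loads("""
{
  "@context": {
    "edi3": "https://edi3.org/vocab#",
    "rec21": "https://unece.org/codelists/rec21#"
  },
  "@graph": [
    {
      "@id": "rec21:Case_with_pallet_base_cardboard",
      "@type": "edi3:UNECERec21Code",
      "rdfs:comment": "Case, with pallet base, cardboard",
      "rdf:value": "EF"
    }
  ]
}
""")


def expand(term, context):
    """Expand a compact IRI such as 'rec21:X' using the prefix map in context."""
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return term  # unknown prefix or already absolute: leave untouched


def index_by_code(doc):
    """Map each legacy code (rdf:value) to the full IRI of its code list member."""
    context = doc["@context"]
    return {
        node["rdf:value"]: expand(node["@id"], context)
        for node in doc["@graph"]
    }


codes = index_by_code(DOC)
print(codes["EF"])
```

A legacy system that only knows the 2-letter code can look up the member IRI this way, while new systems exchange the IRI directly.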
Yes, it's true that some of the UN code lists are a confused mish-mash of codes that describe different properties (package codes and status codes are good examples). Fixing that is a job for a governance cycle in a later phase. Some are OK, like the units of measure codes. For now let's focus on how to represent a code list, whether it is semantically good or bad. Some further comments & questions:
Units of Measure
Can we see some examples of how to handle arbitrary properties like "symbol" in the UOM codes, and also hierarchies like those in the WTO tariff codes? |
I would prefer to see an identifier in the data that I can make some sense of without having to consult any additional resource. I'm not sure about the governance of the Rec 21 descriptions, but if there is already a policy to keep the descriptions short (which there seems to be), then each one only needs to be made unique, which could be achieved by appending the actual unique 2-character code to the end of the identifier: rec21:Case_with_pallet_base_cardboard_EF.
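As a rough sketch of that naming rule, assuming the identifier is derived mechanically from the description and the code: the function name `member_id` and the exact slug rule (runs of non-alphanumeric characters collapsed to one underscore) are assumptions for illustration, not an agreed convention.

```python
import re


def member_id(description, code, prefix="rec21"):
    """Build an identifier like 'rec21:Case_with_pallet_base_cardboard_EF'."""
    # Replace every run of non-alphanumeric characters with one underscore,
    # then drop any underscore left at either end.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_")
    # Appending the unique code keeps identifiers distinct even if two
    # descriptions happen to produce the same slug.
    return f"{prefix}:{slug}_{code}"


print(member_id("Case, with pallet base, cardboard", "EF"))
```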
{
  "@id": "rec20:kilogram_per_square_meter",
  "@type": "edi3:NormativeUnit",
  "rdfs:label": "kilogram per square metre",
  "rdfs:comment": "Unit of surface density, areic mass",
  "edi3:uneceRec20Code": "28",
  "edi3:conversionFactor": "kg/m²",
  "edi3:unitSymbol": "kg/m²"
}
Level\Category, which actually indicates normative status, should probably go into the @type: 1 == NormativeUnit, 2 == NormativeEquivalentUnit, 3 == InformativeUnit, all of which are subclasses of edi3:MeasurementUnit.
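The level-to-type mapping can be sketched as a simple lookup. This assumes the level arrives as an integer 1 to 3; `LEVEL_TO_TYPE` and `unit_node` are hypothetical names, and the node carries only a few of the properties from the example above.

```python
# Normative status encoded as the node's @type, per the mapping above.
LEVEL_TO_TYPE = {
    1: "edi3:NormativeUnit",
    2: "edi3:NormativeEquivalentUnit",
    3: "edi3:InformativeUnit",
}


def unit_node(unit_id, code, level, symbol):
    """Build a unit node whose @type reflects its Rec 20 level."""
    if level not in LEVEL_TO_TYPE:
        raise ValueError(f"unknown level: {level!r}")
    return {
        "@id": unit_id,
        "@type": LEVEL_TO_TYPE[level],
        "edi3:uneceRec20Code": code,
        "edi3:unitSymbol": symbol,
    }


node = unit_node("rec20:kilogram_per_square_meter", "28", 1, "kg/m²")
print(node["@type"])
```

Because all three types are described as subclasses of edi3:MeasurementUnit, a consumer that does not care about normative status can still treat every node as a measurement unit.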
The Harmonized System is broken just like UNECE Rec 21, but at a larger scale. It mixes materials, practical applications, size, shape, environmental threats and lots of other attributes into a tree that has such domain-mixing misconceptions at all its levels. It seems pointless to model it directly with a hierarchy of rdfs classes. So I believe we can treat it as a flat list, just like rec20 in the example above. |
I'd like to add one more dimension to this discussion, as it is very relevant in some projects I am dealing with currently:
And for IDs e.g.
The current NDR omits all of them. I understand your argument that those are no longer needed, as a URN is used for identification. From my point of view this makes perfect sense. But it leads to some consequences:
But then the documentation must define this and the UN code lists have to be aligned. The rdf:value would then be reduced to a "historical documentation purpose" and a mapping aid for legacy systems, and the @id would be the relevant part to use. The tricky part starts with the use of non-CEFACT code lists and globally specified IDs. To give you two examples:
Any comments or ideas to solve this? |
The supply chain reference data model is full of properties that have enumerated value domains. Country codes (e.g. "AU" = "Australia") and units of measure (e.g. "KG" = "Kilogram") are examples that we can all relate to, but there are many other critical codes such as Incoterms, LOCODEs, and others. Taken together these are of equal importance semantically to the reference data model itself.
So - two questions:
rdfs:range: "xsd:token"
Could that, or should that, instead reference a URI of the country code list? If so, what would it look like?
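One possible shape, offered only as a sketch: the range names a class whose instances are the code list members, rather than `xsd:token`. The property name `edi3:countryCode`, the class `edi3:CountryCode` and the `https://example.org/...` base IRI are all hypothetical placeholders, not part of any published vocabulary.

```python
# Hypothetical base IRI for a country code list.
COUNTRY_CODES = "https://example.org/codelists/country#"

# Current form: the range is a plain token, so any string is acceptable.
token_range = {"@id": "edi3:countryCode", "rdfs:range": "xsd:token"}

# Alternative form: the range points at the class of code list members.
iri_range = {"@id": "edi3:countryCode", "rdfs:range": {"@id": "edi3:CountryCode"}}

# A one-member stand-in for the published list.
MEMBERS = {
    COUNTRY_CODES + "AU": {"@type": "edi3:CountryCode", "rdfs:label": "Australia"},
}


def in_range(value_iri, prop=iri_range, members=MEMBERS):
    """Return True if value_iri is a known member of the property's range class."""
    member = members.get(value_iri)
    return member is not None and member["@type"] == prop["rdfs:range"]["@id"]


print(in_range(COUNTRY_CODES + "AU"))
print(in_range(COUNTRY_CODES + "ZZ"))
```

With the IRI-valued range, an unknown code fails the membership check instead of passing as an arbitrary token.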