Possible improvements for template and converter #124
Comments
Here is a draft for a new xlsx template structure that includes all the changes proposed above and below.
Note that the help sheet was not yet updated. Sheets Help, Version, and About will be removed from the final template. Previous versions:
|
The proposed new template (2nd draft) cannot express that a skos:Collection may have not only concepts but also other collections as skos:member. Note that the 0.4.3 template could not express collection_A memberOf collection_B either. |
I like esp. the first three items on the list, @dalito :) |
Hi @dalito, regarding the collection item in the checklist above:
|
Like the green column Q of the example file (2nd draft)? There would be one column per collection in concept sheet. I only added a single column to show the idea. |
Collection in collection would be modeled in the collections sheet just like narrower is modeled for concepts in the concept sheet. This is not yet in the 2nd draft IIRC. |
yes, @dalito, and then in the cells the user just adds an "X" (small or capital should be allowed) - boolean would be nicer, but most non-programmers are not so familiar with this concept of True and False ;) |
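To make the "X" idea concrete, here is a minimal sketch of how a membership cell could be interpreted. The function name `is_member` and the decision to reject any other cell content are assumptions for illustration, not the behavior of voc4cat-tool.

```python
def is_member(cell):
    """Interpret a collection-membership cell from the concept sheet.

    Assumed convention: "x" or "X" marks membership, an empty cell means
    "not a member", and anything else is rejected so typos do not pass silently.
    """
    if cell is None:
        return False
    text = str(cell).strip()
    if text == "":
        return False
    if text.lower() == "x":
        return True
    raise ValueError(f"Unexpected membership marker: {cell!r}")


if __name__ == "__main__":
    for value in ["x", " X ", "", None]:
        print(repr(value), "->", is_member(value))
```

Rejecting unknown markers instead of treating them as "not a member" is a design choice that would surface data-entry mistakes early.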
Hi @dalito, if possible, I would omit the Concept IRI from the Concept tab completely. The Concept IRIs with the right padding could be generated automatically by the CI pipeline. As numbering for the IRIs one could then just use the line numbers of the Excel sheet. That would simplify the sheet. |
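A rough sketch of what generating padded IRIs from line numbers could look like. The base IRI `https://example.org/vocab_`, the function name `iri_from_row`, and the padding width of 7 digits are all hypothetical and only serve the illustration.

```python
def iri_from_row(base_iri, row_number, width=7):
    """Build a concept IRI from a sheet row number with zero padding.

    base_iri and width are assumptions for this sketch; a real vocabulary
    would take them from its own configuration.
    """
    if row_number < 1:
        raise ValueError("row_number must be a positive integer")
    return f"{base_iri}{row_number:0{width}d}"


if __name__ == "__main__":
    print(iri_from_row("https://example.org/vocab_", 12))
```

One consequence of this approach is that an IRI derived from a row number changes whenever rows are inserted, deleted, or reordered.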
This easily breaks if a label is changed in one place but forgotten in another. In the past there were many problems with misspellings, case, white space or separator use. IDs are the solution to this. It is possible to use indentation for expressing broader/narrower hierarchy between concepts. This requires a local install of voc4cat-tool. I would suggest installing pipx and then using pipx to install voc4cat-tool with |
@markdoerr In the childrenIRI field we could perhaps append the preferred label after each IRI. The label would just be present for convenience but would be stripped off when reading.
|
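A minimal sketch of how labels could be stripped when reading such a cell. The entry format `IRI (label)` with comma-separated entries is an assumption made for this example, as is the function name `parse_iri_cell`; the thread does not fix a syntax.

```python
import re

# Assumed entry format: "<IRI>" or "<IRI> (<label>)", entries separated by commas.
# The label may itself contain commas because it is enclosed in parentheses.
ENTRY = re.compile(r"\s*([^\s,()]+)\s*(?:\(([^()]*)\))?\s*(?:,|$)")


def parse_iri_cell(cell):
    """Return only the IRIs from a cell, dropping any convenience labels."""
    cell = cell.strip()
    iris = []
    pos = 0
    while pos < len(cell):
        match = ENTRY.match(cell, pos)
        if match is None:
            raise ValueError(f"Cannot parse cell content at position {pos}: {cell!r}")
        iris.append(match.group(1))
        pos = match.end()
    return iris


if __name__ == "__main__":
    print(parse_iri_cell("ex:0001 (acid, strong), ex:0002,ex:0003 (base)"))
```

Because the label is discarded on reading, a stale or misspelled label in the sheet would not change the resulting vocabulary.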
I updated the first message and put a new (3rd) draft for the template "1.0" to the 2nd message which addresses all issue/ideas that came up until now. |
Thanks @dalito, |
Julia @schumannj proposed in nfdi4cat/voc4cat#113 to replace ChildrenIRI by ParentIRI. For vocabularies where most concepts have parents (like in voc4cat) this makes a lot of sense. But flat concept schemes, in which most concepts are top-level concepts and only a few narrower concepts exist, are better expressed with ChildrenIRIs as it is now. So should it be configurable at vocabulary level to use either one or the other? Is the added complexity justified? |
I personally prefer the parent relation for building hierarchies, because each child can have at most one parent (which is simpler than parents having multiple children, like in real life ;) ). For flat hierarchies, the difference is of course not big, as @dalito pointed out. But will we stay "flat" in the future? Since one never knows, I would opt for the simpler (= ParentIRI) solution suggested by @schumannj, since in the worst case one would only need to add one parent to a child (poor child). |
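To illustrate that the two representations carry the same information, here is a small sketch that derives the children lists from a parent column. It follows the one-parent-per-concept assumption of the comment above; SKOS itself permits several skos:broader values per concept, which this sketch does not cover. The function name and the example IRIs are hypothetical.

```python
from collections import defaultdict


def children_from_parents(parent_of):
    """Invert a child -> parent mapping into a parent -> children mapping.

    parent_of maps each concept IRI to its parent IRI, or to None for
    top-level concepts (an assumed encoding for this sketch).
    """
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            children[parent].append(child)
    # Sort for a stable, reproducible result.
    return {parent: sorted(kids) for parent, kids in children.items()}


if __name__ == "__main__":
    parent_of = {
        "ex:0001": None,
        "ex:0002": "ex:0001",
        "ex:0003": "ex:0001",
        "ex:0004": "ex:0003",
    }
    print(children_from_parents(parent_of))
```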
Following the thinking that entering broader (parent) is easier than narrower (children), we should also add a column "member of" to the concept sheet in order to enter membership in collections directly at the concept, instead of the current way of adding all collection members in the collection sheet. With these two changes, I am no longer convinced that the original idea of merging the "Concepts", "Additional Concept Features", and "Collections" sheets into one sheet should be pursued. Instead I suggest keeping the 3-table split, because:
Update: I made a new 5th draft for the next xlsx template above that integrates these changes. |
On a loosely related topic: A single Excel cell fits 32767 characters. A URI in voc4cat is 41 characters long. If we add a comma and a blank space between the URIs, a Collection or a Children cell can fit up to 762 members (780 without the blank space). |
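The numbers above can be reproduced with a short calculation: n URIs of length L joined by a separator of length s need n*L + (n-1)*s characters. The function name `max_members` is just for illustration.

```python
EXCEL_CELL_LIMIT = 32767  # maximum number of characters in one Excel cell


def max_members(iri_length, separator, cell_limit=EXCEL_CELL_LIMIT):
    """Largest n with n * iri_length + (n - 1) * len(separator) <= cell_limit."""
    sep = len(separator)
    return (cell_limit + sep) // (iri_length + sep)


if __name__ == "__main__":
    print(max_members(41, ", "))  # comma plus blank space
    print(max_members(41, ","))   # comma only
```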
Taken over from nfdi4cat/VocExcel#2
Template
This is for discussing possible future structural changes. This is not urgent but may serve as a checklist to review before the next big template-version step (in descending priority):
- Get rid of "Additional Concept Features" sheet because it is hard to work with only numeric IDs without knowing the preferred labels. The columns could be moved to the Concepts sheet. Solved in another way in "Idea: Show IRIs with optional label in xlsx IRI-columns" #253, already released as part of v0.9.0.
- <date> <gh-name> <change-note-text>. This structure will be validated so that correct DC:provenance data can be created for each concept & collection. (related: "dcterms:provenance - Correctly used?" #122)
- Add a status column with states proposed/accepted/obsolete.
- Auto generate skos:historyNote with date & state upon change. Suggested states to track: created, obsoleted because ... (see next point). This information will not be present in Excel but only in turtle.
Converter
- idranges.toml. Prefix-sheet was made read-only in "Fix and improve handling of prefixes from config" #263, released in v0.9.0.
Profile