MLCroissant: add fields such as creator, keyword and dates #370
josvandervelde
started this conversation in
Ideas
Replies: 0 comments
The current `mlcroissant` implementation seems to focus mostly on machine-readability. Fields such as `creator` are not implemented (see `Metadata` for the implemented fields). This makes it difficult to use the `mlcroissant` validation if you want to add those fields.

The Croissant specs mention that "Croissant is based on schema.org, and builds on its Dataset vocabulary." Should the omitted schema.org fields be added to the `mlcroissant` implementation? Or only a select few, for instance the properties recommended by Google?

I would suggest allowing all schema.org/Dataset fields to be added when building a `mlcroissant` `Metadata` object. To keep the `mlcroissant` implementation small and focused, we could decide not to validate most fields: `contentLocation`, for example, would take quite some work to validate.

I'm interested in your thoughts!
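
To make the suggestion concrete, here is a minimal sketch of the pass-through idea. It does not use the real `mlcroissant` API: `SketchMetadata`, its `extra` argument and its `to_json` method are hypothetical names invented for illustration, and the `sc:Dataset` type string is only an assumption about how the output might look. The point is just that core fields get validated while any other schema.org/Dataset property is stored and emitted unchanged.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchMetadata:
    """Hypothetical stand-in for a metadata object; NOT the mlcroissant class."""

    name: str
    # Any additional schema.org/Dataset properties, stored as-is (assumption:
    # a plain dict is enough to carry them).
    extra: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Only the core field is validated in this sketch.
        if not self.name:
            raise ValueError("name must be a non-empty string")

    def to_json(self) -> dict[str, Any]:
        # Pass-through fields are emitted unchanged; core fields win on conflict.
        return {**self.extra, "@type": "sc:Dataset", "name": self.name}


metadata = SketchMetadata(
    name="example-dataset",
    extra={
        "creator": {"@type": "Person", "name": "Jane Doe"},
        "keywords": ["example", "sketch"],
        "datePublished": "2023-11-01",
        # Not validated at all, which keeps the implementation small.
        "contentLocation": "free-form value",
    },
)
print(metadata.to_json())
```

With this shape, adding `creator` or dates would not require any new validation code, and stricter checks could be added later for a select few fields.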