Dataset card template overhaul #1708
Codecov Report
All modified lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1708      +/-   ##
==========================================
+ Coverage   82.30%   82.42%   +0.12%
==========================================
  Files          62       62
  Lines        7226     7226
==========================================
+ Hits         5947     5956       +9
+ Misses       1279     1270       -9

☔ View full report in Codecov by Sentry.
@@ -77,7 +77,7 @@ Use the code below to get started with the model.

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
👍
Looks good to me from a super quick glance.
I assume you have checked it still works in DatasetCard.from_template?
cc @EziOzoani @meg-huggingface too on dataset documentation. cc @yjernite too
Thanks for opening the PR @mariosasko! Looks good to me as well but from a non-expert point of view.
I posted a message in slack (private) as well. To ease the review, here are the actual files (I found it easier to review than the diff):
We have tests in |
Looks great, thank you for this!!
Adding some minor changes in wording/spelling.
Co-authored-by: meg <90473723+meg-huggingface@users.noreply.github.com>
Ready for the final review :)! (CI failures are unrelated to the changes)
So happy for this, thanks for the work!
Great, I'm merging it and will release it this week :)
* Dataset card template overhaul
* Add Privacy Considerations subsection
* Apply suggestions from code review
  Co-authored-by: meg <90473723+meg-huggingface@users.noreply.github.com>
* Address more comments
---------
Co-authored-by: meg <90473723+meg-huggingface@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
It seems to me that new model releases are more likely than new dataset releases to have a filled-out repo card, that is, one with fewer blank or "out-of-template" sections and fields.
So, this PR introduces the following changes to the dataset card template to make it less cumbersome to work with and, hopefully, improve the situation:
Dataset Structure section) and fields (e.g., homepage, leaderboard, etc.)