Gen tabular datasets #260
Codecov Report
Base: 39.17% // Head: 38.68% // Decreases project coverage by 0.49%
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #260      +/-   ##
==========================================
- Coverage   39.17%  38.68%   -0.49%
==========================================
  Files          92      92
  Lines        6096    6080     -16
==========================================
- Hits         2388    2352     -36
- Misses       3708    3728     +20
For documentation, just describe what the function generates. Something like: "This function generates a random, multi-target dataset with the specified number of rows, features, and targets. The values of the features and targets are determined using the provided distributions." You can just put the example into the function docs as a doctest.
Cool, I'll do that tomorrow.
Goal:
This PR aims to make writing benchmarking code a bit easier for developers by parameterizing the dimensions of the generated datasets.
Additions to linfa-datasets:
resolves #259
Example
Associated output
The feature array can look like this with a Laplace distribution, so we are able to handle both continuous and discrete cases.
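For readers without the original snippet at hand, here is a minimal, std-only sketch of the idea: a generator parameterized by the number of rows, features, and targets, drawing values from a chosen distribution. Everything in it is hypothetical and for illustration only: `XorShift64`, `Dist`, and `make_dataset` are made-up names, not the linfa-datasets API, and the hand-rolled PRNG merely stands in for whatever random number source and array types the PR actually uses. Laplace samples are drawn with the standard inverse-CDF transform, and a Bernoulli variant is included as one possible discrete case.

```rust
// Hypothetical sketch, NOT the linfa-datasets API: all names are illustrative.

/// Tiny xorshift64* PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Uniform sample in the open interval (0, 1).
    fn next_f64(&mut self) -> f64 {
        // Top 52 bits plus 0.5, so neither 0.0 nor 1.0 is ever returned.
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Distributions assumed for this sketch: one continuous, one discrete.
enum Dist {
    Laplace { loc: f64, scale: f64 },
    Bernoulli { p: f64 },
}

impl Dist {
    fn sample(&self, rng: &mut XorShift64) -> f64 {
        match *self {
            // Inverse-CDF transform: x = loc - scale * sgn(u) * ln(1 - 2|u|),
            // with u uniform in (-0.5, 0.5).
            Dist::Laplace { loc, scale } => {
                let u = rng.next_f64() - 0.5;
                loc - scale * u.signum() * (1.0 - 2.0 * u.abs()).ln()
            }
            // Returns 1.0 with probability p, otherwise 0.0.
            Dist::Bernoulli { p } => {
                if rng.next_f64() < p { 1.0 } else { 0.0 }
            }
        }
    }
}

/// Generates `rows` samples with `n_features` feature columns and
/// `n_targets` target columns, returned as row-major nested vectors.
fn make_dataset(
    rows: usize,
    n_features: usize,
    n_targets: usize,
    feature_dist: &Dist,
    target_dist: &Dist,
    seed: u64,
) -> (Vec<Vec<f64>>, Vec<Vec<f64>>) {
    // xorshift state must be nonzero, so a zero seed is bumped to 1.
    let mut rng = XorShift64(seed.max(1));
    let mut features = Vec::with_capacity(rows);
    let mut targets = Vec::with_capacity(rows);
    for _ in 0..rows {
        let x: Vec<f64> = (0..n_features).map(|_| feature_dist.sample(&mut rng)).collect();
        let y: Vec<f64> = (0..n_targets).map(|_| target_dist.sample(&mut rng)).collect();
        features.push(x);
        targets.push(y);
    }
    (features, targets)
}
```

Under these assumptions, a benchmark could call `make_dataset(1_000, 10, 2, &Dist::Laplace { loc: 0.0, scale: 1.0 }, &Dist::Bernoulli { p: 0.5 }, 42)` and vary only the three size arguments between runs; a fixed seed keeps the generated data reproducible.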
Adjustments to replicate with the above example
Outstanding Questions