Have attributes of training dataset in the repository #266
Comments
I agree it would be useful to have this information. Some questions I would have:
Of course, we don't have to have everything right from the start, but we should have an idea of what this addition would entail. And to me, it looks like it's far from trivial.
I think it'd make sense to have this in the README as part of the model card. We could have some method that generates as much info as we can from a given input dataframe, for example.
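To make the idea of generating info from an input dataframe concrete, here is a minimal sketch. It assumes pandas is available. `summarize_columns` and its `max_categories` parameter are hypothetical names made up for illustration, not part of skops. The cap on recorded distinct values is one possible way to keep the output small and is also an assumption.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def summarize_columns(df, max_categories=20):
    """Hypothetical helper (not a skops API): describe each column of ``df``.

    Non-numeric and boolean columns are treated as categorical and get their
    distinct values recorded, capped at ``max_categories``. Numeric columns
    get their observed min/max range.
    """
    summary = {}
    for name in df.columns:
        col = df[name]
        if is_bool_dtype(col) or not is_numeric_dtype(col):
            # Distinct non-missing values, in order of first appearance.
            values = col.dropna().unique().tolist()
            summary[str(name)] = {
                "kind": "categorical",
                "n_unique": len(values),
                "values": values[:max_categories],
                "truncated": len(values) > max_categories,
            }
        else:
            # min()/max() skip missing values by default.
            summary[str(name)] = {
                "kind": "continuous",
                "min": float(col.min()),
                "max": float(col.max()),
            }
    return summary


# Toy data, invented for this example.
df = pd.DataFrame({"age": [22, 35, 58], "city": ["Paris", "Oslo", "Paris"]})
summary = summarize_columns(df)
```

The resulting dict is plain Python, so it could be dumped to JSON or rendered as a table in the model card. Where it should actually be stored is left open here.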
I think the reason why Merve wanted to have them in the
I see, for that I'm happy for that to be in a
@adrinjalali I agree.
@merveenoyan I'm happy to take this if it still needs to be done!
@BenjaminBossan I'm happy to take this one but had a few thoughts/questions:
Thanks for taking an interest in the issue. I think there is no definite answer to your question. The initial motivation is to know in advance what options exist for categorical data to improve the widget, but I think Adrin made a good point about file size, which can easily get large if we just record all distinct values, so some kind of compromise would need to be found. Also, for this feature to make sense, we would need to do work on the widget side as well, for which there is currently no capacity AFAIK, so I would rather not work on this feature right now.
@BenjaminBossan Sounds good! Is there another issue I could help out with?
If this is something you're willing to jump into, I think we have some room to improve the skops.io persistence format. For instance, support for more external libraries could be added, like scikeras (#388) or skorch :)
The widget is cool and everything but it's hard to see all the unique values of categorical variables, which variables are categorical or the range for continuous columns. Couple of solutions:
Ping @skops-dev/maintainers