Add tabular regression example #254
Merged: BenjaminBossan merged 12 commits into skops-dev:main from lazarust:add-tabular-regression-example on Feb 8, 2023
Changes from 6 commits
Commits
5588cac Adds tabular regression example (lazarust)
263b033 Updates changes.rst (lazarust)
f2fbe0e Uses skops.io instead of pickle (lazarust)
5c75d7c Cleans up text and formatting (lazarust)
1ab80e4 Adds Examples to documentation (lazarust)
bff9f26 Adds all examples from auto example page (lazarust)
2ab5941 Updates to link to docs (lazarust)
50efed5 Adds link to more examples (lazarust)
f390b49 Fixes link (lazarust)
01cf817 Fixes link with custom text (lazarust)
25dc0e6 Merge branch 'main' into add-tabular-regression-example (lazarust)
c3c853d Update docs/examples.rst (lazarust)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
.. _examples: | ||
|
||
Examples of interactions with the Hugging Face Hub | ||
================================================== | ||
|
||
- Creating the Model Card: | ||
`Here <https://github.com/skops-dev/skops/blob/main/examples/plot_model_ | ||
card.py>`_ is an example of using skops to create a model card that can | ||
be used on the Hugging Face Hub. | ||
- Putting the Model Card on the Hub: | ||
`Here <https://github.com/skops-dev/skops/blob/main/examples/plot_hf_hub. | ||
py>`_ is an example of using skops to put a model card on the Hugging Face | ||
Hub. | ||
- Tabular Regression: | ||
`Here <https://github.com/skops-dev/skops/blob/main/examples/plot_tabular | ||
_regression.py>`_ is an example of using skops to serialize a tabular | ||
regression model and create a model card and a Hugging Face Hub repository. | ||
- Text Classification: | ||
`Here <https://github.com/skops-dev/skops/blob/main/examples/plot_text_cl | ||
assification.py>`_ is an example of using skops to serialize a text classi | ||
You accidentally split the word "classification" into 2. |
fication model and create a model card and a Hugging Face Hub repository. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,153 @@ | ||
""" | ||
Tabular Regression with scikit-learn | ||
------------------------------------- | ||
|
||
This example shows how you can create a Hugging Face Hub compatible repo for a | ||
tabular regression task using scikit-learn. We also show how you can generate | ||
a model card for the model and the task at hand. | ||
""" | ||
|
||
# %% | ||
# Imports | ||
# ======= | ||
# First we will import everything required for the rest of this document. | ||
|
||
from pathlib import Path | ||
from tempfile import mkdtemp, mkstemp | ||
|
||
import matplotlib.pyplot as plt | ||
import pandas as pd | ||
import sklearn | ||
from sklearn.datasets import load_diabetes | ||
from sklearn.linear_model import LinearRegression | ||
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score | ||
from sklearn.model_selection import train_test_split | ||
from sklearn.pipeline import Pipeline | ||
from sklearn.preprocessing import StandardScaler | ||
|
||
import skops.io as sio | ||
from skops import card, hub_utils | ||
|
||
# %% | ||
# Data | ||
# ==== | ||
# We will use the diabetes dataset from scikit-learn. | ||
|
||
X, y = load_diabetes(return_X_y=True) | ||
X_train, X_test, y_train, y_test = train_test_split( | ||
    X, y, test_size=0.2, random_state=42 | ||
) | ||
|
||
# %% | ||
# Train a Model | ||
# ============= | ||
# To train a model, we first standardize the features. We use a | ||
# StandardScaler in our pipeline, then fit a linear regression model on the | ||
# scaler's outputs. | ||
model = Pipeline( | ||
    [ | ||
        ("scaler", StandardScaler()), | ||
        ("linear_regression", LinearRegression()), | ||
    ] | ||
) | ||
|
||
model.fit(X_train, y_train) | ||
|
||
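As a rough illustration of what the scaling step in the pipeline above does, the following standard-library sketch standardizes a single feature column to zero mean and unit variance. The helper name `standardize` and the toy column are hypothetical and are not part of the PR; the sketch assumes the population standard deviation (no degrees-of-freedom correction) and leaves constant columns centered but unscaled. In the actual pipeline the mean and scale are learned from `X_train` and then reused for any data passed to `predict`.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Scale one feature column to zero mean and unit variance."""
    mean = fmean(column)
    std = pstdev(column)  # population standard deviation (assumption: ddof=0)
    if std == 0:
        # Constant column: only center it, to avoid dividing by zero.
        return [x - mean for x in column]
    return [(x - mean) / std for x in column]


# Hypothetical toy feature column, not taken from the diabetes data.
toy_column = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = standardize(toy_column)
```

After scaling, the column has mean 0 and standard deviation 1, so features measured on different scales contribute comparably to the linear model.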
# %% | ||
# Inference | ||
# ========= | ||
# Let's see if the model works. | ||
y_pred = model.predict(X_test[:5]) | ||
print(y_pred) | ||
|
||
# %% | ||
# Initialize a repository to save our files in | ||
# ============================================ | ||
# We will now initialize a repository and save our model. | ||
_, pkl_name = mkstemp(prefix="skops-", suffix=".pkl") | ||
|
||
with open(pkl_name, mode="bw") as f: | ||
    sio.dump(model, file=f) | ||
|
||
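One detail of the temporary-file step above: `mkstemp` returns an already-open OS-level file descriptor along with the path, and the script discards that descriptor with `_`. The sketch below shows one way to close it explicitly and clean up afterwards, using only the standard library. The `payload` bytes are a hypothetical stand-in for a serialized model, so this is not a replacement for the `sio.dump` call.

```python
import os
from tempfile import mkstemp

# mkstemp returns an open file descriptor and the path of the new file.
fd, tmp_path = mkstemp(prefix="skops-", suffix=".pkl")
os.close(fd)  # close the descriptor; the file is reopened by path below

# Hypothetical placeholder for the bytes a serializer would write.
payload = b"placeholder bytes standing in for a serialized model"
with open(tmp_path, mode="wb") as f:
    f.write(payload)

with open(tmp_path, mode="rb") as f:
    restored = f.read()

os.remove(tmp_path)  # mkstemp never deletes the file on its own
```

For a short documentation script the leaked descriptor is harmless, which is presumably why the example keeps the shorter form.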
local_repo = mkdtemp(prefix="skops-") | ||
|
||
hub_utils.init( | ||
    model=pkl_name, | ||
    requirements=[f"scikit-learn={sklearn.__version__}"], | ||
    dst=local_repo, | ||
    task="tabular-regression", | ||
    data=X_test, | ||
) | ||
|
||
if "__file__" in locals():  # __file__ not defined during docs build | ||
    # Add this script itself to the files to be uploaded for reproducibility | ||
    hub_utils.add_files(__file__, dst=local_repo) | ||
|
||
# %% | ||
# Create a model card | ||
# =================== | ||
# We now create a model card, and populate its metadata with information which | ||
# is already provided in ``config.json``, which itself is created by the call to | ||
# :func:`.hub_utils.init` above. We will see below how we can populate the model | ||
# card with useful information. | ||
|
||
model_card = card.Card(model, metadata=card.metadata_from_config(Path(local_repo))) | ||
|
||
# %% | ||
# Add more information | ||
# ==================== | ||
# So far, the model card does not tell viewers a lot about the model. Therefore, | ||
# we add more information about the model, like a description and what its | ||
# license is. | ||
|
||
model_card.metadata.license = "mit" | ||
limitations = ( | ||
    "This model is made for educational purposes and is not ready to be used in" | ||
    " production." | ||
) | ||
model_description = ( | ||
    "This is a Linear Regression model trained on the diabetes dataset. This model" | ||
    " could be used to predict the progression of diabetes. This model is pretty" | ||
    " limited and should just be used as an example of how to use `skops` and the" | ||
    " Hugging Face Hub." | ||
) | ||
model_card_authors = "skops_user, lazarust" | ||
citation_bibtex = "bibtex\n@inproceedings{...,year={2022}}" | ||
model_card.add( | ||
    **{ | ||
        "Model Card Authors": model_card_authors, | ||
        "Intended uses & limitations": limitations, | ||
        "Citation": citation_bibtex, | ||
        "Model description": model_description, | ||
        "Model description/Intended uses & limitations": limitations, | ||
    } | ||
) | ||
|
||
# %% | ||
# Add plots, metrics, and tables to our model card | ||
# ================================================ | ||
# We will now evaluate our model and add our findings to the model card. | ||
|
||
y_pred = model.predict(X_test) | ||
|
||
# plot the predicted values against the true values | ||
plt.scatter(y_test, y_pred) | ||
plt.xlabel("True values") | ||
plt.ylabel("Predicted values") | ||
plt.savefig(Path(local_repo) / "prediction_scatter.png") | ||
model_card.add_plot(**{"Prediction Scatter": "prediction_scatter.png"}) | ||
|
||
mae = mean_absolute_error(y_test, y_pred) | ||
mse = mean_squared_error(y_test, y_pred) | ||
r2 = r2_score(y_test, y_pred) | ||
model_card.add_metrics( | ||
    **{"Mean Absolute Error": mae, "Mean Squared Error": mse, "R-Squared Score": r2} | ||
) | ||
|
||
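For readers unfamiliar with the three metrics added to the card above, here is a minimal pure-Python sketch of the standard formulas. The function names ending in `_sketch` and the toy values are hypothetical; the script itself uses the scikit-learn implementations. The R-squared helper assumes `y_true` is not constant, since the total sum of squares would otherwise be zero.

```python
def mae_sketch(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_sketch(y_true, y_pred):
    """Mean squared error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_sketch(y_true, y_pred):
    """R-squared: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # assumed non-zero
    return 1 - ss_res / ss_tot


# Hypothetical toy targets and predictions, not results from the model.
toy_true = [3.0, 5.0, 7.0, 9.0]
toy_pred = [2.0, 5.0, 8.0, 9.0]

toy_mae = mae_sketch(toy_true, toy_pred)
toy_mse = mse_sketch(toy_true, toy_pred)
toy_r2 = r2_sketch(toy_true, toy_pred)
```

Lower is better for the two error metrics, while an R-squared closer to 1 means the predictions explain more of the variance in the targets.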
# %% | ||
# Save model card | ||
# ================ | ||
# We can simply save our model card by providing a path to :meth:`.Card.save`. | ||
# The model hasn't been pushed to the Hugging Face Hub yet. If you want to see | ||
# how to push your models, please refer to | ||
# :ref:`this example <sphx_glr_auto_examples_plot_hf_hub.py>`. | ||
|
||
model_card.save(Path(local_repo) / "README.md") |
All the links here are to the .py files on GH. However, it is better to point them to the rendered documentation. E.g. for text classification, that would be: https://skops.readthedocs.io/en/stable/auto_examples/plot_text_classification.html
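The mapping suggested in the comment above can be sketched as a small helper that turns an example script path into a rendered documentation URL. The helper name `rendered_docs_url` is hypothetical, and the URL pattern is extrapolated from the single text classification link given in the comment, so it is an assumption that the other examples follow the same scheme.

```python
from pathlib import PurePosixPath

# Base URL taken from the link in the review comment.
DOCS_BASE = "https://skops.readthedocs.io/en/stable/auto_examples"


def rendered_docs_url(example_file):
    """Map an example script path to its assumed rendered docs page."""
    stem = PurePosixPath(example_file).stem  # e.g. "plot_text_classification"
    return f"{DOCS_BASE}/{stem}.html"


url = rendered_docs_url("examples/plot_text_classification.py")
```

Each generated link should still be checked against the published documentation before it is used in `docs/examples.rst`.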