Speed up your scikit-learn applications for CPUs and GPUs across single- and multi-node configurations
Releases | Documentation | Examples | Support | License
Extension for Scikit-learn is a free software AI accelerator designed to deliver 10-100X acceleration to your existing scikit-learn code. The acceleration is achieved with vector instructions, hardware-specific memory optimizations, and threading.
With Extension for Scikit-learn, you can:
- Speed up training and inference by up to 100x with equivalent mathematical accuracy
- Benefit from performance improvements across a range of hardware, including different CPUs, GPUs, and multi-GPU configurations
- Integrate the extension into your existing Scikit-learn applications without code modifications
- Continue to use the open-source scikit-learn API
- Enable and disable the extension with a couple of lines of code or at the command line
The easiest way to benefit from the extension's accelerations is to patch scikit-learn with it:
- Enable CPU optimizations:

  ```python
  import numpy as np
  from sklearnex import patch_sklearn
  patch_sklearn()

  from sklearn.cluster import DBSCAN

  X = np.array([[1., 2.], [2., 2.], [2., 3.],
                [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
  clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  ```
- Enable GPU optimizations:

  Note: executing on GPU has additional system software requirements - see details.

  ```python
  import numpy as np
  from sklearnex import patch_sklearn, config_context
  patch_sklearn()

  from sklearn.cluster import DBSCAN

  X = np.array([[1., 2.], [2., 2.], [2., 3.],
                [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
  with config_context(target_offload="gpu:0"):
      clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  ```
Check out the available notebooks for more examples.
Alternatively, all functionality is also available under a separate module that can be imported directly, without any patching.
- To run on CPU:

  ```python
  import numpy as np
  from sklearnex.cluster import DBSCAN

  X = np.array([[1., 2.], [2., 2.], [2., 3.],
                [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
  clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  ```
- To run on GPU:

  ```python
  import numpy as np
  from sklearnex import config_context
  from sklearnex.cluster import DBSCAN

  X = np.array([[1., 2.], [2., 2.], [2., 3.],
                [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
  with config_context(target_offload="gpu:0"):
      clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  ```
To install Extension for Scikit-learn, run:
```shell
pip install scikit-learn-intelex
```
The package is also offered through other channels, such as conda-forge. See all installation instructions in the Installation Guide.
The easiest way to accelerate scikit-learn workflows with the extension is through patching, which replaces the stock scikit-learn algorithms with the optimized versions provided by the extension, using the same namespaces and modules as scikit-learn.
The patching only affects supported algorithms and their parameters. You can still use unsupported ones in your code; the package simply falls back to the stock version of scikit-learn.
TIP: Enable verbose mode to see which implementation of the algorithm is currently used.
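Verbose output is driven by Python's standard logging module; a minimal sketch, assuming the extension emits its messages under the "sklearnex" logger as described in the verbose-mode documentation:

```python
import logging

# Raise the "sklearnex" logger to INFO so the extension can report
# which implementation (accelerated or stock scikit-learn) handled each call.
logging.basicConfig()
logging.getLogger("sklearnex").setLevel(logging.INFO)
```

With this in place, subsequent estimator calls log whether they were dispatched to the accelerated backend or fell back to stock scikit-learn.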
To patch scikit-learn, you can:
- Use the following command-line flag:

  ```shell
  python -m sklearnex my_application.py
  ```
- Add the following lines to the script:

  ```python
  from sklearnex import patch_sklearn
  patch_sklearn()
  ```
Read about other ways to patch scikit-learn.
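For instance, patching can be restricted to specific algorithms and reverted at runtime; a sketch, assuming `patch_sklearn` accepts a list of algorithm names and that `unpatch_sklearn` is available, as in the documented API:

```python
from sklearnex import patch_sklearn, unpatch_sklearn

# Patch only selected estimators instead of everything at once.
patch_sklearn(["DBSCAN"])

from sklearn.cluster import DBSCAN  # now resolves to the accelerated class

# Revert to stock scikit-learn when the accelerated version is no longer needed.
unpatch_sklearn()
```

Because patching rebinds names inside the sklearn modules, it must happen before the affected classes are imported.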
As an alternative, accelerated classes from the extension can also be imported directly without patching, which keeps them separate from the stock scikit-learn ones - for example:
```python
from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

# ...
```
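The two classes can then be used side by side; a sketch (the data and parameters here are illustrative, reusing the example above):

```python
import numpy as np
from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

# The accelerated and stock implementations run independently
# and are expected to produce equivalent cluster labels on this data.
ex_labels = exDBSCAN(eps=3, min_samples=2).fit(X).labels_
stock_labels = stockDBSCAN(eps=3, min_samples=2).fit(X).labels_
```

This pattern is useful for benchmarking the two implementations against each other without touching the global sklearn namespace.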
Acceleration in patched scikit-learn classes is achieved by replacing calls to scikit-learn with calls to oneDAL (oneAPI Data Analytics Library) behind the scenes.
We welcome community contributions, check our Contributing Guidelines to learn more.
* The Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.