Folder | Notebook Title | Description |
---|---|---|
XGBoost | XGBoost Demo | This notebook shows the acceleration one can gain by using GPUs with XGBoost in RAPIDS. |
The cuML notebooks showcase how to use the machine learning algorithms implemented in cuML, along with the advantages of using cuML over scikit-learn. These notebooks compare the time required and the performance of the algorithms. Below is a list of such algorithms:
Folder | Notebook Title | Description |
---|---|---|
cuML | dbscan_demo | This notebook showcases the density-based spatial clustering of applications with noise (DBSCAN) algorithm using the fit and predict functions. |
cuML | knn_demo | This notebook showcases the k-nearest neighbors (KNN) algorithm using the fit and kneighbors functions. |
cuML | Linear Regression Demo | This notebook includes a code example for the linear regression algorithm and showcases the fit and predict functions. |
cuML | Ridge Regression Demo | This notebook includes code examples of ridge regression and it showcases the fit and predict functions. |
cuML | Coordinate Descent | This notebook includes code examples of lasso and elastic net models. These models are placed together so that they can be compared with each other as well as with their scikit-learn equivalents. |
cuML | pca_demo | This notebook showcases the principal component analysis (PCA) algorithm, where the model can be used for prediction (using fit_transform) as well as for converting the transformed data back into the original dataset (using inverse_transform). |
cuML | tsvd_demo | This notebook showcases the truncated singular value decomposition (TSVD) algorithm which, like PCA, performs both prediction and transformation of the converted dataset back into the original data, using the fit_transform and inverse_transform functions respectively. |
cuML | sgd_demo | The stochastic gradient descent algorithm is demonstrated in the notebook using the fit and predict functions. |
cuML | umap_demo | The uniform manifold approximation & projection (UMAP) algorithm is compared with the original author's equivalent non-GPU Python implementation using the fit and transform functions. |
cuML | umap_demo_graphed | Demonstration of the supervised approach of cuML's uniform manifold approximation & projection algorithm on the mortgage dataset, with results compared against the original author's equivalent non-GPU Python implementation. |
cuML | umap_demo_supervised | Demonstration of UMAP supervised training. Uses a set of labels to perform supervised dimensionality reduction. UMAP can also be trained on datasets with incomplete labels, by using a label of "-1" for unlabeled samples. |
cuML | random forest | This notebook includes code examples of Random Forest and it showcases the fit and predict functions. |
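
The cuML notebooks above share one pattern: fit a model, predict with it, and compare both the elapsed time and the quality of the result against a CPU equivalent. The following is a minimal sketch of that pattern in plain Python, not code taken from the notebooks. The helper `time_fit_predict` and the toy `MeanRegressor` class are hypothetical names introduced for illustration only; in a notebook the model would be a cuML or scikit-learn estimator exposing the same `fit` and `predict` methods.

```python
import time


class MeanRegressor:
    """Toy stand-in estimator (hypothetical): always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def time_fit_predict(model, X_train, y_train, X_test):
    """Run fit and predict, returning predictions and wall-clock seconds for each step."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(X_test)
    predict_seconds = time.perf_counter() - start

    return {
        "predictions": predictions,
        "fit_seconds": fit_seconds,
        "predict_seconds": predict_seconds,
    }


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Tiny illustrative dataset (made up for this sketch).
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [1.0, 3.0, 5.0, 7.0]
X_test = [[4.0], [5.0]]
y_test = [9.0, 11.0]

result = time_fit_predict(MeanRegressor(), X_train, y_train, X_test)
error = mean_squared_error(y_test, result["predictions"])
print(f"fit: {result['fit_seconds']:.6f}s, "
      f"predict: {result['predict_seconds']:.6f}s, MSE: {error:.2f}")
```

Calling `time_fit_predict` once per implementation on the same data gives two result dictionaries whose timings and error values can be compared side by side.
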
Folder | Notebook Title | Description |
---|---|---|
cuDF | notebooks_Apply_Operations_in_cuDF | This notebook showcases two special methods where cuDF goes beyond the Pandas library: the apply_rows and apply_chunk functions. They use the Numba library to accelerate data transformations in parallel on the GPU. |
cuDF | notebooks_numba_cuDF_integration | This notebook showcases how to use Numba CUDA to accelerate cuDF data transformation and how to accelerate it step by step using CUDA programming tricks. |
Folder | Notebook Title | Description |
---|---|---|
cuGraph | Louvain | Demonstration of using cuGraph to identify clusters in a test graph using the Louvain algorithm |
cuGraph | Vertex-Similarity | Demonstration of using cuGraph to compute vertex similarity using both the Jaccard Similarity and the Overlap Coefficient. |
cuGraph | Weighted-Jaccard | Demonstration of using cuGraph to compute the Weighted Jaccard Similarity metric on our training dataset. |
cuGraph | Renumber | Demonstration of using the renumbering features to assign new vertex IDs to the test graph. This is useful when the vertex IDs in a data set are non-contiguous or are not integer values. |
cuGraph | BFS | Demonstration of using cuGraph to compute the Breadth First Search space from a given vertex to all others in our training graph. |
cuGraph | SSSP | Demonstration of using cuGraph to compute the shortest path from a given vertex to all others in our training graph. |
cuGraph | Spectral-Clustering | Demonstration of using cuGraph to identify clusters in a test graph using Spectral Clustering with both the (A) Balance Cut and (B) Modularity Maximization quality metrics. |
cuGraph | Pagerank | Demonstration of using both NetworkX and cuGraph to compute the PageRank of each vertex in our test dataset |
cuGraph | Triangle Counting | Demonstration of using both NetworkX and cuGraph to compute the number of Triangles in our test dataset |
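
For readers new to the graph metrics listed above, the sketch below shows what the Jaccard Similarity, the Overlap Coefficient, and a Breadth First Search compute on a small undirected graph. It is a plain-Python reference written for illustration and does not use or reproduce the cuGraph API. All function names and the sample edge list are assumptions made up for this example.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbors} from edge pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def jaccard(adjacency, u, v):
    """Shared neighbors divided by the size of the union of both neighbor sets."""
    shared = adjacency[u] & adjacency[v]
    combined = adjacency[u] | adjacency[v]
    return len(shared) / len(combined) if combined else 0.0


def overlap(adjacency, u, v):
    """Shared neighbors divided by the size of the smaller neighbor set."""
    shared = adjacency[u] & adjacency[v]
    smaller = min(len(adjacency[u]), len(adjacency[v]))
    return len(shared) / smaller if smaller else 0.0


def bfs_distances(adjacency, source):
    """Hop count from source to every reachable vertex, found by breadth first search."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


# Small illustrative graph (made up for this sketch).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
adjacency = build_adjacency(edges)

print("Jaccard(0, 3):", jaccard(adjacency, 0, 3))
print("Overlap(0, 3):", overlap(adjacency, 0, 3))
print("BFS from 0:", bfs_distances(adjacency, 0))
```

Vertices 0 and 3 share both of vertex 0's neighbors, so their Overlap Coefficient is higher than their Jaccard Similarity, which also accounts for vertex 3's extra neighbor.
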
Folder | Notebook Title | Description |
---|---|---|
Tutorials | DBSCAN_demo_full | Demonstration of how to use DBSCAN - a popular clustering algorithm - and how to use the GPU accelerated implementation of this algorithm in RAPIDS. |
Folder | Script Title | Description |
---|---|---|
Utils | start-jupyter.sh | starts a JupyterLab environment for interacting with, and running, notebooks |
Utils | stop-jupyter.sh | identifies all process IDs associated with Jupyter and kills them |
Utils | dask-cluster.py | launches a configured Dask cluster (a set of nodes) for use within a notebook |
Utils | dask-setup.sh | a low-level script for constructing a set of Dask workers on a single node |
Utils | split-data-mortgage.sh | splits mortgage data files into smaller parts, and saves them for use with the mortgage notebook |
Folder | Document Title | Description |
---|---|---|
Docs | ngc-readme | |
Docs | dockerhub-readme | |

- The cuml folder also includes a small subset of the Mortgage Dataset used in the notebooks and the full image set from the Fashion MNIST dataset.
- utils: contains a set of useful scripts for interacting with RAPIDS.
- For additional community-driven notebooks, which will include our blogs, tutorials, workflows, and more intricate examples, please see the Notebooks Extended Repo.