This repo provides docs and example applications that demonstrate the RAPIDS.ai GPU-accelerated XGBoost-Spark project.
Try one of the Getting Started guides below. As written, they target the Mortgage dataset, but with a few changes to EXAMPLE_CLASS, trainDataPath, and evalDataPath, they can be easily adapted to the Taxi or Agaricus datasets.
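As a rough illustration of that adaptation, the sketch below keeps the three settings in one table per dataset and assembles a spark-submit command from them. Everything dataset-specific here is a placeholder: the class names, the data paths, the jar name, and the `-name=value` argument form are assumptions made for the example, so take the real values from the guide you are following.

```python
import shlex

# Placeholder values only: replace each one with the class name and data
# paths given in the Getting Started guide you are following.
DATASETS = {
    "mortgage": {
        "EXAMPLE_CLASS": "com.example.mortgage.Main",
        "trainDataPath": "/data/mortgage/train",
        "evalDataPath": "/data/mortgage/eval",
    },
    "taxi": {
        "EXAMPLE_CLASS": "com.example.taxi.Main",
        "trainDataPath": "/data/taxi/train",
        "evalDataPath": "/data/taxi/eval",
    },
    "agaricus": {
        "EXAMPLE_CLASS": "com.example.agaricus.Main",
        "trainDataPath": "/data/agaricus/train",
        "evalDataPath": "/data/agaricus/eval",
    },
}


def build_submit_command(dataset, jar="example-apps.jar"):
    """Return a spark-submit argument list for the chosen dataset."""
    cfg = DATASETS[dataset]
    return [
        "spark-submit",
        "--class", cfg["EXAMPLE_CLASS"],
        jar,  # hypothetical jar name
        # Assumed argument syntax; check the guide for the exact form.
        f"-trainDataPath={cfg['trainDataPath']}",
        f"-evalDataPath={cfg['evalDataPath']}",
    ]


if __name__ == "__main__":
    # Print the command for the Taxi dataset instead of Mortgage.
    print(shlex.join(build_submit_command("taxi")))
```

Switching datasets then means changing only the key passed to `build_submit_command`; the rest of the command stays the same.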
You can get a small dataset for each example in the datasets folder. These datasets are provided only for convenience. To test performance, please prepare a larger dataset by following Preparing Datasets. We also provide a larger dataset, the Mortgage Dataset (1 GB uncompressed), which is used in the guides below.
- Building applications
- Getting started on on-prem clusters
- Getting started on cloud service providers
  - Amazon AWS
  - Databricks
  - Google Cloud Platform
- Getting started for Jupyter Notebook applications
These examples use default parameters for demo purposes. For a full list, please see Supported XGBoost Parameters for Scala or Python.
Please see the RAPIDS website for contact information.
This content is licensed under the Apache License 2.0.