Remove examples Part 1 - Rossmann, RecSys2020, Outbrain #1669

Merged: 9 commits, Sep 27, 2022
1 change: 0 additions & 1 deletion ci/test_integration.sh
@@ -32,7 +32,6 @@ config="-rsx --devices $2"
 # Run tests for training containers
 pytest $config tests/integration/test_criteo.py
 pytest $config tests/integration/test_movielens.py
-pytest $config tests/integration/test_rossman.py

 # Run tests for specific containers
 if [ "$container" == "merlin-hugectr" ]; then
2 changes: 1 addition & 1 deletion docs/source/resources/architecture.md
@@ -71,4 +71,4 @@ We can easily convert this workflow definition into a graph, and visualize the f
 ```
 ![NVTabular Workflow Graph](/images/nvt_workflow_graph.png)

-The Rename operator can be used to change the names of columns. This operator provides several different options for renaming columns such as applying a user defined function to get new column names, as well as appending a suffix to each column. You can see the [Outbrain](https://github.com/NVIDIA/NVTabular/tree/new_api/examples/wnd_outbrain) example for usage of the Rename operator.
+The Rename operator can be used to change the names of columns. This operator provides several different options for renaming columns such as applying a user defined function to get new column names, as well as appending a suffix to each column. Refer to the API documentation for the {class}`Rename <nvtabular.ops.Rename>` operator.
15 changes: 0 additions & 15 deletions docs/source/toc.yaml
@@ -27,12 +27,6 @@ subtrees:
 title: Serve a HugeCTR Model
 - file: examples/getting-started-movielens/04-Triton-Inference-with-TF.ipynb
 title: Serve a TensorFlow Model
-- file: examples/advanced-ops-outbrain/index.md
-title: Advanced Ops with Outbrain
-entries:
-- file: examples/advanced-ops-outbrain/01-Download-Convert.ipynb
-- file: examples/advanced-ops-outbrain/02-ETL-with-NVTabular.ipynb
-- file: examples/advanced-ops-outbrain/03-Training-with-TF.ipynb
 - file: examples/scaling-criteo/index.md
 entries:
 - file: examples/scaling-criteo/01-Download-Convert.ipynb
@@ -47,19 +41,10 @@ subtrees:
 title: Serve a HugeCTR Model
 - file: examples/scaling-criteo/04-Triton-Inference-with-TF.ipynb
 title: Serve a TensorFlow Model
-- file: examples/tabular-data-rossmann/index.md
-title: Applying Techniques to Rossmann Stores Data
-entries:
-- file: examples/tabular-data-rossmann/01-Download-Convert.ipynb
-- file: examples/tabular-data-rossmann/02-ETL-with-NVTabular.ipynb
-- file: examples/tabular-data-rossmann/03-Training-with-TF.ipynb
-- file: examples/tabular-data-rossmann/03-Training-with-PyTorch.ipynb
 - file: examples/multi-gpu-movielens/index.md
 entries:
 - file: examples/multi-gpu-movielens/01-03-MultiGPU-Download-Convert-ETL-with-NVTabular-Training-with-TensorFlow.ipynb
 - file: examples/multi-gpu-toy-example/multi-gpu_dask.ipynb
-- file: examples/winning-solution-recsys2020-twitter/01-02-04-Download-Convert-ETL-with-NVTabular-Training-with-XGBoost.ipynb
-title: Winning Solution of the RecSys2020 Competition
 - file: api
 title: API Documentation
 - file: resources/index
3 changes: 1 addition & 2 deletions docs/source/training/tensorflow.rst
@@ -113,5 +113,4 @@ a callback can be used for it.
 history = model.fit(train_dataset_tf, callbacks=[validation_callback], epochs=5)

 You can find additional examples in our repository such as
-`MovieLens <../examples/getting-started-movielens/>`__ and
-`Outbrain <../examples/advanced-ops-outbrain/>`__.
+`MovieLens <../examples/getting-started-movielens/>`__.
16 changes: 2 additions & 14 deletions examples/README.md
@@ -38,26 +38,14 @@ The MovieLens25M is a popular dataset for recommender systems and is used in aca
 - Using the NVTabular dataloader with the TensorFlow Keras model
 - Using the NVTabular dataloader with PyTorch

-### 2. [Advanced Ops with Outbrain](https://github.com/NVIDIA/NVTabular/tree/main/examples/advanced-ops-outbrain)
-
-The [Outbrain dataset](https://www.kaggle.com/c/outbrain-click-prediction) is based on a Kaggle Competition in which Kagglers were challenged to predict which ads and other forms of sponsored content that their global users would click. This example notebook demonstrates how to use the available NVTabular operators, write a custom operator, and train a Wide&Deep model with the NVTabular dataloader in TensorFlow.
-
-### 3. [Scaling Large Datasets with Criteo](https://github.com/NVIDIA/NVTabular/tree/main/examples/scaling-criteo)
+### 2. [Scaling Large Datasets with Criteo](https://github.com/NVIDIA/NVTabular/tree/main/examples/scaling-criteo)

 [Criteo](https://ailab.criteo.com/download-criteo-1tb-click-logs-dataset/) provides the largest publicly available dataset for recommender systems with a size of 1TB of uncompressed click logs that contain 4 billion examples. This example notebook demonstrates how to scale NVTabular, use multiple GPUs and multiple nodes with NVTabular for ETL, and train a recommender system model with the NVTabular dataloader for PyTorch.

-### 4. [Multi-GPU with MovieLens](https://github.com/NVIDIA/NVTabular/tree/main/examples/multi-gpu-movielens)
+### 3. [Multi-GPU with MovieLens](https://github.com/NVIDIA/NVTabular/tree/main/examples/multi-gpu-movielens)

 In the Getting Started with MovieLens example, we explain the fundamentals of NVTabular and its dataloader, HugeCTR, and Triton Inference. With this example, we revisit the same dataset but demonstrate how to perform multi-GPU training with the NVTabular dataloader in TensorFlow.

-### 5. [Winning Solution of the RecSys2020 Competition](https://github.com/NVIDIA/NVTabular/tree/main/examples/winning-solution-recsys2020-twitter)
-
-Twitter provided a dataset for the [RecSys2020 challenge](http://www.recsyschallenge.com/2020/). The goal was to predict user engagement based on 200M user-tweet pairs. This example notebook demonstrates how to use NVTabular's available operators for feature engineering and train a XGBoost model on the GPU with dask.
-
-### 6. [Applying the Techniques to other Tabular Problems with Rossmann](https://github.com/NVIDIA/NVTabular/tree/main/examples/tabular-data-rossmann)
-
-Rossmann operates over 3,000 drug stores across seven European countries. Historical sales data for 1,115 Rossmann stores are provided. The goal is to forecast the **Sales** column for the test set. Kaggle hosted it as a [competition](https://www.kaggle.com/c/rossmann-store-sales/overview).
-
 ## Running the Example Notebooks

 You can run the example notebooks by [installing NVTabular](https://github.com/NVIDIA/NVTabular#installation) and other required libraries.