NVIDIA RecSys Examples is a collection of optimized recommender models and components.
The project includes:
- Examples for large-scale HSTU ranking and retrieval models through TorchRec and Megatron-Core integration
- HSTU (Hierarchical Sequential Transduction Unit) attention operator support
- Dynamic Embeddings with GPU acceleration
- [2025/10/20] 🎉v25.09 released!
- Integrated prefetching and caching into the HSTU training example.
- DynamicEmb now supports distributed embedding dumping and memory scaling.
- Added kernel fusion in the HSTU block for inference, including KVCache fixes.
- HSTU attention now supports FP8 quantization.
- [2025/9/8] 🎉v25.08 released!
- Added cache support for dynamicemb, enabling seamless hot embedding migration between cache and storage.
- Released an end-to-end HSTU inference example, demonstrating precision aligned with training.
- Enabled evaluation mode support for dynamicemb.
- [2025/8/1] 🎉v25.07 released!
- Released an HSTU inference benchmark, including a paged KV-cache HSTU kernel, a KV-cache manager based on trt-llm, CUDA graph support, and other optimizations.
- Added support for Tensor Parallelism in the HSTU layer.
- [2025/7/4] 🎉v25.06 released!
- Dynamicemb lookup module performance improvement and LFU eviction support.
- Pipeline support for HSTU example, recompute support for HSTU layer and customized CUDA ops for jagged tensor concat.
- [2025/5/29] 🎉v25.05 released!
- Enhancements to the dynamicemb functionality, including support for EmbeddingBagCollection, truncated normal initialization, and initial_accumulator_value for Adagrad.
- Fusion of operations like layernorm and dropout in the HSTU layer, resulting in about 1.2x end-to-end speedup.
- Fixed convergence issues on the Kuairand dataset.
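
The v25.06 notes above mention custom CUDA ops for jagged tensor concat. For readers unfamiliar with the term, the following is a minimal pure-Python sketch of what a row-wise jagged concat computes, assuming the common "flat values plus offsets" layout. The function name `jagged_concat` and the `Jagged` alias are hypothetical and are not part of this repository's API; the actual ops run on the GPU and may differ in layout and semantics.

```python
from typing import List, Tuple

# Assumed layout: (flat values, row offsets of length batch_size + 1).
Jagged = Tuple[List[float], List[int]]


def jagged_concat(a: Jagged, b: Jagged) -> Jagged:
    """Concatenate two jagged tensors row by row (illustrative only)."""
    a_vals, a_offs = a
    b_vals, b_offs = b
    if len(a_offs) != len(b_offs):
        raise ValueError("both inputs must have the same batch size")

    out_vals: List[float] = []
    out_offs: List[int] = [0]
    for i in range(len(a_offs) - 1):
        # Row i of the output is row i of `a` followed by row i of `b`.
        out_vals.extend(a_vals[a_offs[i]:a_offs[i + 1]])
        out_vals.extend(b_vals[b_offs[i]:b_offs[i + 1]])
        out_offs.append(len(out_vals))
    return out_vals, out_offs


# Two samples: history lengths 2 and 1, candidate lengths 1 and 2.
history = ([1.0, 2.0, 3.0], [0, 2, 3])
candidates = ([10.0, 20.0, 30.0], [0, 1, 3])
merged_values, merged_offsets = jagged_concat(history, candidates)
```

Because each sample keeps its own length, no padding is introduced; a fused GPU kernel would perform the same per-row copy in parallel rather than in a Python loop.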
Supported examples:
Please see our contributing guidelines for details on how to contribute to this project.
Join our community channels to ask questions, provide feedback, and interact with other users and developers:
- GitHub Issues: For bug reports and feature requests
- NVIDIA Developer Forums
If you use RecSys Examples in your research, please cite:
@Manual{,
title = {RecSys Examples: A collection of recommender system implementations},
author = {NVIDIA Corporation},
year = {2024},
url = {https://github.com/NVIDIA/recsys-examples},
}
For more citation information and referenced papers, see CITATION.md.
This project is licensed under the Apache License - see the LICENSE file for details.