distributed-embeddings is a library for building large embedding-based models (e.g. recommenders) in TensorFlow 2. It provides a scalable model-parallel wrapper that automatically distributes embedding tables across multiple GPUs, as well as efficient embedding operations that cover and extend TensorFlow's embedding functionality.
distributed_embeddings.dist_model_parallel
is a tool that enables model-parallel training by changing only three lines of your script. It can also be combined with data parallelism to form hybrid-parallel training. Users can easily experiment with embeddings beyond a single GPU's memory capacity, without writing complex code to handle cross-worker communication.
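To illustrate the idea behind distributing tables across workers, here is a minimal sketch in plain Python. It is not the library's API or its actual placement algorithm; `assign_tables` is a hypothetical helper showing one simple strategy (greedy load balancing by table memory footprint) for table-wise model parallelism.

```python
def assign_tables(table_sizes, num_workers):
    """Greedily assign each embedding table to the worker with the
    smallest total memory load so far.

    table_sizes: list of (num_rows, embedding_dim) pairs.
    Returns a list giving the worker index for each table.
    """
    loads = [0] * num_workers          # accumulated elements per worker
    assignment = []
    for rows, dim in table_sizes:
        worker = loads.index(min(loads))  # least-loaded worker
        assignment.append(worker)
        loads[worker] += rows * dim
    return assignment

# Four tables of very different sizes, two workers:
tables = [(1000, 16), (50000, 32), (200, 8), (80000, 64)]
print(assign_tables(tables, 2))  # → [0, 1, 0, 0]
```

In the real library, this kind of placement decision (and the resulting cross-GPU communication during lookup) is handled automatically by the model-parallel wrapper, which is why only a few lines of user code need to change.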
distributed_embeddings.Embedding
combines the functionality of tf.keras.layers.Embedding
and tf.nn.embedding_lookup_sparse
under a unified Keras layer API. The backend is designed for high GPU efficiency.
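To make the "unified" behavior concrete, here is a plain-Python sketch of the two lookup modes such a layer covers: a dense lookup (one id per slot, as in tf.keras.layers.Embedding) and a ragged lookup with a combiner (several ids reduced to one vector, as in tf.nn.embedding_lookup_sparse). The function name and signature are illustrative, not the library's API, and the real layer runs on GPU tensors rather than Python lists.

```python
def embedding_lookup(table, ids, combiner=None):
    """Dense lookup when `ids` is a flat list of row indices; ragged
    lookup plus reduction when `ids` is a list of lists and a
    combiner ("sum" or "mean") is given."""
    if combiner is None:
        # Dense mode: one embedding vector per id.
        return [table[i] for i in ids]
    out = []
    for row in ids:
        vecs = [table[i] for i in row]
        # Element-wise sum of the looked-up vectors.
        reduced = [sum(col) for col in zip(*vecs)]
        if combiner == "mean":
            reduced = [v / len(row) for v in reduced]
        out.append(reduced)
    return out

table = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
print(embedding_lookup(table, [2, 0]))              # → [[4.0, 5.0], [0.0, 1.0]]
print(embedding_lookup(table, [[0, 1], [2]], "sum"))  # → [[2.0, 4.0], [4.0, 5.0]]
```

Exposing both modes behind one layer means a model can mix single-id and multi-id (multi-hot) categorical features without switching APIs.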
See the User Guide for more details.
Python 3, CUDA 11 or newer, TensorFlow 2
You can build inside the 22.03 or later NGC TF2 container image:
docker pull nvcr.io/nvidia/tensorflow:22.03-tf2-py3
After cloning this repository, run:
make pip_pkg && pip install artifacts/*.whl
Test installation with:
python -c "import distributed_embeddings"
You can also run the Synthetic and DLRM examples.
If you'd like to contribute to the library directly, see CONTRIBUTING.md. We're particularly interested in contributions to, or feature requests for, our feature engineering and preprocessing operations. To further advance our Merlin roadmap, we encourage you to share the details of your recommender system pipeline in this survey.
If you're interested in learning more about how distributed-embeddings works, see the documentation.