EmbeddingBag op and layer #2352


Merged
merged 66 commits into tensorflow:master on May 28, 2021

Conversation

Rocketknight1
Contributor

Description

Brief Description of the PR:
This is a PR for the EmbeddingBag op. Please don't merge it yet! Although it works, testing is incomplete and the file structure needs to be cleaned up. I'm opening it now just to get some initial feedback. I'll keep working on several of the issues below (particularly 1, 3, 4, and 6), but I'll need feedback on 2 and 5, plus any other feedback you have on the rest!

Fixes #2201

Type of change

New layer and associated C++/CUDA op

Comments

There are a few issues that need to be resolved before I'd feel comfortable with this being merged. In no particular order, they are:

  1. The CUDA/C++ code is split with the forward and backward passes in separate files, which is not how other TensorFlow or Addons ops do it. This is just a style thing; I'll merge them soon.

  2. There are really two different entrypoints for users here: the function/op (analogous to tf.gather) and the layer (analogous to tf.keras.layers.Embedding). Like Embedding, the layer instantiates its own embeddings tensor and expects to be passed only indices and weights, whereas the function needs to be passed the embeddings as well. Following PyTorch's naming conventions, I called the op embeddingbag and the layer EmbeddingBag, but this is almost certainly not what you want. What is the right way to name these two? Should I make the function/op a stateless Layer rather than just a function? (See the sketch after this list for how the two entrypoints differ.)

  3. No support for float16/bfloat16 yet.

  4. Because context->AllocateTemp consistently segfaulted for me when I was compiling in the custom-op repo, I used AllocateOutput to make some dummy outputs and then just used them as temp arrays. Compiling in tensorflow_addons itself seems much more stable, but I still need to go back and switch those allocations to AllocateTemp properly.

  5. The CUDA/C++ ops expect a weight tensor. When no weights are passed, the Python wrapper instantiates dummy weights with tf.ones_like(). Is this acceptable?

  6. More tests! I don't have any gradient tests at all yet, and I should probably add additional tests with weird shapes.
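
A minimal sketch of the two entrypoints from item 2, assuming the PyTorch-style names used in this PR; the module path, binding names, and signatures here are illustrative, not a final API:

```python
import tensorflow as tf
import tensorflow_addons as tfa  # assumes the op ships in Addons

indices = tf.constant([[3, 1, 4], [1, 5, 9]])

# Function/op, analogous to tf.gather: the caller supplies the embedding
# table (and optionally per-index weights) explicitly.
params = tf.random.normal([16, 8])
pooled = tfa.layers.embedding_bag(indices, params)  # hypothetical binding

# Layer, analogous to tf.keras.layers.Embedding: it owns its own embeddings
# variable and is called with indices (and optional weights) only.
layer = tfa.layers.EmbeddingBag(input_dim=16, output_dim=8)
pooled = layer(indices)
```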

@google-cla

google-cla bot commented Jan 18, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



@Rocketknight1
Contributor Author

@googlebot I signed it!

@google-cla google-cla bot added cla: yes and removed cla: no labels Jan 18, 2021
@bhack
Contributor

bhack commented Jan 18, 2021

Here are some other examples of ops with both custom and alternative Python implementations (now Python-only on master): #1114 (comment)

@Rocketknight1
Contributor Author

Sure, I'll add that soon while I'm fixing everything else up.

@WindQAQ
Member

Hi @Rocketknight1, thanks for the contribution. This is a huge PR. I have a suggestion: I can help you with the C++ CPU implementation (with threading and vectorization) and the Python interface, which should reduce back-and-forth during review. If that works for you, I'll work on your branch and comment out the GPU build (I'll also modify the style and the inputs/outputs), or I can file another PR for the CPU ops. Thank you!

The CUDA/C++ ops expect a weight tensor. When no weights are passed, the Python wrapper instantiates dummy weights with tf.ones_like(). Is this acceptable?

It is acceptable. The TensorFlow op registration mini-language does not support optional inputs. The workarounds are:

  1. Manipulate the inputs in Python (as you do now); see the sketch below.
  2. Or pass the input as a list of Tensors; that way, you can check the length of the list to identify whether the optional Tensor is present.
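
A minimal sketch of workaround 1, assuming a hypothetical `_embedding_bag_op` handle to the compiled custom op (the real binding name in this PR may differ):

```python
import tensorflow as tf

def embedding_bag(indices, params, weights=None, combiner="sum"):
    """Python wrapper that fills in the required weight input."""
    if weights is None:
        # The C++/CUDA op always expects a weight tensor; all-ones weights
        # make the weighted combination equal to the unweighted one.
        weights = tf.ones(tf.shape(indices), dtype=params.dtype)
    return _embedding_bag_op(indices, params, weights, combiner=combiner)
```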

@tanguycdls

tanguycdls commented Jan 19, 2021

Hi @Rocketknight1, thanks for your PR. We recently switched from Torch to TF, and EmbeddingBag was missing for us too!

In our use case we often work with lists of non-constant length, called ragged tensors in TF, which use a data format similar to a sparse CSR matrix:

https://www.tensorflow.org/guide/ragged_tensor

We have ragged tensors such as:

```python
import tensorflow as tf

offsets = [0, 0, 0, 1, 5, 7]  # row_splits: row i spans indices[offsets[i]:offsets[i + 1]]
indices = [12, 13, 14, 15, 16, 78, 16]
tf.RaggedTensor.from_row_splits(indices, offsets)
# <tf.RaggedTensor [[], [], [12], [13, 14, 15, 16], [78, 16]]>
```

For now we replace embedding bag by converting our ragged tensor to a SparseTensor (the x coordinate being the row and the y coordinate the position of the item within that row), with the values being the indices we want to gather. We also use a second SparseTensor that holds the weights instead of the indices.
We can then use tf.nn.safe_embedding_lookup_sparse and get better results than a simple gather-then-reduce; see the sketch below. I'm not very clear on the RAM usage of that one, but it does the embedding lookup on unique indices and then applies a gather to the result. (see)
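
Roughly, the conversion looks like this (an illustrative sketch with made-up shapes, not our exact pipeline code):

```python
import tensorflow as tf

ragged = tf.ragged.constant([[12], [13, 14, 15, 16], [78, 16]], dtype=tf.int64)
embeddings = tf.Variable(tf.random.normal([100, 8]))

sp_ids = ragged.to_sparse()  # SparseTensor whose values are embedding indices
pooled = tf.nn.safe_embedding_lookup_sparse(
    embedding_weights=embeddings,
    sparse_ids=sp_ids,
    sparse_weights=None,  # or a matching SparseTensor of per-index weights
    combiner="sum",
)  # shape [batch, embedding_dim]; empty rows come back as zeros
```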

Another workaround we found is to create a sparse indicator tensor: the x coordinate is the row of your batch, the y coordinate is the embedding index, and the values are the weights. You can then compute your embedding + sum as a sparse_dense_matmul between the sparse indicator and the embeddings matrix; see the sketch below. The sparse_dense_matmul itself is very fast; the cost is more in creating the sparse indicator. I'm not sure how that option behaves on memory, since the internals are handled by TF.
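
A rough sketch of the indicator construction (illustrative; duplicate indices within a row would need extra handling before the matmul):

```python
import tensorflow as tf

ragged = tf.ragged.constant([[12], [13, 14, 15, 16], [78, 16]], dtype=tf.int64)
embeddings = tf.random.normal([100, 8])
num_embeddings = tf.shape(embeddings, out_type=tf.int64)[0]

# Indicator: x = row in the batch, y = embedding index, values = weights
# (all ones here, which gives a plain sum).
rows = ragged.value_rowids()
cols = ragged.flat_values
indicator = tf.sparse.SparseTensor(
    indices=tf.stack([rows, cols], axis=1),
    values=tf.ones_like(cols, dtype=embeddings.dtype),
    dense_shape=[ragged.nrows(), num_embeddings],
)
indicator = tf.sparse.reorder(indicator)  # canonical order for the matmul
pooled = tf.sparse.sparse_dense_matmul(indicator, embeddings)  # [batch, dim]
```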

I did a few tests here:

https://colab.research.google.com/gist/tanguycdls/9c696097642844fc1e548c0cade48e11/sparseembeddings.ipynb

The performance depends a lot on the sparsity of the indices and the number of items per row in the ragged case.

I'll try to compile your branch to compare the performance of the sparse matmul against your EmbeddingBag! A few months ago we did a PyTorch vs TF benchmark, and embedding_bag was slightly better than the matmul in some cases.

I'd be happy to help with benchmarks if you need some for this PR!

@bhack
Contributor

bhack commented Jan 19, 2021

@tanguycdls Thanks. This seems interesting to explore.

@Rocketknight1
Contributor Author

Hi @tanguycdls, your workaround with sparse multiplications is really interesting! I'm also curious to know how it compares in terms of memory usage to the CUDA EmbeddingBag. Please note that my CPU implementations are not very optimized right now (as @WindQAQ pointed out), but the CUDA one should be at least reasonably performant.

@Rocketknight1
Contributor Author

Also, yes @WindQAQ, if you want to improve the CPU implementations, feel free to make those changes. I'm aware that they are currently 'reference' implementations rather than high-performance ones. Thank you!

@bhack
Contributor

bhack commented Jan 19, 2021

@tanguycdls It could also be interesting to benchmark against TF-nightly, as some sparse-tensor optimizations in the compiler stack are right at the development edge. See https://llvm.discourse.group/t/mlir-support-for-sparse-tensors/

@bhack
Contributor

bhack commented Jan 19, 2021

/cc @aartbik in case he is interested in the sparse code path.

@google-cla

google-cla bot commented Jan 19, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).


@google-cla google-cla bot added cla: no and removed cla: yes labels Jan 19, 2021
@Rocketknight1
Contributor Author

Hi @bhack, can you approve running the workflows? It's awkward for me to run all the tests locally, and it really helps if I can check quickly via CI!

bhack
bhack previously approved these changes May 7, 2021
@Rocketknight1 Rocketknight1 dismissed stale reviews from bhack and fsx950223 via aedb074 May 9, 2021 17:56
@Rocketknight1
Contributor Author

I have run into a problem: the code added by @WindQAQ to convert the parameter gradients to an IndexedSlices object does not seem to work in graph mode. I don't think this is a problem with the code (it looks good to me!), but the IndexedSlices gradient is returned as a dense tensor when I wrap _embedding_bag with tf.function().

I googled around and couldn't find a cause for this, but there's a suggestive issue here; it's possible this is because of a different TF bug that got monkey-patched: tensorflow/tensorflow#36236

Either way, I've edited the tests, and hopefully it should all work now.
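
A minimal sketch of the symptom, with illustrative shapes; `embedding_bag` here stands in for the addon's wrapper:

```python
import tensorflow as tf

params = tf.Variable(tf.random.normal([100, 8]))
indices = tf.constant([[0, 1, 2], [3, 4, 5]])

def grad_of_params():
    with tf.GradientTape() as tape:
        out = embedding_bag(indices, params)  # stand-in for the addon op
    return tape.gradient(out, params)

print(type(grad_of_params()))               # eager: tf.IndexedSlices
print(type(tf.function(grad_of_params)()))  # graph mode: comes back dense
```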

```python
sorted_unique_indices = tf.sort(unique_indices)
return [
    None,
    tf.IndexedSlices(
```
@Rocketknight1
Contributor Author
Good idea, I made the change!

@Rocketknight1
Contributor Author

@bhack I believe this PR should have fixed it! Can you approve running the tests?

@bhack
Contributor

bhack commented May 10, 2021

You have a lint error.

@Rocketknight1
Contributor Author

Fixed! I think.

@Rocketknight1
Contributor Author

We did it!

fsx950223
fsx950223 previously approved these changes May 10, 2021
@bhack
Contributor

bhack left a comment

@Rocketknight1
Contributor Author

@bhack Ah, sorry, just saw that. I've added myself!

@Rocketknight1
Contributor Author

@bhack Can we run tests? I think this is ready to merge now!

@Rocketknight1
Contributor Author

@bhack I made the requested change (adding myself to CODEOWNERS) so I think this is ready to merge!

@bhack
Contributor

bhack commented May 27, 2021

@fsx950223 Anything else?

@bhack bhack merged commit 3c662c6 into tensorflow:master May 28, 2021