Added a NTXent loss #897
Conversation
This is the NTXent loss from the SimCLR paper. I have done some minor testing on it, and it reproduces both the loss and gradient values of pytorch_metric_learning's NTXentLoss. Let me know if you want me to add a test function as well. I was unsure whether I should put it in classification, because it is really a self-supervised loss, so let me know if you want it somewhere else!
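For context, a minimal usage sketch of the kind of call this PR adds; the `ntxent` name, its placement under `optax`, and the default temperature are assumptions for illustration, not a fixed API:

```python
import jax
import jax.numpy as jnp
import optax

key = jax.random.PRNGKey(0)
# e.g. projected embeddings of two augmented views of 4 images, stacked.
embeddings = jax.random.normal(key, (8, 16))
# Equal labels mark positive pairs (each image and its augmentation share a label).
labels = jnp.array([0, 0, 1, 1, 2, 2, 3, 3])

# Assumed signature for the loss proposed in this PR.
loss = optax.ntxent(embeddings, labels)
print(loss)  # a scalar; lower means positives are closer than negatives
```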
Got rid of test.sh errors
test for ntxent loss
changed the way cosine_similarity is imported.
fixed the output based on the default temperature scaling.
Made the function jittable
simpler ntxent test. renamed the class to fix typo.
changed == True to == 1
Oh also, I know NTXent isn't really a classification loss, but I didn't know where else to put it.
Thank you @GrantMcConachie!
This would be a great addition. I left you some comments. In addition to those comments:

- you may create a file `_contrastive.py` where you would put this loss (so don't put it in `_classification.py`),
- you will need to add the loss to the `__init__.py` file of the losses folder,
- you will also need to add it in the docs (in `docs/api/losses.rst`).

(A sketch of these additions follows below.)
Ping me if you have any difficulties and thank you again!
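A sketch of what these additions might look like; the package layout and import line are assumed from the comment above, not copied from the repository:

```python
# optax/losses/__init__.py (assumed path) -- re-export the new loss
from optax.losses._contrastive import ntxent
```

The docs change would then add the new symbol to `docs/api/losses.rst` alongside the existing loss entries.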
Hello @GrantMcConachie! Yes, it's a bug due to a new release of jax; it has already been pointed out in #904. I've reported the bug upstream to the jax team, and I hope it can get solved. I don't see any quick hotfix (the bug likely affects many of our modules). I can ping you when it gets solved :)
The tests have been fixed in #908, so you should be able to proceed. Again, it would be really nice if you could avoid passing through an exponential without careful tricks like the log-sum-exp trick (see https://en.wikipedia.org/wiki/LogSumExp). Alternatively, you may directly use functions like logsumexp or log_softmax.
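For reference, a small sketch of the log-sum-exp trick in JAX; the values are illustrative:

```python
import jax.numpy as jnp
from jax.scipy.special import logsumexp

x = jnp.array([1000.0, 1001.0, 1002.0])

# Naive log(sum(exp(x))) overflows to inf in float32.
naive = jnp.log(jnp.sum(jnp.exp(x)))

# Log-sum-exp trick: shift by the max before exponentiating.
m = jnp.max(x)
stable = m + jnp.log(jnp.sum(jnp.exp(x - m)))  # ~1002.41

# jax.scipy.special.logsumexp applies the same shift internally.
assert jnp.allclose(stable, logsumexp(x))
```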
I am still working on this! It is trickier to implement than I thought due to the partitioning of positive and negative pairs. For example, if there are 4 embeddings where pairs of them share a label, the cosine similarity matrix will be a 4x4 matrix whose off-diagonal entries between same-label embeddings are the positive pairs. So for each anchor we want the numerator of the loss to cover only its positive pairs, while the denominator sums over all pairs except the anchor itself. I think there's probably a way around this using logsumexp or log_softmax, but I am still trying to figure out how to do it.
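To make the partitioning concrete, here is an illustrative sketch (not the code in this PR) that builds the positive-pair mask from the labels and lets `jax.nn.log_softmax` handle the exponential; the function name, masking convention, and temperature default are assumptions:

```python
import jax
import jax.numpy as jnp

def ntxent_sketch(embeddings, labels, temperature=0.07):
    """Illustrative NTXent-style loss: same label => positive pair, self-pairs excluded."""
    # Cosine similarities scaled by temperature.
    z = embeddings / jnp.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature                       # [n, n]

    n = labels.shape[0]
    self_mask = jnp.eye(n, dtype=bool)
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask

    # The denominator runs over every pair except the anchor itself,
    # so push self-similarities to a very negative value before the softmax.
    sim = jnp.where(self_mask, -1e9, sim)
    log_prob = jax.nn.log_softmax(sim, axis=1)          # no explicit exp => stable

    # Average -log p over each anchor's positive pairs, then over anchors.
    pos_count = jnp.maximum(jnp.sum(pos_mask, axis=1), 1)
    per_anchor = -jnp.sum(jnp.where(pos_mask, log_prob, 0.0), axis=1) / pos_count
    return jnp.mean(per_anchor)
```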
Hi @vroulet! I believe that I implemented the loss function using the same trick that logsumexp and log_softmax use. I did not use these functions explicitly, but I was able to "normalize" the cosine similarity values by subtracting the row-wise maximum cosine similarity from each value before exponentiating, summing, and taking the logarithm. I believe this is sufficient to avoid overflow/underflow problems, but please let me know if you see an issue with this!
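That row-wise max subtraction is the same stabilization `jax.nn.log_softmax` applies internally; a quick check with made-up similarity values:

```python
import jax
import jax.numpy as jnp

sim = jnp.array([[30.0, 50.0, 40.0],
                 [50.0, 20.0, 10.0]])  # e.g. cosine similarities / temperature

# Subtract the row-wise max before exponentiating, summing, and taking the log ...
m = jnp.max(sim, axis=1, keepdims=True)
manual = (sim - m) - jnp.log(jnp.sum(jnp.exp(sim - m), axis=1, keepdims=True))

# ... which matches jax.nn.log_softmax, since it applies the same shift internally.
assert jnp.allclose(manual, jax.nn.log_softmax(sim, axis=1))
```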
Perfect! Thank you for pushing this through!
I'm leaving last small comments and we'll merge :)
Also, you'll need to add the loss to the main `__init__.py` file (otherwise it won't be caught in the doc); for an example, see the definition of e.g. `convex_kl_divergence` in `__init__.py` and its appearance in the definition of `__all__` in the same `__init__.py` file.
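As a sketch of that pattern (the exact import lines in `optax/__init__.py` are assumptions; `convex_kl_divergence` is just the existing example referenced above):

```python
# optax/__init__.py (sketch; only the relevant lines)
from optax.losses import convex_kl_divergence  # existing export to mirror
from optax.losses import ntxent                # the new loss from this PR

__all__ = (
    # ... other exports ...
    "convex_kl_divergence",
    "ntxent",
    # ... other exports ...
)
```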
Thanks for all the edits! I am happy that this will get added. Please let me know if there's more I can do!
Thank you again @GrantMcConachie!
A normalized temperature-scaled cross-entropy (NTXent) loss for a contrastive learning objective. I am fairly new to submitting pull requests to public repos, so I didn't add a ton of tests for this outside of a batched test. Let me know if there is anything else I should add!
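If more coverage is wanted, one minimal extra check could be a jit-consistency test; the import path and function name here are hypothetical:

```python
import jax
import jax.numpy as jnp

def test_ntxent_jit_matches_eager():
    from optax.losses import ntxent  # hypothetical import path

    key = jax.random.PRNGKey(0)
    embeddings = jax.random.normal(key, (8, 16))
    labels = jnp.array([0, 0, 1, 1, 2, 2, 3, 3])

    eager = ntxent(embeddings, labels)
    jitted = jax.jit(ntxent)(embeddings, labels)

    assert jnp.isfinite(eager)
    assert jnp.allclose(eager, jitted, rtol=1e-5)
```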