Add `torch.logcumsumexp` #36308

kshitij12345 · 2020-04-09T11:30:33Z

Creating a new PR because I don't have permission to push to @pandeykartikey's branch.

Closes #26411

Based on #32876. Thanks @pandeykartikey for starting this.

I have addressed the comments.

@anjali411 @agadetsky @albanD

dr-ci · 2020-04-09T11:31:32Z

💊 CI failures summary and remediations

As of commit c0ee7ab (more details on the Dr. CI page):

1/3 failures possibly* introduced in this PR
- 1/1 non-CircleCI failure(s)
1/3 tentatively recognized as flaky ❄️
1/3 broken upstream at merge base f3b5c22 since May 19

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

pytorch_libtorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test (1/1)

Step: "Set Up CI Environment After attach_workspace" ❄️

E: Failed to fetch https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64/Packages 404 Not Found

Fetched 5,129 kB in 3s (1,652 kB/s)
Reading package lists... Done
W: The repository 'https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64  Release' does not have a Release file. 
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use. 
N: See apt-secure(8) manpage for repository creation and user configuration details. 
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://cli-assets.heroku.com/apt ./ InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 5DC22404A6F9F1CA 
W: The repository 'https://packagecloud.io/circleci/trusty/ubuntu xenial Release' does not have a Release file. 
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use. 
N: See apt-secure(8) manpage for repository creation and user configuration details. 
W: Failed to fetch https://cli-assets.heroku.com/apt/./InRelease  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 5DC22404A6F9F1CA 
E: Failed to fetch https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64/Packages  404  Not Found 
W: Some index files failed to download. They have been ignored, or old ones used instead.

🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet:

pytorch_macos_10_13_py3_test since May 19

ci.pytorch.org: 1 failed

Failed: pr/py3.6-clang7-rocmdeb-ubuntu16.04

This comment was automatically generated by Dr. CI.

This comment has been revised 100 times.

tridao · 2020-04-10T03:05:27Z

I don't think the current implementation is numerically stable. The original issue #26411 suggests using cummax to reduce numerical error, but that has quadratic complexity (see the discussion in #32876).
I still think it's better to implement it the way TensorFlow does: as a prefix scan of the binary operation log_add_exp.
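
To illustrate the stability concern, here is a minimal pure-Python sketch of the literal definition `log(cumsum(exp(x)))`. The name `naive_logcumsumexp` is hypothetical and this is not the code under review in this PR; it only shows how the direct formula breaks down for large-magnitude inputs.

```python
import math

def naive_logcumsumexp(xs):
    """Literal log(cumsum(exp(x))) over a list of floats (illustration only)."""
    out, running = [], 0.0
    for x in xs:
        # math.exp raises OverflowError for large x and returns 0.0 for very negative x.
        running += math.exp(x)
        # math.log(0.0) raises ValueError, so an underflowed running sum also fails.
        out.append(math.log(running))
    return out

print(naive_logcumsumexp([0.0, 0.0]))  # fine for moderate inputs
```

With floating-point tensors the same failure would typically show up as `inf` or `-inf` in the result rather than as an exception.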

kshitij12345 · 2020-04-10T19:57:45Z

@anjali411 @albanD Could you please review the PR? Also, I don't have the rights to re-run the failed pipeline (its failure is unrelated to the PR).

@tridao I get the point, but I am not very familiar with the codebase. For reference, I looked at the cumsum implementations in https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/ReduceOpsKernel.cpp
and https://github.com/pytorch/pytorch/blob/master/aten/src/THC/generic/THCTensorMathScan.cu (the CUDA implementation, which I don't think should be used as a reference).
It would be great if I could get some pointers.

tridao · 2020-04-10T20:30:24Z

Agreed, those seem to be where cumsum is implemented.
I imagine one just needs to replace the addition (x + y) in cumsum with the operation log_add_exp(x, y), implemented in a stable way with max and min (e.g. https://www.tensorflow.org/api_docs/python/tf/math/cumulative_logsumexp).

Re: the CUDA implementation in THC: maybe cumsum will eventually be ported from THC to ATen? I also found the cummax implementation in ATen (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/ReduceOpsKernel.cu). One would replace max(x, y) with log_add_exp. The cummax implementation also needs the indices, but I don't think logcumsumexp will need those.
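
The idea in this comment can be sketched in plain Python as a sequential scan. This is only an illustration on lists of floats, not the ATen kernel; `log_add_exp` and `logcumsumexp` here are stand-in names, and inputs are assumed to be finite or `-inf`.

```python
import math

def log_add_exp(x, y):
    """Stable log(exp(x) + exp(y)) using the max and min of the two arguments."""
    hi, lo = max(x, y), min(x, y)
    if math.isinf(hi):
        # Both are -inf (log of zero), or the larger one is +inf.
        return hi
    # lo - hi <= 0, so exp() cannot overflow.
    return hi + math.log1p(math.exp(lo - hi))

def logcumsumexp(xs):
    """Cumulative scan with log_add_exp in place of cumsum's addition."""
    out, acc = [], -math.inf  # -inf is the identity element of log_add_exp
    for x in xs:
        acc = log_add_exp(acc, x)
        out.append(acc)
    return out

print(logcumsumexp([1000.0, 1000.0]))  # no overflow, unlike exp(1000.0)
```

Because `log_add_exp` is associative (up to rounding), the same operation could in principle be used in a parallel prefix scan; the loop above is just the simplest sequential form.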

kshitij12345 · 2020-05-19T17:51:12Z

@ngimel @albanD Gentle ping :).

albanD

Looks good to me.
The perf for the backward might not be great but it is better than nothing.
Feel free to open a new issue if you think the backward should be implemented with a special kernel to make it more efficient. But I think this should be left for a future PR anyway.
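
For readers following the backward discussion: with `out_j = log(sum_{k<=j} exp(x_k))`, the derivative is `d out_j / d x_i = exp(x_i - out_j)` for `i <= j` and zero otherwise. The sketch below evaluates that formula directly in pure Python. It is not taken from this PR's `Functions.cpp`, the function names are hypothetical, inputs are assumed finite, and the double loop is quadratic in the sequence length.

```python
import math

def logcumsumexp(xs):
    """Sequential stable scan, used here only to produce the forward values."""
    out, acc = [], -math.inf
    for x in xs:
        hi, lo = max(acc, x), min(acc, x)
        acc = hi if math.isinf(hi) else hi + math.log1p(math.exp(lo - hi))
        out.append(acc)
    return out

def logcumsumexp_backward(grad_out, xs):
    """grad_x[i] = sum over j >= i of grad_out[j] * exp(xs[i] - out[j])."""
    out = logcumsumexp(xs)
    n = len(xs)
    # out[j] >= xs[i] whenever j >= i, so every exponent is <= 0 and cannot overflow.
    return [
        sum(grad_out[j] * math.exp(xs[i] - out[j]) for j in range(i, n))
        for i in range(n)
    ]

print(logcumsumexp_backward([0.0, 0.0, 1.0], [0.3, -1.2, 2.0]))
```

When only the last output receives gradient, as in the call above, the result is the softmax of the inputs.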

facebook-github-bot

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

kshitij12345 · 2020-05-20T20:29:43Z

@albanD
It would be great if this landed in master soon, as it would potentially unblock #36458.
Thank you.

albanD · 2020-05-20T20:31:32Z

The landing is in progress, but some internal tests are flaky, so I had to re-run them.

facebook-github-bot · 2020-05-21T18:16:36Z

@albanD merged this pull request in 3487744.

agadetsky · 2020-05-21T18:22:55Z

@kshitij12345 thank you very much, great work!

agadetsky · 2020-05-21T18:56:25Z

@kshitij12345 it seems the documentation is a bit broken. At least, the computation formula does not currently appear in the master docs (https://pytorch.org/docs/master/generated/torch.logcumsumexp.html#torch.logcumsumexp).

kshitij12345 · 2020-05-21T19:00:04Z

@agadetsky Thanks for letting me know. I will try to get it fixed.

Summary:
References: #24521 #24522 #24547 #24548 #24507
Depends on #36308
Changes related to this PR are only in these files:
aten/src/ATen/Declarations.cwrap
aten/src/ATen/native/cuda/ReduceOpsKernel.cu
aten/src/ATen/native/native_functions.yaml
aten/src/THC/generic/THCTensorMathScan.cu
aten/src/THC/generic/THCTensorMathScan.h
Please Review VitalyFedyunin
Thanks.
Pull Request resolved: #36458
Differential Revision: D21718384
Pulled By: ngimel
fbshipit-source-id: 5af15164050c77be164397abd659a48c9ded2b29

Summary:
Reference: #36308 (comment)
After fix:
![Screenshot from 2020-05-23 15-35-09](https://user-images.githubusercontent.com/19503980/82727956-4bcabb80-9d0b-11ea-85a8-81b35012abbc.png)
Pull Request resolved: #38952
Differential Revision: D21722196
Pulled By: ezyang
fbshipit-source-id: 62b08c14e0ce9603133841940627df40d7b1e861

pandeykartikey and others added 14 commits February 1, 2020 02:17

Adds cumlogsumexp

cb6368b

Adds backprop and tests

bcecdd9

Fixes cumlogsumexp test

91f2b8d

Makes test for cumlogsumexp more extensive

fc2b508

Fixes autograd function

e3fe7b9

Rename cumlogsumexp to logcumsumexp

4cc31d6

Merge branch 'master' of https://github.com/pytorch/pytorch into add-…

a2de6c5

…logcumsumexp

use inplace variants

5cad39a

update gradient compute

965edad

update test_torch

2b290f5

update test_namedtensor

f7d4761

fix indentation

d3bee4f

address comments: fix doc formula

a843b80

add autograd test

503cfb4

pytorchbot added the open source label Apr 9, 2020

kshitij12345 added 4 commits April 9, 2020 17:07

make flake8 happy

96f0230

update grad computation

5c503a8

update forward compute

1d7802d

log-sum-exp trick doesn't seem to be working. The gradient check doesn't pass with log-sum-exp.

update common method invocation args

1ba77ec

zhangguanheng66 requested a review from albanD April 9, 2020 21:25

zhangguanheng66 added the module: internals and triaged labels Apr 9, 2020

kshitij12345 added 2 commits April 10, 2020 03:25

update python definition logcumsumexp

2ff530e

update tensor_op_tests args

b94cf32

add entry to _overrides

005e688

Merge branch 'master' of https://github.com/pytorch/pytorch into add-…

2a50da8

…logcumsumexp

albanD reviewed May 12, 2020

aten/src/ATen/native/cuda/ScanKernels.cu

kshitij12345 added 2 commits May 13, 2020 16:52

address comments

62d605b

* Add TODO about code duplication. * Fix a comment.

add stable gradient computation and relevant test

00642a2

kshitij12345 commented May 15, 2020

tools/autograd/templates/Functions.cpp

kshitij12345 added 3 commits May 15, 2020 22:40

Merge branch 'master' of https://github.com/pytorch/pytorch into add-…

8c1c794

…logcumsumexp

fix incorrect tensor init

f82b050

fix incorrect argument

977dcce

agadetsky mentioned this pull request May 18, 2020

Add Plackett-Luce distribution #38684

Open

Merge branch latest 'master' into add-logcumsumexp

c0ee7ab

albanD approved these changes May 19, 2020

facebook-github-bot reviewed May 19, 2020

facebook-github-bot closed this in 3487744 May 21, 2020

facebook-github-bot added the merged label May 21, 2020

kshitij12345 deleted the add-logcumsumexp branch May 21, 2020 18:58

kshitij12345 mentioned this pull request May 23, 2020

[docs] fix formula torch.logcumsumexp #38952

Closed

mruberry added the Merged label Oct 28, 2020


11 participants
