Question regarding what thread_reverse_data[i].x in "selective_scan_bwd_kernel.cuh" file contains #598

Open
SudhanshuBokade opened this issue Oct 19, 2024 · 3 comments


SudhanshuBokade commented Oct 19, 2024

At line 285, the code sets dx = thread_reverse_data[i].y:

const float dx = thread_reverse_data[i].y;

But according to my calculation of the gradient, dx should be dout * C. That also seems to match how thread_reverse_data[i].y is set in the file at lines 260-263:

https://github.com/state-spaces/mamba/blob/bc84fb1172e6dea04a7dc402118ed19985349e95/csrc/selective_scan/selective_scan_bwd_kernel.cuh#L260C8-L264C8

However, a reverse scan is applied to thread_reverse_data after that. So does thread_reverse_data[i].y still contain dy * C after the reverse scan?

Thank you very much for your help

Collaborator

tridao commented Oct 21, 2024

dx in the code might actually be the gradient wrt the hidden states, so maybe dh is a better variable name.
At some point while writing up the paper, we changed the notation.

@SudhanshuBokade
Author

Sorry, I was not clear. I understand that dx is the gradient wrt the hidden states. My question is whether thread_reverse_data[i].y still contains dy * C after the reverse scan. It is initialized to dy * C in these lines:

thread_reverse_data[i].y = dout_vals[i] *
    (!kIsVariableC
     ? (!kIsVariableB ? B_val * C_val : C_val)
     : (!kIsVariableB ? B_val * C_vals[i] : C_vals[i]));
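
To make the question concrete, here is a minimal host-side C++ sketch of what an inclusive reverse scan would do to pairs like these. It assumes the scan operator composes affine maps g -> x * g + y, which is the usual operator for a linear recurrence; whether the kernel's operator has exactly this form is an assumption. `Pair`, `combine`, and `reverse_scan` are hypothetical names for illustration, not the kernel's code. If that assumption holds, .y starts as dy * C but ends up as a value accumulated over all later steps, and only the last element keeps its initial dy * C.

```cpp
#include <vector>

// Hypothetical stand-in for the kernel's float2 scan element (assumption):
// x is a multiplicative factor, y is the additive term (initialized to dy * C).
struct Pair {
    float x;
    float y;
};

// Assumed combine rule for a linear recurrence. Each pair represents the map
// g -> x * g + y. Applying `later` first and `earlier` second gives
// earlier.x * (later.x * g + later.y) + earlier.y.
Pair combine(const Pair& earlier, const Pair& later) {
    return { earlier.x * later.x, earlier.x * later.y + earlier.y };
}

// Inclusive reverse scan, written sequentially for clarity (a GPU kernel
// would do this in parallel). Afterwards:
//   out[i].y = in[i].y + in[i].x * out[i + 1].y
std::vector<Pair> reverse_scan(std::vector<Pair> data) {
    for (int i = static_cast<int>(data.size()) - 2; i >= 0; --i) {
        data[i] = combine(data[i], data[i + 1]);
    }
    return data;
}
```

For example, with x = 0.5 for every element and initial y values {1, 2, 4}, the scanned y values are {3, 4, 4}: each entry adds the decayed contribution of everything after it.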

Author

SudhanshuBokade commented Oct 23, 2024

Thanks for the help @tridao

Can you please also take a look at this question?

#598 (comment)
