@hugolatendresse
Contributor

@hugolatendresse hugolatendresse commented Mar 8, 2025

For large tensors with non-last dimension softmax, we transpose to move the softmax dimension to the end, apply softmax, and then transpose back to the original shape.
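The transpose, softmax, transpose-back idea can be illustrated outside TVM. The following is only a minimal pure-Python sketch on a 2-D nested list, not the Relax implementation from this PR; the helper names `softmax_last`, `transpose`, and `softmax_axis0` are hypothetical and chosen for illustration.

```python
import math


def softmax_last(rows):
    """Numerically stable softmax over the last axis of a 2-D nested list."""
    out = []
    for row in rows:
        m = max(row)  # subtract the row maximum to avoid overflow in exp
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def transpose(mat):
    """Swap the two axes of a 2-D nested list."""
    return [list(col) for col in zip(*mat)]


def softmax_axis0(mat):
    """Softmax over axis 0, expressed only through a last-axis softmax."""
    # 1. move the softmax axis to the end, 2. apply softmax, 3. move it back
    return transpose(softmax_last(transpose(mat)))


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
y = softmax_axis0(x)

# Each column of y should now sum to 1, and y keeps the shape of x.
col_sums = [sum(row[j] for row in y) for j in range(len(y[0]))]
```

The result has the same shape as the input; the cost of this approach is the two extra transposes.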

Since the issue here is that using softmax on a non-last dimension can cause python/tvm/dlight/gpu/general_reduction.py to create arrays that are too big for the GPU shared memory, I first tried to address this TODO by making changes to general_reduction.py, without success. However, while experimenting, I added suggested handling for the case where num_leading_s = 0 in general_reduction.py, and I thought I might as well leave that in the PR.
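To give a rough sense of why a large buffer can exceed shared memory, here is a back-of-the-envelope check. It is a sketch under stated assumptions: the 48 KiB per-block limit and the float32 element size are illustrative defaults, not values taken from general_reduction.py, and `fits_in_shared_memory` is a hypothetical helper.

```python
def fits_in_shared_memory(num_elements, bytes_per_element=4, limit_bytes=48 * 1024):
    """Return True if a buffer of num_elements fits within limit_bytes.

    Assumptions (illustrative only): float32 elements (4 bytes each) and a
    48 KiB per-block shared memory limit; real limits depend on the GPU.
    """
    return num_elements * bytes_per_element <= limit_bytes


# Largest float32 buffer that fits under the assumed limit.
max_float32_elements = (48 * 1024) // 4
```

Under these assumptions, a shared-memory buffer whose size scales with a large tensor dimension quickly exceeds the limit.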

cc: @MasterJH5574

Edit: we may not merge this at all because it's better to fix the reduction directly, and the fix in this PR may simply be extra overhead

@hugolatendresse hugolatendresse marked this pull request as ready for review March 8, 2025 21:21
@hugolatendresse hugolatendresse changed the title Allow softmax to work on a large tensor when dimension is not the last one [Relax] Allow softmax to work on a large tensor when dimension is not the last one Mar 8, 2025
@hugolatendresse hugolatendresse marked this pull request as draft March 10, 2025 19:04
@tqchen
Member

tqchen commented Mar 12, 2025

Let us instead work on making dlight handle non-last-dimension cases correctly.

@hugolatendresse
Contributor Author

Let us instead work on making dlight handle non-last-dimension cases correctly.

Sounds good, closing the PR

@MasterJH5574
Contributor

Fixed in #17754

@hugolatendresse hugolatendresse deleted the fix_softmax_not_last_dim branch May 4, 2025 19:46
3 participants