Hi @happierpig, I noticed your PR fixes integer overflow in FA2 #1290.
However, in my test with FA3 I set seqlen=49152, kv_head=32, block_size=128, num_block = seqlen // block_size = 384, and backend='fa3' (a uniform block-sparse case), and it easily triggers the buffer size error:

    raise ValueError(
        "_vector_sparse_indices_buffer is not large enough. Please increase the buffer size."
    )

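
For a rough sense of scale, here is a back-of-the-envelope sketch of how many index entries this configuration might need. It rests on an assumption I have not confirmed against the implementation: that each nonzero block is expanded into `block_size` per-token index entries for every KV head. The helper name and the `density` parameter are hypothetical and only for illustration.

```python
def estimate_vector_sparse_entries(seqlen, block_size, kv_heads, density=1.0):
    """Rough estimate of index entries needed (assumption, not the library's formula).

    density is the hypothetical fraction of nonzero blocks in the
    num_blocks x num_blocks block mask (1.0 = every block is kept).
    """
    num_blocks = seqlen // block_size
    # Nonzero blocks per head in the block mask.
    nnz_blocks = int(num_blocks * num_blocks * density)
    # Assumed expansion: one entry per token column in each nonzero block, per KV head.
    entries = nnz_blocks * block_size * kv_heads
    return num_blocks, entries


num_blocks, entries = estimate_vector_sparse_entries(
    seqlen=49152, block_size=128, kv_heads=32
)
print(num_blocks)   # 384
print(entries)      # 603979776 entries in the fully dense worst case
print(entries * 4)  # bytes if each entry is a 32-bit integer
```

If that assumption roughly holds, the required size grows quadratically with the number of blocks, so a fixed-size buffer would be outgrown quickly at long sequence lengths.
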
Is there a more robust way to handle a larger number of blocks when sizing _vector_sparse_indices_buffer? Thanks!🌻