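The problem has been traced to using int64_t for index computation, which involves a large number of division and modulo operations. The fix is to dispatch: based on elem_cnt, route to either an int32 or an int64 branch.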
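A minimal sketch of what such a dispatch could look like, written as host-side C++ so it runs anywhere. This is not the project's actual code: `SumColumnIndices`, `FitsInInt32`, and `DispatchSumColumnIndices` are hypothetical names, and the loop body only stands in for a kernel that decomposes a flat offset with division and modulo. The point is that the index type becomes a template parameter, and the caller picks int32_t whenever elem_cnt fits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a kernel body: split each flat offset into
// (row, col) and accumulate the column index. The division and the
// remainder are both performed in IndexT, so the int32_t instantiation
// avoids 64-bit integer division entirely.
template <typename IndexT>
int64_t SumColumnIndices(int64_t elem_cnt, int64_t num_cols) {
  const IndexT n = static_cast<IndexT>(elem_cnt);
  const IndexT cols = static_cast<IndexT>(num_cols);
  int64_t sum = 0;
  for (IndexT i = 0; i < n; ++i) {
    const IndexT row = i / cols;
    const IndexT col = i - row * cols;  // same value as i % cols
    sum += col;
  }
  return sum;
}

// True when the value can be represented as a non-negative int32_t.
inline bool FitsInInt32(int64_t value) {
  return value >= 0 && value <= std::numeric_limits<int32_t>::max();
}

// Dispatch on elem_cnt: use the int32_t branch when every index fits,
// otherwise fall back to the int64_t branch.
inline int64_t DispatchSumColumnIndices(int64_t elem_cnt, int64_t num_cols) {
  assert(elem_cnt >= 0 && num_cols > 0);
  if (FitsInInt32(elem_cnt) && FitsInInt32(num_cols)) {
    return SumColumnIndices<int32_t>(elem_cnt, num_cols);
  }
  return SumColumnIndices<int64_t>(elem_cnt, num_cols);
}
```

Both branches return the same result for any input that fits in int32_t; only the width of the index arithmetic changes. In a real CUDA kernel the same template parameter would be applied to the loop index and every derived offset.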
It seems most kernels never need int64_t indices? They usually derive everything from the int32_t index in CUDA_1D_KERNEL_LOOP.
Right, unless there is a special case, just use int32 directly.