Rebase index_copy fix #62

Open
wants to merge 4 commits into base: optimize_spmd_sharding

Conversation

wonjoolee95
Collaborator

Rebase @bhavya01's index_copy fix from the llama2-google-next-inference branch.

@JackCaoG

@wonjoolee95 can you do a run and see if this change has any performance implications?

@wonjoolee95
Collaborator Author

This rebase seems to give me a "buffer with shape bf16[1,2048,32,128] on device SPMD:0 is null" error (full paste: https://gist.github.com/wonjoolee95/73b1590e9432eabe39697708f8b2da71).

And when we switch to openxla (instead of openxla_eval), performance seems significantly worse.

We can tackle the SPMD inference work and this issue separately. For now, we can either keep using the existing out-of-place .index_copy and manually set the dynamo_cache_hit, or continue using the 05012024 wheels.
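
For reference, here is a minimal sketch of the difference between the out-of-place .index_copy and the in-place .index_copy_ mentioned above. It uses plain PyTorch on CPU, not torch_xla, and the tensor names and shapes are hypothetical, chosen only for illustration. It is not the code from this PR.

```python
import torch

# Hypothetical small "cache"; the shape is illustrative only.
cache = torch.zeros(4, 2)
index = torch.tensor([1, 3])   # rows to overwrite along dim 0
update = torch.ones(2, 2)      # new values for those rows

# Out-of-place: returns a new tensor and leaves `cache` untouched.
new_cache = cache.index_copy(0, index, update)

# In-place: writes into the existing storage and returns the same tensor.
inplace_cache = cache.clone()
returned = inplace_cache.index_copy_(0, index, update)

print(cache.sum().item())                  # original is unchanged
print(torch.equal(new_cache, inplace_cache))
```

Both variants produce the same values. The difference is whether the existing buffer is mutated or a new one is allocated, which is what matters for the buffer and performance behavior discussed here.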

cc @bhavya01

3 participants