[sharktank] Add Perplexity pre-submit test #579
@@ -74,4 +74,4 @@ jobs:
iree-base-runtime

- name: Run perplexity test with vmfb
run: pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_iree_test.py --run-quick-llama-test --bs=5 --iree-device='hip://6' --iree-hip-target=gfx942 --iree-hal-target-backends=rocm --llama3-8b-f16-model-path=/data/llama3.1/8b/llama8b_f16.irpa --llama3-8b-tokenizer-path=/data/llama3.1/8b/tokenizer_config.json
Is there a reason we are targeting only device hip://6? Should the available runners decide which device is used?
Ideally, but the specific `iree_device` needs to be passed to the `vmfbRunner`. We might require another script/flag to determine a free device and pass that info dynamically. Let me know if it's worth looking into.
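A minimal sketch of what such a helper script could look like, assuming the set of busy device indices is already known from some other source (how to detect that is not covered in this thread). The names `pick_free_device` and `pytest_device_flag` are hypothetical and not part of sharktank or IREE; only the `hip://N` form and the `--iree-device` flag are taken from the command in the diff.

```python
from typing import Iterable, Optional


def pick_free_device(total_devices: int, busy: Iterable[int]) -> Optional[str]:
    """Return a HIP device URI for the lowest index not listed in `busy`.

    Hypothetical helper: the caller is assumed to know which indices are busy.
    """
    busy_set = set(busy)
    for index in range(total_devices):
        if index not in busy_set:
            return f"hip://{index}"
    return None  # every device is in use


def pytest_device_flag(total_devices: int, busy: Iterable[int]) -> str:
    """Build the --iree-device argument for the pytest command line."""
    device = pick_free_device(total_devices, busy)
    if device is None:
        raise RuntimeError("no free HIP device available")
    return f"--iree-device={device}"
```

The returned string could then be substituted for the hardcoded `--iree-device='hip://6'` argument in the workflow step.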
@saienduri have we set up ways to target specific devices on runners other than hardcoding like this?
Yeah, we do. We can specify `ROCR_VISIBLE_DEVICES` so that runners initialize with only certain GPUs. I can make sure the separation is in place on Monday to avoid conflicts.
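As a rough illustration of that approach, the sketch below builds a process environment in which only the chosen GPUs are visible. `ROCR_VISIBLE_DEVICES` is the ROCm runtime variable mentioned above; the helper name `runner_env` is hypothetical, and how the actual runners are configured is not shown in this thread.

```python
import os
from typing import Dict, Mapping, Optional, Sequence


def runner_env(
    gpu_ids: Sequence[int],
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment restricted to the given GPU indices.

    Hypothetical helper: sets ROCR_VISIBLE_DEVICES to a comma-separated list
    so a process started with this environment only sees those GPUs.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ROCR_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Example (not executed here): start the test with only GPU 6 visible.
# subprocess.run(["pytest", "sharktank/tests/evaluate/perplexity_iree_test.py"],
#                env=runner_env([6]))
```

With a single visible GPU, the process would likely enumerate it as the first device, so the `--iree-device` value may need to change accordingly; that is an assumption worth verifying on the runner.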
Add a perplexity pre-submit test for llama3.1 8b fp16 with 5 prompts