Enabling Snowflake Arctic on Gaudi 3 #1719
@pi314ever
Thanks Tests
Both above, as of here, crashes with
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
This reverts commit 9c390e7.
@imangohari1 this depends on a patched version of DeepSpeed located here: https://github.com/pi314ever/DeepSpeed/tree/arctic-enabling-1.19

Test command:

    python ../gaudi_spawn.py --use_deepspeed --world_size 8 run_generation.py --model_name_or_path Snowflake/snowflake-arctic-instruct --bf16 --use_kv_cache --max_new_tokens 128 --batch_size 1

Note: Graph mode is not enabled yet due to memory issues during graph compilation. Specifically, the steps to reproduce are:
Expected performance results on Gaudi 3:
@pi314ever A few follow-ups:
I updated the command to install custom DS, forgot a
I have noticed that the performance varies quite a bit, but I'm not entirely sure why. I suspect node configuration or firmware version, but it is hard to know. I tested
@imangohari1 Running it again with
Correction: This was for batch size 2 with output 128, not batch size 1 with output 256. The result is in the ballpark of my table above.
@pi314ever @regisss WDYT?
@pi314ever Do you know if this PR will be compatible with the version of DeepSpeed that will be released with Synapse 1.20?
@regisss This PR should be compatible with Synapse 1.20.
@pi314ever FYI @regisss
@regisss We need to push this PR out for 1.21 or later. |
What does this PR do?
This PR enables snowflake-arctic-instruct on a single Gaudi 3 node. A single Gaudi 2 node with 8 cards does not have enough memory to load the whole model, so only Gaudi 3 was validated. Graph mode is not enabled yet, also because of memory issues.
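A rough back-of-the-envelope check of the memory claim, as a sketch only. The figures used here are assumptions taken from public specifications and are not stated in this PR: roughly 480B total parameters for Arctic, 96 GB of HBM per Gaudi 2 card, and 128 GB per Gaudi 3 card. The estimate counts bf16 weights only and ignores the KV cache, activations, and runtime overhead, so real headroom is smaller than shown.

```python
# Weight-only memory estimate; all constants below are assumptions, not values from this PR.
ARCTIC_TOTAL_PARAMS = 480e9   # assumed total parameter count (dense + MoE experts)
BYTES_PER_BF16_PARAM = 2      # bf16 stores each parameter in 2 bytes
CARDS_PER_NODE = 8
HBM_GB_PER_CARD = {"Gaudi 2": 96, "Gaudi 3": 128}  # assumed per-card HBM capacity


def weights_gb(num_params: float, bytes_per_param: int = BYTES_PER_BF16_PARAM) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def node_capacity_gb(device: str, cards: int = CARDS_PER_NODE) -> float:
    """Aggregate HBM across all cards of a single node."""
    return HBM_GB_PER_CARD[device] * cards


def weights_fit(device: str) -> bool:
    """True if the bf16 weights alone fit in the node's aggregate HBM."""
    return weights_gb(ARCTIC_TOTAL_PARAMS) <= node_capacity_gb(device)


if __name__ == "__main__":
    need = weights_gb(ARCTIC_TOTAL_PARAMS)
    for device in HBM_GB_PER_CARD:
        cap = node_capacity_gb(device)
        print(f"{device}: need {need:.0f} GB, have {cap:.0f} GB, fits={weights_fit(device)}")
```

Under these assumptions the weights alone exceed the aggregate memory of an 8-card Gaudi 2 node and fit on an 8-card Gaudi 3 node with limited margin, which is consistent with graph compilation also running into memory issues.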
This depends on synchronizing the Habana DeepSpeed fork to include deepspeedai/DeepSpeed#6856, which can be found in my branch here: https://github.com/pi314ever/DeepSpeed/tree/arctic-enabling-1.19.
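For convenience, the test command quoted in the conversation can be assembled as an argument list, which makes it easier to vary the batch size or output length when comparing runs. This is a minimal sketch: the helper name `build_arctic_command` is hypothetical, and the flags are copied from the command in this thread rather than from any additional documentation. It only builds and prints the command; it does not launch anything.

```python
import shlex


def build_arctic_command(world_size: int = 8,
                         max_new_tokens: int = 128,
                         batch_size: int = 1) -> list[str]:
    """Return the argv for the test run (hypothetical helper; flags copied from this thread)."""
    return [
        "python", "../gaudi_spawn.py",
        "--use_deepspeed",
        "--world_size", str(world_size),
        "run_generation.py",
        "--model_name_or_path", "Snowflake/snowflake-arctic-instruct",
        "--bf16",
        "--use_kv_cache",
        "--max_new_tokens", str(max_new_tokens),
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_arctic_command()))
```

Calling `build_arctic_command(batch_size=2)` yields the same command with `--batch_size 2`, matching the batch size 2, output 128 configuration mentioned in the discussion.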
Validated configurations: