Move to examples/text-generation and run:
python3 run_generation.py --model_name_or_path meta-llama/Llama-3.1-8B-Instruct --use_hpu_graphs --limit_hpu_graph --use_kv_cache --reuse_cache --trim_logits --attn_softmax_bf16 --max_input_tokens 512 --max_new_tokens 2048 --bf16 --batch_size 1 --warmup 0 --n_iterations 3
The output looks like below. The '!' characters are unexpected padding leaking into the output:
Input/outputs:
input 1: ('DeepSpeed is a machine learning framework',)
output 1: ('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!DeepSpeed is a machine learning framework that provides a set of tools and libraries for scaling up deep learning models and training them on large datasets. It is designed to be highly efficient and scalable, allowing users to train large models on a single machine or distribute the training process across multiple machines.\n\nHere are some key features of DeepSpeed:\n\n1. **Efficient Training**: DeepSpeed provides a set of techniques to optimize the training process, including gradient accumulation, mixed precision training, and model parallelism. These techniques can significantly reduce the training time and memory usage.\n2. **Distributed Training** ...
Expected behavior
The expected output should be:
Input/outputs:
input 1: ('DeepSpeed is a machine learning framework',)
output 1: ('DeepSpeed is a machine learning framework that provides a set of tools and libraries for scaling up deep learning models and training them on large datasets. It is designed to be highly efficient and scalable ...
The text was updated successfully, but these errors were encountered:
In the text-generation example, the script forces model.generation_config.pad_token_id = 0, and token id 0 decodes to '!' in the meta-llama/Llama-3.1-8B-Instruct tokenizer. So the padded positions are being decoded as literal '!' characters, i.e. a pad token id mismatch.
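The failure mode above can be sketched without Habana hardware: hardcoding pad_token_id = 0 is only safe when id 0 is a dedicated pad/unk token, which is not the case for the Llama 3.1 vocabulary. A minimal, hypothetical helper (not the actual fix in #1444) showing the safer fallback pattern:

```python
def resolve_pad_token_id(pad_token_id, eos_token_id):
    """Pick a padding token id without hardcoding 0.

    In the Llama 3.1 tokenizer, id 0 is an ordinary token ('!'),
    so padding with 0 makes padded positions decode as visible '!'.
    A common convention is to reuse EOS as padding when the model
    defines no pad token.
    """
    if pad_token_id is not None:
        return pad_token_id
    # Fallback: reuse EOS as the pad id instead of assuming 0.
    return eos_token_id

# Example values only; the real ids come from the model's config.
assert resolve_pad_token_id(None, 128009) == 128009  # no pad token: use EOS
assert resolve_pad_token_id(2, 128009) == 2          # explicit pad token wins
```

With this pattern, skipping pad tokens during decoding (or using skip_special_tokens=True) removes the stray characters, whereas a hardcoded 0 cannot be filtered because it collides with a real vocabulary entry.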
@aslanxie This should have been fixed by #1444 that I just merged into main.
Can you try again on the main branch and let me know if that works on your side too?