Move to examples/text-generation and run:
python3 run_generation.py --model_name_or_path meta-llama/Llama-3.1-8B-Instruct --use_hpu_graphs --limit_hpu_graph --use_kv_cache --reuse_cache --trim_logits --attn_softmax_bf16 --max_input_tokens 512 --max_new_tokens 2048 --bf16 --batch_size 1 --warmup 0 --n_iterations 3
The output looks like below. The '!' characters are unexpected padding leaking into the output:
Input/outputs:
input 1: ('DeepSpeed is a machine learning framework',)
output 1: ('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!DeepSpeed is a machine learning framework that provides a set of tools and libraries for scaling up deep learning models and training them on large datasets. It is designed to be highly efficient and scalable, allowing users to train large models on a single machine or distribute the training process across multiple machines.\n\nHere are some key features of DeepSpeed:\n\n1. **Efficient Training**: DeepSpeed provides a set of techniques to optimize the training process, including gradient accumulation, mixed precision training, and model parallelism. These techniques can significantly reduce the training time and memory usage.\n2. **Distributed Training** ...
Expected behavior
The expected output should be:
Input/outputs:
input 1: ('DeepSpeed is a machine learning framework',)
output 1: ('DeepSpeed is a machine learning framework that provides a set of tools and libraries for scaling up deep learning models and training them on large datasets. It is designed to be highly efficient and scalable ...
The text was updated successfully, but these errors were encountered:
In the text-generation example, the script forces model.generation_config.pad_token_id = 0, and token id 0 decodes to '!' in the meta-llama/Llama-3.1-8B-Instruct tokenizer. So the padded positions are being decoded as literal '!' characters, i.e. a pad token id mismatch.
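The failure mode above can be sketched without Habana hardware: hardcoding pad_token_id = 0 is only safe when id 0 is a dedicated pad/unk token, which is not the case for the Llama 3.1 vocabulary. A minimal, hypothetical helper (not the actual fix in #1444) showing the safer fallback pattern:

```python
def resolve_pad_token_id(pad_token_id, eos_token_id):
    """Pick a padding token id without hardcoding 0.

    In the Llama 3.1 tokenizer, id 0 is an ordinary token ('!'),
    so padding with 0 makes padded positions decode as visible '!'.
    A common convention is to reuse EOS as padding when the model
    defines no pad token.
    """
    if pad_token_id is not None:
        return pad_token_id
    # Fallback: reuse EOS as the pad id instead of assuming 0.
    return eos_token_id

# Example values only; the real ids come from the model's config.
assert resolve_pad_token_id(None, 128009) == 128009  # no pad token: use EOS
assert resolve_pad_token_id(2, 128009) == 2          # explicit pad token wins
```

With this pattern, skipping pad tokens during decoding (or using skip_special_tokens=True) removes the stray characters, whereas a hardcoded 0 cannot be filtered because it collides with a real vocabulary entry.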
@aslanxie This should have been fixed by #1444 that I just merged into main.
Can you try again on the main branch and let me know if that works on your side too?