Cannot run the repo #3

gpzlx1 · 2024-09-29T05:44:29Z

My script is as follows:

#!/bin/bash

CUDA_VISIBLE_DEVICES=0 python run_longbench.py --model_name "/nfs/shared_LLM_model/lmsys/longchat-7b-v1.5-32k" \
    --dtype "float16"   \
    --key_quantization_bits 256   \
    --key_quantization_bits_initial_layers 512   \
    --initial_layers_count 15   \
    --outlier_count_general 8   \
    --outlier_count_initial_layers 8   \
    --value_quantization_bits 2   \
    --group_size 32   \
    --buffer_size 128   \
    --seed 42   \
    --dataset_name lcc   \
    --n_data 150

However, it raises the following error:

Traceback (most recent call last):
  File "run_longbench.py", line 229, in <module>
    main(args)
  File "run_longbench.py", line 216, in main
    evaluate_model(
  File "run_longbench.py", line 168, in evaluate_model
    output = model_qjl.generate(
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/transformers/generation/utils.py", line 1718, in generate
    return self.greedy_search(
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/transformers/generation/utils.py", line 2579, in greedy_search
    outputs = self(
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/workspace/QJL/models/llama2_qjl.py", line 660, in forward
    outputs = self.model(
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/workspace/QJL/models/llama2_qjl.py", line 552, in forward
    layer_outputs = decoder_layer(
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/workspace/QJL/models/llama2_qjl.py", line 418, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/workspace/QJL/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/workspace/QJL/models/llama2_qjl.py", line 157, in forward
    att_qk = kv_quant.attention_score(query_states)
  File "/root/workspace/QJL/models/llama2_utils_qjl.py", line 175, in attention_score
    self.key_states_quant,
AttributeError: 'QJLKeyQuantizer' object has no attribute 'key_states_quant'
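
For illustration, this kind of `AttributeError` typically appears when an attribute is assigned only inside a conditional branch of a setup method and a later method reads it unconditionally. The sketch below is a hypothetical stand-in, not the actual QJL code: `ToyKeyQuantizer`, its `build` method, and the buffer-size condition are all assumptions made up to reproduce the pattern, and `missing_attributes` is a small helper for checking an object before calling into it. Whether `buffer_size` plays any role in the real failure is unconfirmed.

```python
class ToyKeyQuantizer:
    """Hypothetical stand-in for illustration; NOT the real QJLKeyQuantizer."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.residual = []

    def build(self, keys):
        # Assumed behavior: quantized state is created only when at least
        # one full buffer of keys is available.
        if len(keys) >= self.buffer_size:
            cut = len(keys) - len(keys) % self.buffer_size
            self.key_states_quant = keys[:cut]
            self.residual = keys[cut:]
        else:
            # key_states_quant is never assigned on this path.
            self.residual = list(keys)

    def attention_score(self):
        # Reads the attribute unconditionally, like the failing call site.
        return len(self.key_states_quant)


def missing_attributes(obj, names):
    """Return the names from `names` that are not set on `obj`."""
    return [name for name in names if not hasattr(obj, name)]


short = ToyKeyQuantizer(buffer_size=128)
short.build(list(range(100)))   # fewer keys than one buffer
full = ToyKeyQuantizer(buffer_size=128)
full.build(list(range(256)))    # two full buffers
```

Calling `missing_attributes(kv_quant, ["key_states_quant"])` in a debugger just before the failing line would show whether the attribute was ever created on the real object.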

Additionally, I have encountered several import errors within this repository. I hope these issues can be resolved soon.

Best wishes to you!
