
refactor TurbomindModelConfig #2364

Merged: 34 commits into InternLM:main on Sep 2, 2024

Conversation

lvhan028 (Collaborator)

No description provided.

@lvhan028 requested a review from lzhangzz on Aug 28, 2024 08:44
@lvhan028 changed the title from "Replace config.ini by config.yaml" to "refactor TurbomindModelConfig" on Aug 28, 2024
Comment on lines +231 to +232
group_size = _group_size

irexyc (Collaborator) reviewed on Sep 2, 2024

Should we assert _group_size == 128?

lvhan028 (Collaborator, Author)

Yes.

    # Compatible with awq models quantized by lmdeploy (<= v0.3.0)
    if not group_size:
        group_size = 128

    if engine_config.model_format in ['awq', 'gptq']:
        assert group_size == 128, \
            f'model format is "{engine_config.model_format}" ' \
            f'but group_size is {group_size}. Currently, only 128 ' \
            'is supported'
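
For context, here is a minimal, self-contained sketch of the default-and-assert logic above. The EngineConfig dataclass and resolve_group_size helper are illustrative stand-ins for this discussion, not lmdeploy's actual API:

    from dataclasses import dataclass

    @dataclass
    class EngineConfig:
        # Illustrative stand-in for the engine config referenced above.
        model_format: str = 'hf'

    def resolve_group_size(engine_config: EngineConfig, group_size: int) -> int:
        # awq models quantized by lmdeploy <= v0.3.0 may leave group_size
        # unset; fall back to 128, the only supported value.
        if not group_size:
            group_size = 128
        if engine_config.model_format in ['awq', 'gptq']:
            assert group_size == 128, \
                f'model format is "{engine_config.model_format}" ' \
                f'but group_size is {group_size}. Currently, only 128 ' \
                'is supported'
        return group_size

    assert resolve_group_size(EngineConfig('awq'), 0) == 128   # legacy default
    assert resolve_group_size(EngineConfig('gptq'), 128) == 128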

irexyc (Collaborator) commented on Sep 2, 2024

Tested with /mnt/140/InternLM/internlm2_5-7b-chat: convert, load from workspace, and load from hf.
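
For reference, a rough sketch of exercising the two load paths from Python via lmdeploy's pipeline API. The './workspace' path is hypothetical (a directory assumed to have been produced beforehand by lmdeploy convert), and the prompt is a placeholder:

    from lmdeploy import pipeline

    # load directly from the hf checkpoint
    pipe_hf = pipeline('/mnt/140/InternLM/internlm2_5-7b-chat')
    print(pipe_hf(['Hello, introduce yourself.']))

    # load from a converted turbomind workspace
    # (hypothetical path, assumed to exist already)
    pipe_ws = pipeline('./workspace')
    print(pipe_ws(['Hello, introduce yourself.']))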

@lvhan028 merged commit f4ee599 into InternLM:main on Sep 2, 2024
9 checks passed