Fix lora requests when dp with vllm #2433

ckgresla · 2024-10-28T22:58:34Z

Following some investigation, I believe there is a fix to be made inside the VLLM class's _model_generate() method. I stumbled across an inconsistency when trying to evaluate some LoRA adapters with vLLM as the backend in lm_eval; see this issue for reference.

Specifically, the evaluation results (greedy decoding) were exactly the same for a trained LoRA adapter and its base model. When we merged the adapter into the model, we got quite different results. After some digging, I believe the issue is in how requests are sent to the vLLM model at test time: when both data_parallel_size>1 and lora_request are set, the prior behavior issued regular base-model requests without using the adapter.

This PR provides a solution for that issue, and routes requests through the LoRA adapter even when using data parallelism.
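
To illustrate the failure mode described above, here is a minimal sketch, not the actual lm_eval or vLLM code. StubEngine, run_one_replica, and generate_data_parallel are hypothetical names, and the adapter is represented by a plain string. The sketch assumes each data-parallel replica builds its own engine, so the adapter only takes effect if lora_request is forwarded explicitly with every shard of prompts.

```python
from typing import List, Optional


class StubEngine:
    """Hypothetical stand-in for one model replica; this is not the vLLM API."""

    def generate(self, prompts: List[str], lora_request: Optional[str] = None) -> List[str]:
        # Tag each output with the weights that served it.
        source = lora_request if lora_request is not None else "base"
        return [f"{source}:{p}" for p in prompts]


def run_one_replica(prompts: List[str], lora_request: Optional[str] = None) -> List[str]:
    # Each replica creates its own engine, so the adapter has to be
    # passed in; nothing is inherited from the caller.
    return StubEngine().generate(prompts, lora_request=lora_request)


def generate_data_parallel(
    prompts: List[str], dp_size: int, lora_request: Optional[str] = None
) -> List[str]:
    # Split the prompts round-robin across the replicas.
    shards = [prompts[i::dp_size] for i in range(dp_size)]
    # Forwarding lora_request here is the step the description says was
    # missing; without it every shard is served by the base model.
    results = [run_one_replica(shard, lora_request=lora_request) for shard in shards]
    # Flatten per-replica outputs (this sketch does not restore the original order).
    return [out for shard_out in results for out in shard_out]


outputs = generate_data_parallel(["q1", "q2", "q3"], dp_size=2, lora_request="my_adapter")
```

If the lora_request argument is dropped in the call to run_one_replica, every output is tagged "base", which mirrors the symptom reported above: identical results for the adapter and its base model.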

CLAassistant · 2024-10-28T22:58:40Z

All committers have signed the CLA.

baberabb · 2024-10-29T16:21:31Z

Thanks v. much for the PR. LGTM, if you could run pre-commit so the formatting CI tests pass:

pip install pre-commit
pre-commit install
pre-commit run --all-files

ckgresla · 2024-10-30T00:25:34Z

Thank you for the helpful commands! Linted and ready for CI. @baberabb

Chris Kerwell Gresla and others added 2 commits October 28, 2024 15:40

fix: use lora_request for data parallel vllm evals

fd28e95

fix(docs): include type hint

25bd8ae

ckgresla requested review from haileyschoelkopf, lintangsutawika and baberabb as code owners October 28, 2024 22:58

chore: lint, et pre-commit al

c2c9a04

baberabb approved these changes Oct 30, 2024

baberabb merged commit 838a3e0 into EleutherAI:main Oct 30, 2024
8 checks passed
