Merge vLLM deployer project to llm-finetuning #163

Merged 20 commits on Feb 3, 2025.
```diff
@@ -65,7 +65,7 @@ def deployment_deploy() -> (
     model_deployer = zenml_client.active_stack.model_deployer
     databricks_deployment_config = DatabricksDeploymentConfig(
         model_name=model.name,
-        model_version=model.run_metadata["model_registry_version"].value,
+        model_version=model.run_metadata["model_registry_version"],
         workload_size="Small",
         workload_type="CPU",
         scale_to_zero_enabled=True,
```
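The one-line change here reflects a ZenML API shift: `model.run_metadata` entries used to be wrapper objects exposing a `.value` attribute, while newer releases return the plain value directly. A minimal illustrative sketch with stand-in objects (not the real ZenML types):

```python
class _MetadataWrapper:
    """Stand-in for the old metadata wrapper type (illustrative only)."""

    def __init__(self, value):
        self.value = value


# Old behavior: run_metadata entries were wrapper objects, so callers
# reached through .value to get the stored metadata.
old_run_metadata = {"model_registry_version": _MetadataWrapper("3")}
print(old_run_metadata["model_registry_version"].value)  # -> "3"

# New behavior: entries are plain values, so the .value access is dropped,
# which is exactly the one-line change in this hunk.
new_run_metadata = {"model_registry_version": "3"}
print(new_run_metadata["model_registry_version"])        # -> "3"
```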
12 changes: 6 additions & 6 deletions llm-finetuning/README.md
Contributor:

@wjayesh I think the instructions still need a bit more work, specifically for stack setup. Even if you're running locally in Docker, you need a stack with a Wandb experiment tracker, it seems.

Contributor (author):

Hmm, weird. I had run it on a Docker orchestrator without Wandb in my stack.
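One way to settle this kind of question before launching a run is to inspect the active stack. A minimal sketch using the ZenML client (assumes ZenML is installed and a stack is already configured):

```python
# Sketch: check whether the active ZenML stack actually provides an
# experiment tracker before running the training pipeline locally.
from zenml.client import Client

stack = Client().active_stack
print(f"Active stack: {stack.name}")
print(f"Orchestrator: {stack.orchestrator.name}")
print(f"Experiment tracker: {stack.experiment_tracker}")  # None if not configured
```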

```diff
@@ -69,16 +69,16 @@ The three pipelines can be run using the CLI:
 # Data generation
-python run.py --feature-engineering --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
-python run.py --feature-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
+python run.py --feature-engineering --config generate_code_dataset.yaml
+python run.py --feature-pipeline --config generate_code_dataset.yaml
 
 # Training
-python run.py --training-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
+python run.py --training-pipeline --config finetune_gcp.yaml
 
 # Deployment
-python run.py --deployment-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
-python run.py --deploy-pipeline --config <NAME_OF_CONFIG_IN_CONFIGS_FOLDER>
+python run.py --deployment-pipeline --config deployment_a100.yaml
+python run.py --deploy-pipeline --config deployment_a100.yaml
```

The `feature_engineering` and `deployment` pipelines can simply be run with the `default` stack, but the [stack](https://docs.zenml.io/user-guide/production-guide/understand-stacks) for the training pipeline will depend on the config.
```diff
@@ -127,7 +127,7 @@ python run.py --deployment-pipeline --config deployment_a100.yaml
 
 A working prototype has been trained and deployed as of Jan 19 2024. The model is using minimal data and finetuned using QLoRA and PEFT. The model was trained using 1 A100 GPU on the cloud:
 
-- Training dataset [Link](https://huggingface.co/datasets/htahir1/zenml-codegen-v1)
+- Training dataset [Link](https://huggingface.co/datasets/zenml/zenml-codegen-v1)
 - PEFT Model [Link](https://huggingface.co/htahir1/peft-lora-zencoder15B-personal-copilot/)
 - Fully merged model (Ready to deploy on HuggingFace Inference Endpoints) [Link](https://huggingface.co/htahir1/peft-lora-zencoder15B-personal-copilot-merged)
```

```diff
@@ -147,7 +147,7 @@ The [ZenML Pro](https://zenml.io/pro) was used to manage the pipelines, models,
 
 This project recently did a [call of volunteers](https://www.linkedin.com/feed/update/urn:li:activity:7150388250178662400/). This TODO list can serve as a source of collaboration. If you want to work on any of the following, please [create an issue on this repository](https://github.com/zenml-io/zenml-projects/issues) and assign it to yourself!
 
-- [x] Create a functioning data generation pipeline (initial dataset with the core [ZenML repo](https://github.com/zenml-io/zenml) scraped and pushed [here](https://huggingface.co/datasets/htahir1/zenml-codegen-v1))
+- [x] Create a functioning data generation pipeline (initial dataset with the core [ZenML repo](https://github.com/zenml-io/zenml) scraped and pushed [here](https://huggingface.co/datasets/zenml/zenml-codegen-v1))
 - [x] Deploy the model on a HuggingFace inference endpoint and use it in the [VS Code Extension](https://github.com/huggingface/llm-vscode#installation) using a deployment pipeline.
 - [x] Create a functioning training pipeline.
 - [ ] Curate a set of 5-10 repositories that are using the ZenML latest syntax and use data generation pipeline to push dataset to HuggingFace.
```
1 change: 1 addition & 0 deletions llm-finetuning/configs/deployment_a10.yaml
```diff
@@ -2,6 +2,7 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: peft-lora-zencoder15B-personal-copilot
```
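The new `python_package_installer` key (repeated across the config files below) maps to a field on ZenML's `DockerSettings`. A minimal sketch of the equivalent in code, assuming a ZenML release recent enough to accept `"uv"` (the pipeline name here is illustrative):

```python
# Sketch: the YAML docker settings above expressed via DockerSettings.
from zenml import pipeline
from zenml.config import DockerSettings

docker_settings = DockerSettings(
    requirements="requirements.txt",
    python_package_installer="uv",  # use uv instead of pip for faster image builds
)

@pipeline(settings={"docker": docker_settings})
def deployment_pipeline():
    ...
```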
1 change: 1 addition & 0 deletions llm-finetuning/configs/deployment_a100.yaml
```diff
@@ -2,6 +2,7 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
```
1 change: 1 addition & 0 deletions llm-finetuning/configs/deployment_t4.yaml
```diff
@@ -1,6 +1,7 @@
 # environment configuration
 settings:
   docker:
+    python_package_installer: "uv"
     requirements: requirements.txt
 
 model:
```
9 changes: 5 additions & 4 deletions llm-finetuning/configs/finetune_aws.yaml
```diff
@@ -2,11 +2,12 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
   description: "Fine-tuned `starcoder15B-personal-copilot-A100-40GB-colab` for ZenML pipelines."
-  audience: "Data Scientists / ML Engineers" 
+  audience: "Data Scientists / ML Engineers"
   use_cases: "Code Generation for ZenML MLOps pipelines."
   limitations: "There is no guarantee that this model will work for your use case. Please test it thoroughly before using it in production."
   trade_offs: "This model is optimized for ZenML pipelines. It is not optimized for other libraries."
@@ -23,13 +24,13 @@ steps:
     step_operator: sagemaker-eu
     settings:
       step_operator.sagemaker:
-        estimator_args: 
+        estimator_args:
           instance_type: "ml.p4d.24xlarge"
 
     parameters:
       args:
         model_path: "bigcode/starcoder"
-        dataset_name: "htahir1/zenml-codegen-v1"
+        dataset_name: "zenml/zenml-codegen-v1"
         subset: "data"
         data_column: "content"
         split: "train"
@@ -58,4 +59,4 @@
         use_4bit_qunatization: true
         use_nested_quant: true
         bnb_4bit_compute_dtype: "bfloat16"
-        output_peft_repo_id: "htahir1/peft-lora-zencoder15B-personal-copilot"
+        output_peft_repo_id: "zenml/peft-lora-zencoder15B-personal-copilot"
```
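Both the training dataset and the output PEFT repo now live under the `zenml` org on the Hugging Face Hub. A quick, hedged way to inspect the referenced dataset with the `datasets` library (`dataset_name` and `split` taken from the YAML; how the project applies `subset` is its own loading logic, so it is omitted here):

```python
# Sketch: sanity-check the dataset the finetuning configs point at.
from datasets import load_dataset

ds = load_dataset("zenml/zenml-codegen-v1", split="train")
print(ds.column_names)  # expect "content", the config's data_column
```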
7 changes: 4 additions & 3 deletions llm-finetuning/configs/finetune_gcp.yaml
```diff
@@ -2,11 +2,12 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
   description: "Fine-tuned `starcoder15B-personal-copilot-A100-40GB-colab` for ZenML pipelines."
-  audience: "Data Scientists / ML Engineers" 
+  audience: "Data Scientists / ML Engineers"
   use_cases: "Code Generation for ZenML MLOps pipelines."
   limitations: "There is no guarantee that this model will work for your use case. Please test it thoroughly before using it in production."
   trade_offs: "This model is optimized for ZenML pipelines. It is not optimized for other libraries."
@@ -29,7 +30,7 @@ steps:
     parameters:
       args:
         model_path: "bigcode/starcoder"
-        dataset_name: "htahir1/zenml-codegen-v1"
+        dataset_name: "zenml/zenml-codegen-v1"
         subset: "data"
         data_column: "content"
         split: "train"
@@ -58,4 +59,4 @@
         use_4bit_qunatization: true
         use_nested_quant: true
         bnb_4bit_compute_dtype: "bfloat16"
-        output_peft_repo_id: "htahir1/peft-lora-zencoder15B-personal-copilot"
+        output_peft_repo_id: "zenml/peft-lora-zencoder15B-personal-copilot"
```
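The quantization flags at the bottom of these finetuning configs correspond to standard QLoRA options; a hedged sketch of how they typically map onto `transformers`' `BitsAndBytesConfig` (this mapping is an assumption about the project's training code, not taken from the diff; `use_4bit_qunatization` is the project's own argument name, spelling included):

```python
# Sketch: the library-side equivalents of the config's 4-bit flags.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # use_4bit_qunatization: true
    bnb_4bit_use_double_quant=True,         # use_nested_quant: true
    bnb_4bit_compute_dtype=torch.bfloat16,  # bnb_4bit_compute_dtype: "bfloat16"
)

model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",  # model_path from the config
    quantization_config=bnb_config,
    device_map="auto",
)
```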
1 change: 1 addition & 0 deletions llm-finetuning/configs/finetune_local.yaml
```diff
@@ -2,6 +2,7 @@
 settings:
   docker:
     requirements: requirements.txt
+    python_package_installer: "uv"
 
 model:
   name: "peft-lora-zencoder15B-personal-copilot"
```
10 changes: 8 additions & 2 deletions llm-finetuning/configs/generate_code_dataset.yaml
```diff
@@ -1,14 +1,20 @@
 # environment configuration
 settings:
   docker:
+    python_package_installer: "uv"
     requirements: requirements.txt
+    apt_packages:
+      - git
+    environment:
+      HF_HOME: "/tmp/huggingface"
+      HF_HUB_CACHE: "/tmp/huggingface"
 
 # pipeline configuration
 parameters:
-  dataset_id: htahir1/zenml-codegen-v1
+  dataset_id: zenml/zenml-codegen-v1
 
 steps:
   mirror_repositories:
     parameters:
       repositories:
-        - zenml 
+        - zenml
```
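The added `apt_packages` and `environment` entries also correspond to `DockerSettings` fields; a sketch of this pipeline's Docker configuration in code (field names from the ZenML API, values from the YAML above):

```python
# Sketch: data-generation pipeline Docker settings. git is needed to
# mirror repositories; the HF_* variables point the Hugging Face cache
# at a writable /tmp location inside the container.
from zenml.config import DockerSettings

docker_settings = DockerSettings(
    python_package_installer="uv",
    requirements="requirements.txt",
    apt_packages=["git"],
    environment={
        "HF_HOME": "/tmp/huggingface",
        "HF_HUB_CACHE": "/tmp/huggingface",
    },
)
```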
25 changes: 0 additions & 25 deletions llm-finetuning/huggingface/hf_deployment_base_config.py

This file was deleted.

199 changes: 0 additions & 199 deletions llm-finetuning/huggingface/hf_deployment_service.py

This file was deleted.
