[QEff Finetune]: Refactor the finetune main __call__ #289
base: main
Conversation
Force-pushed from 2f19722 to 48061ee
Force-pushed from 3ff66eb to c0d2315
copy of #314
Force-pushed from 7f2d367 to b2ee39a
@pytest.mark.on_qaic
@pytest.mark.skip(reason="eager docker not available in sdk")
@pytest.mark.parametrize(
Please add the pytest markers @pytest.mark.finetune and @pytest.mark.cli; it will help in executing the tests in stages. For example:
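A minimal sketch of what that could look like (assuming "finetune" and "cli" are registered as custom markers in the project's pytest configuration; the test name and body are illustrative):

import pytest

@pytest.mark.on_qaic
@pytest.mark.finetune
@pytest.mark.cli
@pytest.mark.skip(reason="eager docker not available in sdk")
def test_finetune_cli():
    # Staged execution: run only this stage with `pytest -m finetune` or `pytest -m cli`.
    ...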
-    finetune(**kwargs)
+    results = finetune(**kwargs)

     assert np.allclose(results["avg_train_prep"], 1.002326, atol=1e-5), "Train perplexity is not matching."
avg_train_prep should be changed to avg_train_metric to match the changes in PR 292.
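A hedged sketch of the updated assertion, assuming PR 292 renames the returned key to avg_train_metric (expected value and tolerance kept from the current test):

import numpy as np

# `results` is the dict returned by `results = finetune(**kwargs)` above; key name assumed per PR 292.
assert np.allclose(results["avg_train_metric"], 1.002326, atol=1e-5), "Train metric is not matching."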
@@ -40,7 +40,7 @@ def train(
     optimizer,
     lr_scheduler,
     gradient_accumulation_steps,
-    train_config: TRAIN_CONFIG,
+    train_config: TrainConfig,
     device,
There is no need to pass all three of train_config.gradient_accumulation_steps, train_config, and train_config.device; passing only train_config is enough. For example:
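A sketch of the suggested simplification, reading both values from train_config inside train() (the parameter list here is abbreviated and illustrative, not the full signature from train_utils.py):

def train(model, optimizer, lr_scheduler, train_config):
    # Derive these from the config instead of accepting them as separate arguments.
    gradient_accumulation_steps = train_config.gradient_accumulation_steps
    device = train_config.device
    ...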
Force-pushed from b8182a6 to d0fff22
Signed-off-by: vbaddi <quic_vbaddi@quicinc.com> Signed-off-by: Meet Patel <quic_meetkuma@quicinc.com>
Signed-off-by: Meet Patel <quic_meetkuma@quicinc.com>
Signed-off-by: Meet Patel <quic_meetkuma@quicinc.com>
Force-pushed from d0fff22 to e27deeb
        - Ensures types match expected values (int, float, list, etc.).
    """
    if config_type.lower() != "lora":
        raise ValueError(f"Unsupported config_type: {config_type}. Only 'lora' is supported.")
Since we are not doing LoRA finetuning in the BERT case, this will raise an error.
    Args:
        config_data (Dict[str, Any]): The configuration dictionary loaded from YAML/JSON.
        config_type (str): Type of config to validate ("lora" for LoraConfig, default: "lora").
We need to add a config_type value corresponding to BERT, since we don't do LoRA fine-tuning for it. One possible shape is sketched below.
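An illustrative sketch, under the assumption that the validator looks like the snippet above (the function name mirrors the docstring shown here and the "none" value is only an example, not the actual PR code):

from typing import Any, Dict

def validate_config(config_data: Dict[str, Any], config_type: str = "lora") -> None:
    # Hypothetical variant: a "none" config_type skips PEFT validation (e.g. BERT full finetuning).
    config_type = config_type.lower()
    if config_type == "none":
        return
    if config_type != "lora":
        raise ValueError(f"Unsupported config_type: {config_type}. Expected 'lora' or 'none'.")
    # ... existing LoRA field and type checks continue here ...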
    # local_args = {k: v for k, v in locals().items() if v is not None and k != "peft_config_file" and k != "kwargs"}
    update_config(train_config, **kwargs)

    lora_config = LoraConfig()
This line is not required.
    longest_seq_length, _ = get_longest_seq_length(train_dataloader.dataset)
    lora_config = LoraConfig()

    update_config(lora_config, **kwargs)
Why do we need to update lora_config here with kwargs?
    train_config = args[0]
    assert max_train_step >= train_config.gradient_accumulation_steps, (
        "Total training step should be more than 4 which is gradient accumulation steps."
In place of '4', please use train_config.gradient_accumulation_steps in the message. If the user passes a different value for train_config.gradient_accumulation_steps, the hard-coded 4 will be confusing. For example:
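A sketch of the message with the configured value interpolated (max_train_step and train_config as in the snippet above):

assert max_train_step >= train_config.gradient_accumulation_steps, (
    f"Total training steps should be at least gradient_accumulation_steps "
    f"({train_config.gradient_accumulation_steps})."
)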
    train_config = args[0]
    assert max_train_step >= train_config.gradient_accumulation_steps, (
This assertion will fail. #107 should only be validated if max_train_step > 0, as the default value of max_train_step is 0. Please refer to: https://github.com/quic/efficient-transformers/blob/main/QEfficient/finetune/utils/train_utils.py#L174. See the sketch below.
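A sketch of the guard being asked for, validating only when a cap is actually set (variable names as in the snippet above):

# max_train_step defaults to 0, meaning no cap, so only validate when it is set.
if max_train_step > 0:
    assert max_train_step >= train_config.gradient_accumulation_steps, (
        f"max_train_step ({max_train_step}) must be at least gradient_accumulation_steps "
        f"({train_config.gradient_accumulation_steps})."
    )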
In line 24, max_train_step is set to 20, so this assertion is correct, but line 24 can be changed to the keyword form max_train_step = 20 for interpretability, and similarly for the other params. See the sketch below.
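An illustrative sketch of the keyword style (the value 20 comes from the discussion; everything else here is only an example of the pattern, not the actual test configuration):

# Naming each value at the call site makes the intent clear.
kwargs = {
    "max_train_step": 20,  # explicit name instead of a bare positional 20
    # ... other params passed by name in the same way ...
}
results = finetune(**kwargs)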
@@ -44,132 +51,139 @@
 warnings.filterwarnings("ignore")


-def main(**kwargs):
+def setup_distributed_training(config: TrainConfig) -> None:
Better to use the variable name train_config in place of config to maintain uniformity in the code; different names can cause confusion.
    if not hasattr(model, "base_model_prefix"):
        raise RuntimeError("Given huggingface model does not have 'base_model_prefix' attribute.")


def load_model_and_tokenizer(config: TrainConfig) -> tuple[AutoModelForCausalLM, AutoTokenizer]:
Better to use the variable name train_config in place of 'config' here as well, to maintain uniformity in the code; different names can cause confusion. A sketch of the rename is below.
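A sketch of the rename applied to both new helpers (bodies omitted; the TrainConfig import path is an assumption):

from transformers import AutoModelForCausalLM, AutoTokenizer

from QEfficient.finetune.configs.training import TrainConfig  # assumed import path


def setup_distributed_training(train_config: TrainConfig) -> None:
    ...


def load_model_and_tokenizer(train_config: TrainConfig) -> tuple[AutoModelForCausalLM, AutoTokenizer]:
    ...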
Command (using the default LoRA config):

python -m QEfficient.cloud.finetune \
    --model_name "meta-llama/Llama-3.2-1B" \
    --lr 5e-4