
Commit f772fbd

address comments

1 parent ba9c60e

File tree: 4 files changed, +4 −19 lines

docs/source/tutorials/lora_finetune.rst
docs/source/tutorials/qat_finetune.rst
torchtune/modules/peft/_utils.py
torchtune/modules/peft/lora.py

docs/source/tutorials/lora_finetune.rst

Lines changed: 1 addition & 2 deletions
@@ -205,8 +205,7 @@ model without any wrappers or custom checkpoint conversion logic.
 
 .. note::
     Whenever loading weights with :code:`strict=False`, you should verify that any missing or extra keys in
-    the loaded :code:`state_dict` are as expected. torchtune's LoRA recipes do this by default via e.g.
-    :func:`validate_state_dict_for_lora() <torchtune.modules.peft.validate_state_dict_for_lora>` or
+    the loaded :code:`state_dict` are as expected. torchtune's LoRA recipes do this by default via
     :func:`validate_missing_and_unexpected_for_lora() <torchtune.modules.peft.validate_missing_and_unexpected_for_lora>`.
 
 Once we've loaded the base model weights, we also want to set only LoRA parameters to trainable.
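
For reference, the pattern this note describes (a strict=False load followed by a check of the returned key lists) looks roughly like the sketch below. It uses the llama2_7b and lora_llama2_7b builders from the tutorial; the simple substring check is an illustrative stand-in for torchtune's validation helper.

from torchtune.models.llama2 import llama2_7b, lora_llama2_7b

# Base model and a LoRA-wrapped copy of the same architecture.
base_model = llama2_7b()
lora_model = lora_llama2_7b(lora_attn_modules=["q_proj", "v_proj"])

# strict=False because the LoRA params (lora_a / lora_b) are not present in the base state_dict.
missing, unexpected = lora_model.load_state_dict(base_model.state_dict(), strict=False)

# The only missing keys should be LoRA params, and nothing unexpected should have been loaded.
assert all("lora" in key for key in missing), "non-LoRA keys are missing from the loaded state_dict"
assert not unexpected, "the base state_dict contained keys the LoRA model does not expect"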

docs/source/tutorials/qat_finetune.rst

Lines changed: 0 additions & 5 deletions
@@ -168,11 +168,6 @@ modifications accordingly:
   fake_quant_after_n_steps: 1000
   memory_efficient_fsdp_wrap: False
 
-.. note::
-
-    QAT in torchtune is currently not compatible with `memory_efficient_fsdp_wrap <https://pytorch.org/torchtune/stable/generated/torchtune.utils.get_full_finetune_fsdp_wrap_policy.html#torchtune.utils.get_full_finetune_fsdp_wrap_policy>`_.
-    This is a known issue and will be fixed in a future torchtune version.
-
 Empirically, we observed that disabling fake quantization for the first N steps
 led to better results, presumably because doing so allows the weights to stabilize
 before we start introducing quantization noise to the fine-tuning process.
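
As a rough illustration of what fake_quant_after_n_steps controls, here is a hedged sketch of the step-gating logic. The qat_finetune function and the enable_fake_quant callable are illustrative stand-ins, not the recipe's actual API.

from typing import Callable, Iterable

import torch


def qat_finetune(
    model: torch.nn.Module,
    batches: Iterable[dict],
    optimizer: torch.optim.Optimizer,
    loss_fn: torch.nn.Module,
    enable_fake_quant: Callable[[torch.nn.Module], None],
    fake_quant_after_n_steps: int = 1000,
) -> None:
    """Keep fake quantization disabled for the first N steps so the weights can
    stabilize before quantization noise is introduced into fine-tuning."""
    fake_quant_on = False
    for step, batch in enumerate(batches):
        if not fake_quant_on and step >= fake_quant_after_n_steps:
            model.apply(enable_fake_quant)  # turn fake quant on for every prepared submodule
            fake_quant_on = True
        logits = model(batch["tokens"])
        loss = loss_fn(logits, batch["labels"])
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()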

torchtune/modules/peft/_utils.py

Lines changed: 3 additions & 4 deletions
@@ -260,10 +260,9 @@ def validate_missing_and_unexpected_for_lora(
     """
     A more memory-efficient way to validate that LoRA state dict loading was done properly.
 
-    Similar to :func:`validate_state_dict_for_lora`, this function uses a model's LoRA config to
-    check that LoRA and/or base model weights are loaded into the full model correctly.
-    Unlike that function, this method relies only on the values of missing and unexpected
-    as returned by the load_state_dict API with strict=False. This allows us to do the
+    This function uses a model's LoRA config to check that LoRA and/or base model weights
+    are loaded into the full model correctly. This function relies only on the values of missing and
+    unexpected as returned by the load_state_dict API with strict=False. This allows us to do the
     validation without any additional calls to .state_dict(), which use additional memory.
 
     Args:
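
The idea in this docstring, validating the load purely from the missing and unexpected key lists with no extra .state_dict() call, can be sketched as follows. This is a simplified stand-in: the real helper consults the model's LoRA config rather than a substring match.

from typing import List


def check_base_weights_loaded(base_missing: List[str], base_unexpected: List[str]) -> None:
    """Sketch of validating a strict=False load of base weights into a LoRA model,
    using only the key lists returned by load_state_dict (no extra state_dict copy)."""
    # The base checkpoint should not contain keys the model does not expect.
    if base_unexpected:
        raise RuntimeError(f"Unexpected keys when loading base weights: {base_unexpected}")
    # The only keys allowed to be missing are the freshly initialized LoRA params.
    non_lora_missing = [k for k in base_missing if "lora" not in k]
    if non_lora_missing:
        raise RuntimeError(f"Missing non-LoRA keys when loading base weights: {non_lora_missing}")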

torchtune/modules/peft/lora.py

Lines changed: 0 additions & 8 deletions
@@ -91,14 +91,6 @@ def __init__(
         self.lora_a = nn.Linear(in_features=in_dim, out_features=rank, bias=False)
         self.lora_b = nn.Linear(in_features=rank, out_features=out_dim, bias=False)
         self.merged = False
-        # Note: FSDP's meta device initialization contract assumes that a module's
-        # reset_parameters method only initializes its own parameters (i.e. no child
-        # params are initialized, as is done in initialize_parameters below).
-        # For that reason, we patch reset_parameters directly on lora_a and lora_b submodules
-        # when using meta device. This is done in
-        # torchtune.training.prepare_model_for_fsdp_with_meta_device.
-        # See this issue for more details: https://github.com/pytorch/pytorch/issues/104187.
-        # Without meta device, we only need the following:
         self.initialize_parameters()
 
     def initialize_parameters(self):
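
For context, the low-rank structure that lora_a and lora_b implement looks roughly like the minimal sketch below. This is not torchtune's actual LoRALinear: dropout and the merged path are omitted, and the initialization shown is the common LoRA default.

import math

import torch
from torch import nn


class TinyLoRALinear(nn.Module):
    """Minimal LoRA linear: a frozen base projection plus a trainable rank-r update."""

    def __init__(self, in_dim: int, out_dim: int, rank: int, alpha: float) -> None:
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim, bias=False)
        self.base.weight.requires_grad_(False)  # base weights stay frozen during fine-tuning
        self.lora_a = nn.Linear(in_features=in_dim, out_features=rank, bias=False)
        self.lora_b = nn.Linear(in_features=rank, out_features=out_dim, bias=False)
        self.scaling = alpha / rank
        self.initialize_parameters()

    def initialize_parameters(self) -> None:
        # Common LoRA init: A is Kaiming-uniform, B is zero, so training starts at the base model.
        nn.init.kaiming_uniform_(self.lora_a.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_b.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))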
