
Update templates after v0.5.8 llmforge release #391

Merged: 11 commits into main, Nov 20, 2024

Conversation

@SumanthRH (Member) commented Nov 8, 2024:

What does this PR do?

Updates the workspace templates after the v0.5.8 release of llmforge. The product release has already happened, so this PR can be safely merged.

Some important changes in this version:

  • checkpoint_every_n_epochs is deprecated in favour of checkpoint_and_evaluation_frequency.
  • max_num_checkpoints is deprecated.
  • awsv2 -> aws in RemoteStoragePath, following the awsv2 CLI deprecation. Since RemoteStoragePath is a bleeding-edge feature, this is a hard deprecation.
  • TorchCompileConfig has been added. (A hedged migration sketch for these changes follows this list.)
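
As a hedged migration sketch (the shape of checkpoint_and_evaluation_frequency, the remote_storage_path and torch_compile_config key names, and all values here are illustrative assumptions, not confirmed against the actual llmforge schema):

# v0.5.7-era config (now deprecated):
# checkpoint_every_n_epochs: 1
# max_num_checkpoints: 3                         # deprecated, no direct replacement shown
# remote_storage_path: awsv2://my-bucket/ckpts   # awsv2 scheme is hard-deprecated
#
# v0.5.8-era config (key names/shape assumed):
checkpoint_and_evaluation_frequency: 1           # replaces checkpoint_every_n_epochs
remote_storage_path: aws://my-bucket/ckpts       # aws scheme replaces awsv2
torch_compile_config: {}                         # TorchCompileConfig is new; fields not shown here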

Also, Liger support was added in the previous release (0.5.7), but it was not reflected in our configs until now. This PR adds Liger to our configs as well.

FWIW, torch.compile + Liger has some subtleties around compatibility, and the best configuration for perf is turning on all the Liger flags, so this PR adds only Liger (not torch.compile) to the configs.

We've added Liger only to the LoRA configs, since it's been hard to test with A100s for full-param training. A rough sketch of what this looks like is below.
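
As a rough sketch, enabling Liger in a LoRA config might look like the following (the liger_kernel key and its nesting are assumptions about the llmforge schema; the individual flags mirror the diff snippet discussed in the review below):

liger_kernel:
  enabled: True
  kwargs:
    rope: True
    swiglu: True
    cross_entropy: True
    fused_linear_cross_entropy: False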

SumanthRH and others added 6 commits November 8, 2024 13:38
@SumanthRH SumanthRH marked this pull request as ready for review November 13, 2024 23:47
@SumanthRH (Member, Author):

Screenshots after testing Liger (from @erictang000 ) :

[Liger test screenshots]

@SumanthRH (Member, Author):

@kouroshHakha just wanna highlight that this is a direct edit to the existing config.

I think having a separate config with Liger enabled is also doable, but given that we've tested Liger extensively for correctness, I'm fine with having it in the defaults to squeeze out more performance. A lot of optionality is also confusing to the user.

@kouroshHakha (Contributor) left a comment:

Let's make the Liger versions a separate config, mostly because there is initialization time involved and it's not that much faster either. So overall it's slower, for 70B at least.

[70B benchmark screenshot]

I also couldn't find where the compile params are. Where are they?

rope: True
swiglu: True
cross_entropy: True
fused_linear_cross_entropy: False
Contributor:

Add a comment on why flc (fused_linear_cross_entropy) is false, or why rms_norm is false.

@@ -1400,6 +1402,25 @@
"This plot illustrates that as we relax the cost constraints (i.e., increase the percentage of GPT-4 calls), the performance improves. While the performance of a random router improves linearly with cost, our router achieves significantly better results at each cost level."
]
},
{
Contributor:

what is this?

@SumanthRH (Member, Author):

I updated the router template to use the new 0.5.8 image. I noticed that the cell execution numbers were all messed up in the notebook, so I copied over some cleanup code from the E2E LLM Workflows template to clean up cell numbers and cached checkpoints.

Comment on lines 52 to 53
fused_linear_cross_entropy: False

@SumanthRH (Member, Author):

@erictang000 umm did this value change? why was this false again?

Contributor:

Discussed on Slack: for lower context length + batch size, regular cross-entropy can be faster and memory is similar. An example with this toggled for llama-3.2-1b is in the screenshots below. But it's easier to just use the defaults with Liger, so we can turn fused linear cross entropy on in our default configs. (A hedged sketch of the resulting defaults follows the screenshots.)
[Benchmark screenshots: llama-3.2-1b with fused_linear_cross_entropy toggled]
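
Putting that conclusion into config form, a hedged sketch of the commented defaults (same assumed nesting as the earlier sketch; the comments paraphrase this thread, and the two cross-entropy flags being mutually exclusive is an assumption):

liger_kernel:
  enabled: True
  kwargs:
    rope: True
    swiglu: True
    cross_entropy: False               # assumed off when the fused variant is on
    fused_linear_cross_entropy: True   # on by default: simpler to follow Liger's defaults;
                                       # at small context + batch, plain cross_entropy can be
                                       # faster with similar memory (per the benchmarks above)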

@erictang000 (Contributor) left a comment:

lgtm

@SumanthRH SumanthRH merged commit b631ed8 into main Nov 20, 2024
2 checks passed