Update Qwen2.5 configs #1999

Merged 5 commits on Nov 13, 2024

Contributor

@joecummings joecummings commented Nov 13, 2024

  1. I turned activation checkpointing off for all 0.5B models and for 1.5B LoRA models. There's no point in using it at those sizes.
  2. I turned on memory logging.

Everything else is cosmetic.
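
For reference, a minimal sketch of what the two functional changes might look like in a recipe config. `log_peak_memory_stats` appears in this PR's diff; `enable_activation_checkpointing` is assumed to be the relevant key name, and the exact files and surrounding keys are not shown here, so treat this as illustrative only:

```yaml
# Illustrative fragment, not copied from the PR diff.
# Assumed key name for the activation checkpointing toggle:
enable_activation_checkpointing: False  # off for 0.5B models and 1.5B LoRA

# Memory logging turned on:
log_peak_memory_stats: True
```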


pytorch-bot bot commented Nov 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1999

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8596e5a with merge base 18d97f0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 13, 2024

# Model arguments
model:
  _component_: torchtune.models.qwen2_5.qwen2_5_0_5b
Contributor

@ebsmothers ebsmothers Nov 13, 2024


I understand that you're changing the filename from 0_5B to 0.5B for consistency with other configs, but honestly I'd prefer to just move everything to 0_5B so it matches the builders (doesn't have to be in this PR, though).

Contributor

@felipemello1 felipemello1 Nov 13, 2024


+1. I would prefer that we avoid periods in names and use them only when they are part of a path or file type.

Contributor

@felipemello1 felipemello1 Nov 13, 2024


For Llama 3.2 we added 3.2 to the path, like you did, but not to the components.


Contributor Author


This is the same thing we have here, no?

Contributor

@calvinpelletier calvinpelletier left a comment


I thought we use underscores instead of periods? See #1863 (comment).

log_every_n_steps: 1
log_peak_memory_stats: False

# Profiler (disabled)
Contributor


Controversial take, but if we want consistency we should leave these in. I don't care too much for this PR, but I thought that was the whole point of a bunch of @felipemello1's changes. Either way, I would like to compress this config substantially in a separate PR.

Contributor


I am OK with making the profiler simpler in a separate PR.

Contributor

@ebsmothers ebsmothers left a comment


I hate decimal points

Contributor Author

joecummings commented Nov 13, 2024

> I thought we use underscores instead of periods?: #1863 (comment)

Yeah this is a misleading comment. We do use underscores for model builders, but the model should just get downloaded to a directory with the exact same name as the model on the Hub.
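
A minimal sketch of that distinction, assuming a checkpoint directory under `/tmp` and a Hub model named `Qwen2.5-0.5B-Instruct` (both are assumptions for illustration; only the builder name appears in this PR's diff):

```yaml
# Builder path: a Python identifier, so it uses underscores.
model:
  _component_: torchtune.models.qwen2_5.qwen2_5_0_5b

# Download directory: hypothetical path that mirrors the model's
# name on the Hub, periods included.
checkpointer:
  checkpoint_dir: /tmp/Qwen2.5-0.5B-Instruct
```

The builder has to be a valid Python name, which rules out periods; a directory name has no such constraint, so it can match the Hub name exactly.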

@joecummings joecummings merged commit 1eb7785 into pytorch:main Nov 13, 2024
16 checks passed
@joecummings joecummings deleted the update-qwen2.5-stuff branch November 13, 2024 19:32
joecummings added a commit that referenced this pull request Nov 13, 2024