
[megatron] feat: support qwen2 megatron backend #261

Merged: 8 commits into volcengine:main on Feb 19, 2025

Conversation

@kinman0224 (Contributor) commented Feb 13, 2025

Support Qwen2 Megatron backend

The code is primarily adapted from the llama folder, with modifications in verl/models/qwen2/megatron/layers/parallel_attention.py to use a QKV bias and to remove the rope_scaling from RoPE (see the sketch after the list below).

  • Training Qwen2-7B-Instruct with PPO reaches a GSM8k score of 0.87 at step 75.
  • The checkpoint saver is not supported yet.
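For illustration, here is a minimal sketch of the QKV-bias difference described above. It is not the verl implementation (the real layer in parallel_attention.py uses Megatron tensor-parallel linears), and the class name is hypothetical:

import torch
import torch.nn as nn

class AttentionQKVSketch(nn.Module):
    # Illustrative only: plain nn.Linear stands in for Megatron's
    # tensor-parallel linear layers.
    def __init__(self, hidden_size: int, qkv_bias: bool = True):
        super().__init__()
        # Qwen2 projects Q, K, V with a learned bias (qkv_bias=True);
        # the llama code this PR was adapted from uses bias=False.
        self.qkv_proj = nn.Linear(hidden_size, 3 * hidden_size, bias=qkv_bias)

    def forward(self, hidden_states: torch.Tensor):
        q, k, v = self.qkv_proj(hidden_states).chunk(3, dim=-1)
        # Plain RoPE would be applied to q and k here; the Qwen2 port
        # drops the rope_scaling option that some llama variants use.
        return q, k, v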

@kinman0224 kinman0224 marked this pull request as draft February 13, 2025 02:25
@kinman0224 kinman0224 marked this pull request as ready for review February 13, 2025 03:47
@@ -282,6 +282,7 @@ def mistral_megatron_weight_loader(actor_weights: Dict, vllm_model: nn.Module) -
'LlamaForCausalLM': llama_megatron_weight_loader, # use te backend for open-source megatron
'LLaMAForCausalLM': llama_megatron_weight_loader,
'MistralForCausalLM': mistral_megatron_weight_loader,
'Qwen2ForCausalLM': llama_megatron_weight_loader,
Collaborator:
Did you test them all? Maybe including only v0.6.3 is sufficient.

@kinman0224 (Contributor Author) replied:

Yes, I have tested the loaders on v0.4.2, v0.5.3, and v0.6.3, but the score was only verified on v0.6.3. Maybe I should remove it in v0.4.2 and v0.5.3?
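For context, here is a self-contained sketch of how a registry like the one in the diff dispatches on the HF architecture string; the registry and helper names are hypothetical stand-ins, not verl's actual identifiers:

from typing import Dict
import torch.nn as nn

def llama_megatron_weight_loader(actor_weights: Dict, vllm_model: nn.Module) -> nn.Module:
    # Stand-in body: the real loader copies Megatron-sharded actor
    # weights into the vLLM rollout model.
    return vllm_model

# Qwen2 reuses the llama loader because its Megatron layer layout matches.
MODEL_MEGATRON_WEIGHT_LOADER_REGISTRY = {
    'LlamaForCausalLM': llama_megatron_weight_loader,
    'LLaMAForCausalLM': llama_megatron_weight_loader,
    'Qwen2ForCausalLM': llama_megatron_weight_loader,
}

def load_megatron_weights(arch: str, actor_weights: Dict, vllm_model: nn.Module) -> nn.Module:
    # arch comes from config.architectures[0] in the HF config.
    return MODEL_MEGATRON_WEIGHT_LOADER_REGISTRY[arch](actor_weights, vllm_model)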

@@ -0,0 +1,42 @@
set -x
Collaborator:

Could you later also add a section to https://github.com/volcengine/verl/blob/main/docs/experiment/ppo.rst, add a new table for MATH, and include the logs, command, and test score in the next PR?

@vermouth1992 (Collaborator) commented:
Could you add Qwen2.5-0.5B to the CI?

@kinman0224 kinman0224 marked this pull request as draft February 15, 2025 03:58
@Viper403 commented Feb 17, 2025

Hi! Thank you for your work. Is there any plan to support the saver for Qwen2?

@kinman0224 (Contributor Author) replied:
Yes, I have noticed that. I am working on it.

@kinman0224 kinman0224 marked this pull request as ready for review February 18, 2025 16:29
@kinman0224 (Contributor Author) commented Feb 18, 2025

It can now handle tie_word_embedding=True (a short sketch of what this implies follows below).

The result will be updated in the next PR.
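For readers unfamiliar with the flag: tie_word_embedding=True means the output projection shares one weight matrix with the input embedding, so a loader must map a single checkpoint tensor to both places. A minimal sketch (not verl's model code; sizes are illustrative, roughly Qwen2.5-0.5B):

import torch
import torch.nn as nn

class TiedLMSketch(nn.Module):
    def __init__(self, vocab_size: int = 151936, hidden_size: int = 896):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
        # Tie the weights: both modules now reference the same Parameter,
        # so loading and saving must treat them as one tensor.
        self.lm_head.weight = self.embed_tokens.weight

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.embed_tokens(input_ids))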

@vermouth1992 vermouth1992 enabled auto-merge (squash) February 19, 2025 14:17
@vermouth1992 vermouth1992 merged commit 9448762 into volcengine:main Feb 19, 2025
12 checks passed
return layer_map


def merge_megatron_ckpt_llama(wrapped_models, config, is_value_model=False, dtype='bf16'):

Review comment:

The function name here seems to be a typo? (llama -> qwen2)

@kinman0224 (Contributor Author) replied:

Yes, this is a typo. This PR does not support the saver yet, but it may be supported in the future.

@kinman0224 kinman0224 deleted the kinman/feat_qwen_megatron branch February 20, 2025 05:49