
Conversation

@EduardDurech (Contributor) commented Aug 18, 2025

Pre-release of Apertus from the Swiss AI Initiative

Main modifications from Llama

  • xIELU Activation
  • QK-norm
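For readers unfamiliar with the two changes, here is a rough, illustrative Python sketch using plain floats and lists rather than tensors. The xIELU piecewise form and the alpha_p/alpha_n/beta/eps parameter names are assumptions based on the reference implementations discussed in this thread, not this PR's actual code; in the real model alpha_p/alpha_n are learned parameters.

```python
import math

def xielu(x, alpha_p=0.8, alpha_n=0.8, beta=0.5, eps=-1e-6):
    """Illustrative sketch of a piecewise xIELU-style activation.

    Positive side: a quadratic term plus a linear term.
    Negative side: an ELU-like expm1 term plus the same linear term.
    Coefficients here are hypothetical defaults, not the trained values.
    """
    if x > 0:
        return alpha_p * x * x + beta * x
    return alpha_n * (math.expm1(min(x, eps)) - x) + beta * x

def rms_norm(vec, weight=None, eps=1e-6):
    """RMSNorm as used for QK-norm: normalize a per-head query/key vector
    by its root mean square before attention (illustrative only)."""
    rms = math.sqrt(sum(v * v for v in vec) / len(vec) + eps)
    out = [v / rms for v in vec]
    if weight is not None:
        out = [o * w for o, w in zip(out, weight)]
    return out
```

QK-norm applies this per-head normalization to queries and keys, a technique commonly used to stabilize attention logits during training.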

Associated Transformers PR huggingface/transformers#39381

Co-author: @AllenHaoHuang


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the new-model (Requests to new models) label Aug 18, 2025
@gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the Apertus model and its xIELU activation function. The changes include a new model definition for Apertus, which correctly implements QK-norm, and a new XIELU activation layer. The model is also added to the test suite and model registries.

My review focuses on the implementation of the new activation function. I've found a critical issue in XIELU's forward method that would prevent the use of its CUDA kernel under torch.compile, leading to performance degradation. The provided suggestion should resolve this issue. The rest of the implementation appears correct and follows vLLM's conventions.
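The dispatch concern can be illustrated with a small stand-in class (this is not vLLM's actual CustomOp code; the class and attribute names here are hypothetical). The idea: resolve the backend once at construction time so the forward path is static and traceable by compilers like torch.compile, rather than branching on dynamically looked-up attributes inside forward, which can force a slow fallback path.

```python
class XIELUStandIn:
    """Stand-in for the native/CUDA dispatch pattern (illustrative only)."""

    def __init__(self, cuda_kernel=None):
        # Resolve the backend once, up front; the per-call path then has no
        # dynamic attribute lookups, which keeps traced graphs static.
        self._cuda_kernel = cuda_kernel
        self._forward = self.forward_cuda if cuda_kernel else self.forward_native

    def forward_native(self, x):
        # Pure-Python reference path (placeholder math, not real xIELU).
        return [v if v > 0 else 0.0 for v in x]

    def forward_cuda(self, x):
        # On this path the kernel object is guaranteed to exist.
        return self._cuda_kernel(x)

    def __call__(self, x):
        return self._forward(x)
```

Choosing the implementation in `__init__` mirrors the forward_native/forward_cuda convention mentioned later in this thread.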

@EduardDurech EduardDurech force-pushed the model/apertus branch 3 times, most recently from 732dd4d to e26ab90 on August 18, 2025 at 21:35

mergify bot commented Aug 26, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @EduardDurech.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Aug 26, 2025
@mergify mergify bot removed the needs-rebase label Aug 26, 2025
@EduardDurech EduardDurech marked this pull request as ready for review August 27, 2025 23:24
@DarkLight1337 (Member) left a comment

Thanks for contributing! I think the implementation looks good overall but please fix pre-commit

@EduardDurech (Contributor, Author) commented Aug 28, 2025

@DarkLight1337 I'm not sure what the errors are. Does it just want me to convert to forward_native and forward_cuda? Or to change to nn.Module if we keep forward? The activation is still in development, so I'm not sure of the right way to do it.

Comments @AllenHaoHuang?

@DarkLight1337 (Member)

I think you can just assert self._xielu_cuda_obj is not None
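The suggestion addresses the likely pre-commit failure: a type checker such as mypy flags use of an Optional attribute without narrowing it first. A minimal, generic illustration (hypothetical class and kernel, not the PR's actual code) of how the assert both narrows the type and fails fast:

```python
from typing import Callable, Optional

class Op:
    """Toy op with an optional compiled-kernel handle (illustrative)."""

    def __init__(self, kernel: Optional[Callable[[float], float]] = None):
        self._xielu_cuda_obj = kernel

    def forward_cuda(self, x: float) -> float:
        # Without this assert, a type checker reports that _xielu_cuda_obj
        # may be None; the assert narrows it to Callable on this path and
        # raises immediately if kernel setup was skipped.
        assert self._xielu_cuda_obj is not None
        return self._xielu_cuda_obj(x)
```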

@martinjaggi commented Aug 28, 2025

Just a quick comment: Hugging Face Transformers has already merged it into their main branch now, which should make the next steps here easier.

Also, the additional beta and eps parameters are now saved in the HF checkpoint. Is this taken into account here yet?

@EduardDurech (Contributor, Author)

@martinjaggi Yes, all my implementations are fully up to date.

Co-authored-by: AllenHaoHuang <allenhuangdd@gmail.com>
Signed-off-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com>
@EduardDurech (Contributor, Author)

@DarkLight1337 Not sure what the current issues are, but I fixed pre-commit.

@DarkLight1337 DarkLight1337 added the ready (ONLY add when PR is ready to merge/full CI is needed) label Aug 28, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 28, 2025 17:30
@DarkLight1337 (Member)

Fastcheck is optional, so you can ignore it.

@EduardDurech (Contributor, Author)

@DarkLight1337 I think the PR CI failure is just because huggingface/transformers#39381 was only recently merged.

Signed-off-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com>
auto-merge was automatically disabled August 29, 2025 04:04

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 29, 2025 04:34
Signed-off-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com>
auto-merge was automatically disabled August 29, 2025 09:56

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 merged commit 1cf3753 into vllm-project:main Aug 29, 2025
42 checks passed
@EduardDurech (Contributor, Author)

@DarkLight1337 We have a small fix for activation parameter loading for this model in #24100.

eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
Signed-off-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com>
Co-authored-by: AllenHaoHuang <allenhuangdd@gmail.com>
vermouth1992 pushed a commit to volcengine/verl that referenced this pull request Sep 13, 2025
Pre-release of Apertus from the Swiss AI Initiative

Main modifications from Llama

- xIELU Activation
- QK-norm

Associated Transformers PR
huggingface/transformers#39381
Associated vLLM PR vllm-project/vllm#23068
Associated SGLang PR sgl-project/sglang#9774

GSM8K
[Two GSM8K comparison plots attached in the original PR]
wlf-darkmatter pushed a commit to wlf-darkmatter/verl that referenced this pull request Sep 13, 2025
(Commit message identical to the PR description above: Apertus pre-release, xIELU activation, QK-norm, associated Transformers/vLLM/SGLang PRs, GSM8K plots.)

VocabVictor pushed a commit to VocabVictor/verl-plus that referenced this pull request Sep 24, 2025
(Commit message identical to the PR description above: Apertus pre-release, xIELU activation, QK-norm, associated Transformers/vLLM/SGLang PRs, GSM8K plots.)
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: EduardDurech <39579228+EduardDurech@users.noreply.github.com>
Co-authored-by: AllenHaoHuang <allenhuangdd@gmail.com>

Labels

new-model (Requests to new models), ready (ONLY add when PR is ready to merge/full CI is needed)


4 participants