@PatrykSaffer PatrykSaffer commented Oct 17, 2025

Purpose

Track expert metrics in MoE (mixture-of-experts) models.
This records expert usage per layer and rank usage per layer.
It adds roughly 1-2% end-to-end overhead.
Example results in Grafana: [image: balancedness]
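To make the tracked quantities concrete, here is a minimal sketch of how per-layer expert usage, per-rank usage, and a balancedness score could be computed from routing decisions. It is not taken from this PR's diff: the function names (`expert_histogram`, `rank_usage`, `balancedness`), the contiguous expert-to-rank layout, and the definition of balancedness as mean load divided by max load are all assumptions made for illustration.

```python
from typing import Sequence


def expert_histogram(topk_ids: Sequence[Sequence[int]], num_experts: int) -> list[int]:
    """Count how many token slots one layer routed to each expert.

    `topk_ids[t]` holds the expert ids selected for token `t` (hypothetical layout).
    """
    counts = [0] * num_experts
    for token_experts in topk_ids:
        for expert_id in token_experts:
            counts[expert_id] += 1
    return counts


def rank_usage(expert_counts: Sequence[int], num_ranks: int) -> list[int]:
    """Sum expert counts per rank.

    Assumes experts are sharded contiguously and evenly across ranks,
    which may not match the real placement.
    """
    if len(expert_counts) % num_ranks != 0:
        raise ValueError("number of experts must be divisible by number of ranks")
    per_rank = len(expert_counts) // num_ranks
    return [
        sum(expert_counts[r * per_rank:(r + 1) * per_rank])
        for r in range(num_ranks)
    ]


def balancedness(loads: Sequence[int]) -> float:
    """Mean load divided by max load (assumed definition); 1.0 is perfectly even.

    Returns 0.0 when nothing was routed at all.
    """
    peak = max(loads)
    if peak == 0:
        return 0.0
    return (sum(loads) / len(loads)) / peak


# Four tokens, top-2 routing, four experts spread over two ranks.
layer_topk = [[0, 1], [0, 2], [0, 3], [1, 0]]
per_expert = expert_histogram(layer_topk, num_experts=4)
per_rank = rank_usage(per_expert, num_ranks=2)
score = balancedness(per_rank)
```

In this toy input expert 0 receives half of all token slots, so the first rank carries most of the load and the score falls well below 1.0. A real implementation would accumulate these counts on device and export them periodically rather than looping in Python.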

Test Plan

Added a test, test_expert_usage_histogram.py.

Test Result

The test passed.

@PatrykSaffer PatrykSaffer marked this pull request as ready for review October 17, 2025 14:04
@mergify mergify bot added the v1 label Oct 17, 2025
@PatrykSaffer PatrykSaffer changed the title expert histogram Expert histogram logging Oct 17, 2025
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR.

Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com>
@PatrykSaffer PatrykSaffer force-pushed the patryk/expert-usage-histogram branch from 44f77ec to 8328a52 Compare October 17, 2025 14:27
Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com>
Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com>
Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com>
@PatrykSaffer PatrykSaffer changed the title Expert histogram logging [EPLB] Expert histogram logging Oct 29, 2025
@hmellor hmellor linked an issue Oct 30, 2025 that may be closed by this pull request
@abmfy
Member

abmfy commented Nov 11, 2025

This looks pretty cool! Please ping me when it’s ready for review.

Development

Successfully merging this pull request may close these issues.

[Feature]: Record EPLB metric, can in /metric api get info