Onboard to the GitHub workflow based issue-labeler #7964

Merged 1 commit on Mar 8, 2025

Conversation

jeffhandley
Member

A new implementation of dotnet/issue-labeler is available, built entirely on GitHub workflows. This approach allows for self-service onboarding and re-training of the prediction models. Onboarding instructions are documented, and anyone with write permission to this repository should be able to work through the initial model training; @RussKie and I are both already familiar with the process.

I ran a test training pass from this repository into my fork. The workflow run can be seen here, and the model's accuracy looks like a good starting baseline. It took less than 12 minutes to train the model.

  • Issues tested: 2834
    • The predicted label matched the existing label: 2706 (95.48 %)
    • The predicted label does not match the existing label: 46 (1.62 %)
    • No prediction was made: 82 (2.89 %)
    • A prediction was made, but no existing label was present: 0 (0.00 %)
  • Pulls tested: 2885
    • The predicted label matched the existing label: 2795 (96.88 %)
    • The predicted label does not match the existing label: 74 (2.56 %)
    • No prediction was made: 16 (0.55 %)
    • A prediction was made, but no existing label was present: 0 (0.00 %)
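
As a sanity check on the figures above, the percentages can be recomputed from the raw counts. This is only a minimal sketch: the function name and dictionary keys are hypothetical and are not part of dotnet/issue-labeler, and it assumes each percentage is the count divided by the total number of items tested, rounded to two decimals.

```python
def outcome_percentages(total, matched, mismatched, no_prediction, unlabeled=0):
    """Return each test outcome as a percentage of `total`, rounded to 2 decimals.

    The parameter names are illustrative; they mirror the four outcome
    categories in the test report above.
    """
    counts = {
        "matched": matched,
        "mismatched": mismatched,
        "no_prediction": no_prediction,
        "unlabeled": unlabeled,
    }
    # The four outcomes should account for every item tested.
    if sum(counts.values()) != total:
        raise ValueError("outcome counts do not add up to the total")
    return {name: round(100 * count / total, 2) for name, count in counts.items()}


# Counts copied from the test run reported above.
issues = outcome_percentages(2834, matched=2706, mismatched=46, no_prediction=82)
pulls = outcome_percentages(2885, matched=2795, mismatched=74, no_prediction=16)

print(issues)
print(pulls)
```

Under that assumption the recomputed values agree with the reported percentages for both issues and pulls.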

@jeffhandley jeffhandley self-assigned this Mar 8, 2025
@Copilot Copilot AI review requested due to automatic review settings March 8, 2025 03:58
Contributor

@Copilot Copilot AI left a comment


PR Overview

This PR introduces a new GitHub workflow-based implementation of the issue-labeler with self-service onboarding and retraining of prediction models. Key changes include the addition of workflows for training, caching, predicting for issues and pull requests, promoting models, and building the predictor app.

Reviewed Changes

| File | Description |
| --- | --- |
| .github/workflows/labeler-train.yml | Defines training parameters and triggers using workflow_dispatch inputs |
| .github/workflows/labeler-cache-retention.yml | Schedules cache retention tasks with a conditional run on dotnet repos |
| .github/workflows/labeler-predict-pulls.yml | Sets up pull request label prediction with conditional logic |
| .github/workflows/labeler-predict-issues.yml | Configures issue label prediction with workflow_dispatch inputs |
| .github/workflows/labeler-promote.yml | Adds workflow for promoting staged models to the live environment |
| .github/workflows/labeler-build-predictor.yml | Provides a build workflow for the predictor app |

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Member

@danmoseley danmoseley left a comment


yay! thanks

@danmoseley danmoseley merged commit cf14299 into dotnet:main Mar 8, 2025
135 checks passed
@danmoseley danmoseley added the area-engineering-systems infrastructure helix infra engineering repo stuff label Mar 8, 2025
@jeffhandley jeffhandley deleted the jeffhandley/issue-labeler branch March 8, 2025 23:37
@jeffhandley
Member Author

I fired off the job to train the models. I set it to train directly into the LIVE cache entries so the new models take effect immediately upon completion.

https://github.com/dotnet/aspire/actions/runs/13742535771

@jeffhandley
Member Author

Training took less than 10 minutes and all is working. New issues and pull requests will be labeled. If you want to have it backfill existing issue/pull labels, you can manually trigger the respective jobs. I'd expect it to be able to label everything in history in 30 minutes or less.

The FAQ · dotnet/issue-labeler Wiki has notes about when you would need to retrain.
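
For anyone who prefers to trigger the backfill from a script rather than the Actions UI, the following is a rough sketch using GitHub's REST endpoint for workflow dispatch events. The workflow file names come from the review summary above; everything else is an assumption: the helper name is hypothetical, the `main` ref is taken from the merge target, and the inputs each workflow expects are not shown in this conversation, so none are filled in here. Check the workflow files before sending anything.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, token, ref="main", inputs=None):
    """Build (but do not send) a workflow_dispatch request for one workflow.

    Hypothetical helper for illustration. `inputs` must match whatever the
    target workflow declares; those inputs are not known from this thread.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches"
    payload = {"ref": ref}
    if inputs:
        payload["inputs"] = inputs
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


# Placeholder token; a real token needs permission to run Actions workflows.
request = build_dispatch_request(
    "dotnet", "aspire", "labeler-predict-issues.yml", token="YOUR_TOKEN"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it. The same helper should work for `labeler-predict-pulls.yml`, again assuming the required inputs are supplied.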

@danmoseley
Member

seems to be working backfilling.

Labels
area-engineering-systems infrastructure helix infra engineering repo stuff

2 participants