-
Notifications
You must be signed in to change notification settings - Fork 21
Label_Microservice worker should use the new Model classes #85
Conversation
* Most of the inference logic in the Worker class can be deleted and we can instead use the newly created Model classes. * Use an in cluster deployment service * Log the model and thresholds. * cli.py is a simple CLI to send pubsub requests * It is ended for development and debugging. * Create a utility to parse issue specs of the form {org}/{repo}#number * We need to initialize IssueLabelPredictor() inside the pubsub thread; otherwise we have threading issues when trying to access the model. * The universal model needs to apply appropriate thresholds to its predictions. * Worker no longer needs to apply thresholding of predictions because that is now handled by the individual model classes. * Add a method to worker apply_repo_config that handles repo specific behavior specified in the YAML file in the repo * Apply aliases to the labels * If the config specifies which labels to predict then filter out all other labels. * When commenting on issues about the predictions, the bot should reate a markdown table with the probabilities for the labels in the comment. * Remove test_worker.py; the test is no longer relevant after the refactor. There is a TODO in worker.py to create an appropriate unittest for the new code. * Change how the GitHub App Private Keys are handled. * The private key should be mounted as a secret and an environment variable should specify the path of the PEM key. * Stop storing the PEM key as an environment variable and then in python code writing it to a file. * Change the name of the environment variables to add the prefix "GITHUB_" to make it more explanatory. * Move some of the helper methods for creating GitHubApp out of github_util and into the GitHubApp class Related to: kubeflow#70 combine universal and repo specific models.
/assign @hamelsmu |
if not m: | ||
return None, None, None | ||
return m.group(1), m.group(2), int(m.group(3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe return a named tuple? Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry looks like this got merged before I saw this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor question otherwise,
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hamelsmu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm Going ahead and stamping with LGTM b/c only had a minor question which is optional |
Most of the inference logic in the Worker class can be deleted and
we can instead use the newly created Model classes.
Use an in cluster deployment service
Log the model and thresholds.
cli.py is a simple CLI to send pubsub requests
Create a utility to parse issue specs of the form {org}/{repo}#number
We need to initialize IssueLabelPredictor() inside the pubsub thread;
otherwise we have threading issues when trying to access the model.
The universal model needs to apply appropriate thresholds to its predictions.
Worker no longer needs to apply thresholding of predictions because that is
now handled by the individual model classes.
Add a method to worker apply_repo_config that handles repo specific behavior
specified in the YAML file in the repo
labels.
When commenting on issues about the predictions, the bot should reate a markdown table with the probabilities for the labels in the comment.
Remove test_worker.py; the test is no longer relevant after the refactor.
There is a TODO in worker.py to create an appropriate unittest for the new
code.
Change how the GitHub App Private Keys are handled.
The private key should be mounted as a secret and an environment variable
should specify the path of the PEM key.
Stop storing the PEM key as an environment variable and then in python code writing it to a file.
Change the name of the environment variables to add the prefix "GITHUB_"
to make it more explanatory.
Move some of the helper methods for creating GitHubApp out of github_util
and into the GitHubApp class
Related to: #70 combine universal and repo specific models.
This change is