get_bugbug_labels no longer adds nobug type to regression training data #3396

Open: 1 commit to be merged into base branch master


avinashselvam

#539

Modified get_bugbug_labels in defect.py to include only those data points that are labelled either regression or bug_no_regression in the training set.
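
For readers skimming the thread, here is a minimal sketch of the filtering described above. It is not the actual code in defect.py: the function name `regression_training_labels` and the input shape (a dict of bug id to category string) are assumptions made for illustration. Only the category names `regression`, `bug_no_regression` and `nobug` come from this PR.

```python
# Categories kept for the regression model, as named in the PR description.
KEPT_CATEGORIES = {"regression", "bug_no_regression"}


def regression_training_labels(categories):
    """Map bug id -> 1 (regression) or 0 (bug, but not a regression).

    `categories` is a hypothetical dict of bug id -> category string.
    Any bug whose category is not in KEPT_CATEGORIES (for example
    "nobug") is dropped instead of being treated as a negative example.
    """
    return {
        bug_id: 1 if category == "regression" else 0
        for bug_id, category in categories.items()
        if category in KEPT_CATEGORIES
    }


# Illustrative bug ids only; these are not real Bugzilla entries.
sample = {101: "regression", 102: "bug_no_regression", 103: "nobug"}
print(regression_training_labels(sample))
```

With this shape, a `nobug` entry never reaches the training set, which would account for the smaller count of non-regression bugs reported after the change.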

Training the model without changes

72486 non-regression bugs

Cross Validation scores:
Accuracy: 0.9731263445549161 (+/- 0.0012810455820845609)
Precision: 0.9560802008310938 (+/- 0.006503421458310747)
Recall: 0.9316432362619518 (+/- 0.0042866900183067425)

Training the model after changes

71597 non-regression bugs (889 dropped)

Cross Validation scores:
Accuracy: 0.9739072259525028 (+/- 0.0019480324611321944)
Precision: 0.9561803892880535 (+/- 0.006928496874119621)
Recall: 0.9358629670750973 (+/- 0.0045683573571298)

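Recall improves slightly (0.9316 to 0.9359). Accuracy and precision are essentially unchanged, and all three differences fall within the reported +/- ranges.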
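
To put the two sets of scores side by side, here is a rough sketch that computes the change in each metric and checks it against the reported +/- values. The numbers are copied from the output above. The helper `compare_runs` and the "intervals do not overlap" rule are assumptions chosen for illustration, not a statistical test and not part of bugbug.

```python
# Mean and reported +/- value per metric, copied from the output in this PR.
BEFORE = {
    "accuracy": (0.9731263445549161, 0.0012810455820845609),
    "precision": (0.9560802008310938, 0.006503421458310747),
    "recall": (0.9316432362619518, 0.0042866900183067425),
}
AFTER = {
    "accuracy": (0.9739072259525028, 0.0019480324611321944),
    "precision": (0.9561803892880535, 0.006928496874119621),
    "recall": (0.9358629670750973, 0.0045683573571298),
}


def compare_runs(before, after):
    """Return metric -> (delta, clearly_separated)."""
    result = {}
    for metric, (mean_before, spread_before) in before.items():
        mean_after, spread_after = after[metric]
        delta = mean_after - mean_before
        # Assumption: call a change clearly separated only when the two
        # mean +/- spread intervals do not overlap.
        result[metric] = (delta, abs(delta) > spread_before + spread_after)
    return result


for metric, (delta, separated) in compare_runs(BEFORE, AFTER).items():
    print(f"{metric}: {delta:+.4f} clearly_separated={separated}")
```

Under this rough rule none of the three metrics is clearly separated between the two runs, so the change looks neutral to slightly positive rather than a definite gain.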

Should bugs in the task, enhancement, and feature categories also be removed from the training data for the regression model?

Please let me know if I have misunderstood the task.

@marco-c
Collaborator

marco-c commented Mar 31, 2023

@avinashselvam this is part of the request from #539. The other part is not to consider bugs with type "enhancement" or "task" as label 0 for the regression model.
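
One possible reading of that second part, as a minimal sketch only: bugs typed "enhancement" or "task" are skipped rather than used as negatives. The function name `regression_label` and the dict keys `type` and `is_regression` are hypothetical and are not taken from bugbug or from #539.

```python
# Bug types that should not be used as label 0, per the comment above.
EXCLUDED_TYPES = {"enhancement", "task"}


def regression_label(bug):
    """Return 1, 0, or None (skip) for a hypothetical bug dict.

    Assumed keys: "type" (a string) and "is_regression" (a bool).
    """
    if bug["is_regression"]:
        return 1
    if bug["type"] in EXCLUDED_TYPES:
        # Not a defect, so it is not evidence of "bug without regression".
        return None
    return 0


# Illustrative records only.
bugs = [
    {"id": 1, "type": "defect", "is_regression": True},
    {"id": 2, "type": "defect", "is_regression": False},
    {"id": 3, "type": "task", "is_regression": False},
]
labels = {b["id"]: regression_label(b) for b in bugs if regression_label(b) is not None}
print(labels)
```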

@suhaibmujahid
Member

@avinashselvam are you still interested in working on this? If so, I will be glad to help.
