Using auto-validations to help with user quality inference #29

Open
jonfroehlich opened this issue Jul 18, 2019 · 11 comments
Labels: discussion (Proposing Ideas or discussions), enhancement (New feature or request)

Comments

@jonfroehlich
Member

I'd like us to investigate how we might incorporate the auto-validator CV algorithm to help us predict user performance.

(Also, how many labels per user are necessary before the auto-validator becomes useful? Somewhat related to #27.)

@nch0w
Contributor

nch0w commented Aug 1, 2019

The CV confidence ranges from 0-100. The higher the confidence, the more the CV model thinks its prediction is correct.

We tried to use the CV confidence of each label to predict whether that label is correct.

[Screenshot from 2019-08-01 14-54-51]
Each plot has two histograms: one for the CV confidence of correct labels and one for the CV confidence of incorrect labels. For example, a CurbRamp label with a CV confidence below 40 is probably incorrect.
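The thresholding idea can be sketched as follows. This is a minimal illustration, not the analysis code behind the plots: the `(confidence, is_correct)` pairs are made-up sample data, and `threshold_accuracy` is a hypothetical helper name.

```python
def threshold_accuracy(labels, threshold):
    """Fraction of labels for which the rule
    'confidence >= threshold means correct' matches the ground truth."""
    hits = sum((conf >= threshold) == is_correct for conf, is_correct in labels)
    return hits / len(labels)


# Made-up (cv_confidence, is_correct) pairs for a single label type.
sample = [(92, True), (85, True), (35, False), (70, True), (20, False), (55, False)]

# Try a few cutoffs and see how well each one separates correct from incorrect.
for t in (40, 60):
    print(f"threshold={t}: accuracy={threshold_accuracy(sample, t):.3f}")
```

Sweeping the threshold like this per label type is one way to check whether a cutoff such as 40 actually separates the two histograms.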

@jonfroehlich
Member Author

jonfroehlich commented Aug 1, 2019 via email

@nch0w
Contributor

nch0w commented Aug 1, 2019

We also expected that a label is probably correct when the CV label type matches the user label type.

The rows represent CV labels, the columns represent user labels, and the values represent the probability that a label with that specific CV label and user label is correct.

        CR          NCR         O           SP
CR:     0.93353028  0.95294118  0.9118541   0.89325843
NCR:    0.87033748  0.91358025  0.9109589   0.89855072
O:      0.63453815  0.5483871   0.59813084  0.66071429
SP:     0.69811321  0.6875      0.69662921  0.74647887
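A table like this can be computed by grouping labels on the (CV label, user label) pair and taking the fraction that were validated as correct. The sketch below assumes each record is a `(cv_label, user_label, is_correct)` tuple; that record shape, the function name, and the sample records are illustrative assumptions, not the project's actual data format.

```python
from collections import defaultdict

LABEL_TYPES = ["CR", "NCR", "O", "SP"]


def correctness_matrix(records):
    """records: iterable of (cv_label, user_label, is_correct).
    Returns {cv_label: {user_label: fraction correct, or None if no data}}."""
    counts = defaultdict(lambda: [0, 0])  # (cv, user) -> [n_correct, n_total]
    for cv, user, ok in records:
        counts[(cv, user)][0] += int(ok)
        counts[(cv, user)][1] += 1

    matrix = {}
    for cv in LABEL_TYPES:  # rows: CV labels
        matrix[cv] = {}
        for user in LABEL_TYPES:  # columns: user labels
            n_correct, n_total = counts[(cv, user)]
            matrix[cv][user] = n_correct / n_total if n_total else None
    return matrix


# Made-up records, only to show the shape of the computation.
records = [
    ("CR", "CR", True), ("CR", "CR", True), ("CR", "CR", False),
    ("O", "CR", True), ("O", "CR", False),
]
m = correctness_matrix(records)
print(m["CR"]["CR"], m["O"]["CR"], m["SP"]["SP"])
```

Cells with no observations are returned as `None` rather than 0 so that an empty cell is not mistaken for a cell where every label was wrong.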

@jonfroehlich
Member Author

jonfroehlich commented Aug 1, 2019 via email

@nch0w
Contributor

nch0w commented Aug 2, 2019

I found that CV predictions are not reliable for predicting whether a label is correct. The histograms show little correlation between CV confidence and label correctness. We also expected that a label would be more likely to be correct when the CV prediction agrees with the user's label, but the table does not support this: the diagonal entries, where the CV and user labels agree, are not consistently higher than the off-diagonal entries.

The CV model as it stands must be refined before it is useful for auto-validations.
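The agreement check can be made concrete with the values from the Aug 1 table above. The sketch below asks, for each user label type (column), which CV label (row) has the highest probability of correctness. The function name is hypothetical, and the comparison ignores how many labels fall in each cell, since those counts are not given in this thread.

```python
TYPES = ["CR", "NCR", "O", "SP"]

# Rows are CV labels, columns are user labels (values copied from the Aug 1 table).
P_CORRECT = [
    [0.93353028, 0.95294118, 0.9118541, 0.89325843],
    [0.87033748, 0.91358025, 0.9109589, 0.89855072],
    [0.63453815, 0.5483871, 0.59813084, 0.66071429],
    [0.69811321, 0.6875, 0.69662921, 0.74647887],
]


def best_cv_per_user_label(matrix):
    """For each user label (column), return the CV label (row) whose cell
    has the highest probability of the label being correct."""
    best = []
    for j in range(len(TYPES)):
        column = [matrix[i][j] for i in range(len(TYPES))]
        best.append(TYPES[column.index(max(column))])
    return best


for user, cv in zip(TYPES, best_cv_per_user_label(P_CORRECT)):
    marker = "agrees" if user == cv else "disagrees"
    print(f"user={user}: highest P(correct) when CV={cv} ({marker})")
```

With these values, CV agreement gives the highest probability only for the CR column; in the other columns the highest value comes from a CV label that differs from the user's label.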

@jonfroehlich
Member Author

jonfroehlich commented Aug 2, 2019 via email

@nch0w
Contributor

nch0w commented Aug 2, 2019

Yes, it could be useful for predicting the accuracy of CurbRamp labels. But keep in mind that 92.5% of CurbRamp labels are correct anyway, so any CV-based rule has to beat that baseline to add value.

We used the DC model.
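To illustrate the base-rate point: a rule that predicts "correct" for every CurbRamp label already scores 92.5%, so a CV-based rule is only useful if it does better than that. In the sketch below, the 0.925 base rate comes from this thread; the confusion counts and helper names are hypothetical.

```python
def majority_baseline(base_rate):
    """Accuracy of always predicting the more common class."""
    return max(base_rate, 1 - base_rate)


def rule_accuracy(tp, fp, tn, fn):
    """Accuracy of a rule from its confusion counts, where 'positive'
    means the rule predicts that the label is correct."""
    return (tp + tn) / (tp + fp + tn + fn)


BASE_RATE = 0.925  # share of CurbRamp labels that are correct (from the thread)

# Hypothetical counts for 1,000 CurbRamp labels (925 correct, 75 incorrect).
acc = rule_accuracy(tp=900, fp=45, tn=30, fn=25)
gain = acc - majority_baseline(BASE_RATE)
print(f"rule accuracy={acc:.3f}, baseline={majority_baseline(BASE_RATE):.3f}, gain={gain:+.3f}")
```

With a base rate this high, plain accuracy leaves little room for improvement, so it may be more informative to look at how many of the incorrect labels a rule catches.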

@nch0w
Contributor

nch0w commented Aug 12, 2019

Here are the plots and table, updated with new predictions from Devesh.

[Screenshot from 2019-08-12 10-13-25]

        CR          NCR         O           SP
CR:     0.91929825  0.90588235  0.89781022  0.88343558
NCR:    0.82674772  0.88870432  0.90204082  0.86631016
O:      0.53398058  0.504       0.50246305  0.56
SP:     0.64705882  0.63694268  0.7         0.6

@nch0w
Contributor

nch0w commented Aug 12, 2019

Based on the updated plots, I don't think CV is very useful for predicting user accuracy yet.

@jonfroehlich
Member Author

jonfroehlich commented Aug 12, 2019 via email

@nch0w
Contributor

nch0w commented Aug 13, 2019

FYI, if we want to use CV to predict user accuracy, we will also need to run it on all the labels. I only have predictions for ~4,000 labels out of the 65,700 total labels.
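For a sense of scale, the coverage implied by those figures can be worked out directly. Both counts are the approximate numbers quoted above; the variable names are only for illustration.

```python
predicted = 4_000      # labels with CV predictions so far (approximate)
total = 65_700         # total labels

coverage = predicted / total
remaining = total - predicted
print(f"coverage: {coverage:.1%}, labels still needing predictions: {remaining:,}")
```

So only about 6% of labels currently have CV predictions, and roughly 61,700 would still need to be run through the model.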

daotyl000 added the discussion and enhancement labels on Aug 15, 2019