Using auto-validations to help with user quality inference #29
Can you provide a more in-depth summary of what you found in this analysis
and the implications for us?
On Thu, Aug 1, 2019 at 2:58 PM Neil Chowdhury wrote:
The CV confidence ranges from 0-100; the higher the confidence, the more the CV model thinks its prediction is correct. We tried to use the CV confidence for each label to predict whether the label is correct.
[image: Screenshot from 2019-08-01 14-54-51]
<https://user-images.githubusercontent.com/17211794/62330097-80b00e00-b46c-11e9-9829-4bbb973cea03.png>
Each plot has two histograms: one for the CV confidence of correct labels and one for the CV confidence of incorrect labels. We can see, for example, that if a CurbRamp label has a CV confidence < 40, then it is probably incorrect.
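For concreteness, here is a minimal sketch of the kind of analysis being described, assuming a hypothetical pandas DataFrame `labels` with columns `label_type`, `cv_confidence` (0-100), and a boolean `is_correct`; these column names are illustrative and may not match the actual Project Sidewalk data:

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_confidence_histograms(labels: pd.DataFrame) -> None:
    """Plot correct vs. incorrect CV-confidence histograms, one panel per label type."""
    label_types = sorted(labels['label_type'].unique())
    fig, axes = plt.subplots(1, len(label_types),
                             figsize=(4 * len(label_types), 3), squeeze=False)
    for ax, lt in zip(axes[0], label_types):
        subset = labels[labels['label_type'] == lt]
        # Overlay the confidence distributions of correct and incorrect labels.
        ax.hist(subset.loc[subset['is_correct'], 'cv_confidence'],
                bins=20, range=(0, 100), alpha=0.5, label='correct')
        ax.hist(subset.loc[~subset['is_correct'], 'cv_confidence'],
                bins=20, range=(0, 100), alpha=0.5, label='incorrect')
        ax.set_title(lt)
        ax.set_xlabel('CV confidence')
        ax.legend()
    fig.tight_layout()
    plt.show()
```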
|
We also predict that if the CV label type matches the user label type, then the label is probably correct. The rows represent CV labels, the columns represent user labels, and the values represent the probability that a label with that specific CV label and user label is correct:

        CR          NCR         O           SP
CR      0.93353028  0.95294118  0.9118541   0.89325843
NCR     0.87033748  0.91358025  0.9109589   0.89855072
O       0.63453815  0.5483871   0.59813084  0.66071429
SP      0.69811321  0.6875      0.69662921  0.74647887
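As an illustration only (not the actual analysis code), a table like this could be computed by grouping on the CV and user label types, assuming the same hypothetical `labels` DataFrame plus illustrative columns `cv_label_type` and `user_label_type`:

```python
import pandas as pd

def correctness_by_agreement(labels: pd.DataFrame) -> pd.DataFrame:
    """Fraction of labels validated as correct for each (CV label, user label) pair."""
    return pd.pivot_table(labels,
                          values='is_correct',
                          index='cv_label_type',      # rows: the CV model's predicted label type
                          columns='user_label_type',  # columns: the user's label type
                          aggfunc='mean')
```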
|
I'm still not getting a sense of how this is useful. Can you write up a ~1-2 paragraph summary of your findings to complement the numbers? Can you articulate what you found and how it is useful?
|
I found that CV predictions are not reliable for predicting whether a label is correct. The histograms show little correlation between CV confidence and label accuracy. We also expected that if the CV label agrees with the human's label, then the label is more likely to be correct, but as the table shows, this is not the case. The CV model as it stands needs to be refined before it is useful for auto-validations. |
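One way to put a number on "not much correlation" would be the ROC AUC of CV confidence as a predictor of correctness, computed per label type (0.5 means no signal, 1.0 means perfect separation). A sketch under the same hypothetical DataFrame assumptions as above:

```python
from sklearn.metrics import roc_auc_score

def confidence_auc_by_type(labels):
    """ROC AUC of CV confidence as a predictor of label correctness, per label type."""
    aucs = {}
    for label_type, group in labels.groupby('label_type'):
        # AUC is undefined when a group contains only correct or only incorrect labels.
        if group['is_correct'].nunique() == 2:
            aucs[label_type] = roc_auc_score(group['is_correct'], group['cv_confidence'])
    return aucs
```

Values near 0.5 across label types would back up the conclusion that confidence alone carries little signal; a value well above 0.5 for a specific label type would suggest it is still usable there.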
What CV model are you using? Your investigation depends significantly on which ML model was used and how it was trained. Also, isn't this finding far more nuanced than your description implies, in that the CV model performs differently depending on label type (e.g., it's far more accurate for curb ramp labels)?
|
Yes, it could be useful for predicting the accuracy of CurbRamp labels. But keep in mind that 92.5% of CurbRamp labels are correct anyway. We used the DC model. |
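The base-rate point matters for any thresholding rule: flagging low-confidence CurbRamp labels is only useful if the flagged labels are wrong noticeably more often than the ~7.5% background error rate. A hedged sketch of that comparison, using the same hypothetical columns as above:

```python
def threshold_vs_base_rate(labels, label_type='CurbRamp', threshold=40):
    """Compare the error rate among low-confidence labels to the overall error rate."""
    subset = labels[labels['label_type'] == label_type]
    base_error_rate = 1 - subset['is_correct'].mean()
    flagged = subset[subset['cv_confidence'] < threshold]
    flagged_error_rate = 1 - flagged['is_correct'].mean()
    return base_error_rate, flagged_error_rate
```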
According to the plots, I don't think CV is very useful for predicting user accuracy yet. |
That surprises me. I think you should meet with Galen and discuss your
results.
|
FYI, if we want to use CV to predict user accuracy, we will also need to run it on all the labels. I only have predictions for ~4,000 labels out of the 65,700 total labels. |
I'd like us to investigate how we might incorporate the auto-validator CV algorithm into helping us predict user performance.
(Also, how many labels per user are necessary before the auto-validator becomes useful? Somewhat related to #27.)
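One possible shape for that investigation, sketched here with illustrative column names (`user_id`, `auto_validation`) rather than the actual schema: estimate each user's accuracy as the fraction of their labels the auto-validator agrees with, and only trust the estimate once the user has some minimum number of auto-validated labels:

```python
def estimate_user_accuracy(labels, min_labels=20):
    """Estimate per-user accuracy from auto-validated labels, given enough of them."""
    estimates = {}
    for user_id, group in labels.groupby('user_id'):
        # 'auto_validation' is assumed to be 1 (CV agrees) / 0 (CV disagrees), NaN if not run.
        validated = group.dropna(subset=['auto_validation'])
        if len(validated) >= min_labels:
            estimates[user_id] = validated['auto_validation'].mean()
    return estimates
```

Sweeping `min_labels` against held-out manual validations would also help answer the question above of how many labels per user the auto-validator needs before its estimates are useful.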