Follow up on Exposure Notification API testing by Fraunhofer IIS #394
Comments
Sorry for not replying earlier. We just reached out again to Fraunhofer IIS for additional information, as also mentioned in #312 (comment). Best regards, |
Hello @kbobrowski, thanks for starting the discussion. I know it has been a while since you asked, so I am happy to get back to you today with an answer from the RKI. Thank you, Corona-Warn-App Open Source Team |
@SebastianWolf-SAP @GPclips thanks for keeping us in the loop. I just noticed (via the community Slack) a blog post comparing the Fraunhofer IIS study with the study done by Prof. Leith's team:
Quoting the accuracy value here (79% in the Fraunhofer IIS study) may be misleading, because the study had a highly uneven class distribution (the ground truth was 103 contacts with no exposure and 40 contacts with exposure). Even if CWA did not work at all (no contacts registered), the accuracy would be 103 / (103 + 40) = 72%. Recall and F1 score give a less misleading picture of the system's true performance, and these values were lower (47% and 56%). It would be great to have a comparison between the next Fraunhofer IIS study and Prof. Leith's study using comparable statistical measures. I would also suggest that public communication refrain from quoting accuracy values without mentioning the uneven class distribution, as this can easily mislead the reader, or at least also quote the reference accuracy for a non-functioning app (72%). |
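The baseline argument can be checked with a few lines of Python. This is only a minimal sketch: the helper `binary_metrics` and its argument names are made up for illustration, and the only numbers taken from the comment above are the class counts (103 contacts without exposure, 40 with exposure). It scores a hypothetical app that never registers any contact.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, recall, f1) for a binary confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, recall, f1


# Class counts quoted in the comment above.
NO_EXPOSURE, EXPOSURE = 103, 40

# Hypothetical "silent" app: every contact is classified as "no exposure",
# so all 40 real exposures are missed and all 103 non-exposures are correct.
accuracy, recall, f1 = binary_metrics(tp=0, fp=0, tn=NO_EXPOSURE, fn=EXPOSURE)
print(f"accuracy={accuracy:.0%} recall={recall:.0%} f1={f1:.0%}")
```

The silent app still scores about 72% accuracy while its recall and F1 are both zero, which is why the latter two are more informative for an imbalanced test set like this one.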
cc @doug-leith |
(@kbobrowski SwissCovid seems to have suffered from the same problems in the public reporting of accuracy/recall, see DP-3T/bt-measurements#4) |
@pdehaye I think the problem with reporting accuracy is that many people understand "accuracy" in this context as "how well the app detects exposures", not as "how well the app stays silent if there is no exposure + how well the app detects exposures", which is the correct interpretation. But the main difference between Prof. Leith's study and the Fraunhofer IIS study is that the former was done in an enclosed metal space (tram / bus) and the latter in a large open space, so the two are not really comparable. |
Hello @kbobrowski and community, I just reached out to the RKI asking for the September test results. We will inform you as soon as we can and keep you in the loop. Thanks, Corona-Warn-App Open Source Team |
@GPclips is there any news regarding the Fraunhofer follow-up study that you can share with us? 🙂 |
This issue is probably no longer relevant after ENFv2 was integrated into the app. I don't know whether it would make sense to also discuss more recent tests of the ENA here; it would probably be best to create a new issue for that. Corona-Warn-App Open Source Team |
@heinezen I am still interested in learning the results of the September Fraunhofer tests, as the underlying technology at the fundamental level (BLE attenuation as a distance proxy) didn't change between ENFv1 and ENFv2, and those tests should at least provide some insight into that. Additionally, if there are any new tests, they would of course also be interesting 🙂. |
@daimpi I'll try to get a hold of the results :) Corona-Warn-App Open Source Team |
Your Question
Thanks for uploading the API testing documentation! If I understand it correctly, the tests were performed using 55dB and 60dB attenuation buckets with weights [1.0, 0.5, 0.0]. This resulted in a quite low recall of 47%, meaning that only about half of the relevant contacts with an infected user were properly registered. Assuming the current adoption rate of 20%, this is further reduced to only about a 10% chance of being notified about a contact with an infected person, assuming everyone uses the app properly all the time and always uploads keys.
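The back-of-the-envelope estimate can be reproduced as follows. This is a sketch only: it simply multiplies the two figures quoted above, which treats adoption and detection as independent (an assumption of this sketch, not something stated in the test documentation).

```python
recall = 0.47         # recall reported in the test documentation
adoption_rate = 0.20  # adoption rate assumed in the question

# Assumption: the two factors are independent, so they simply multiply.
notification_chance = recall * adoption_rate
print(f"{notification_chance:.1%}")
```

The product is a little under 10%, which matches the rounded figure in the question.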
After this test, the RKI increased the upper bucket threshold from 60dB to 63dB, but this seems like a small modification. Were there any further tests to confirm how this change affected the recall value? Perhaps the third bucket could be used to also register contacts in situations with heavy shadowing? This is related to the "yellow card" proposal (https://github.com/corona-warn-app/cwa-app-android/issues/899#issuecomment-663475550)
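To make the bucket question concrete, here is a minimal sketch of how attenuation buckets with weights [1.0, 0.5, 0.0] could turn measured time into a weighted exposure time. The function `weighted_exposure_minutes`, the sample contact, and the handling of values exactly on a threshold are all assumptions made for illustration; this is not the actual CWA or Exposure Notification implementation.

```python
def weighted_exposure_minutes(samples, thresholds=(55, 60), weights=(1.0, 0.5, 0.0)):
    """Sum the weighted time of (attenuation_db, minutes) samples.

    Bucket 1: attenuation <= low threshold
    Bucket 2: low threshold < attenuation <= high threshold
    Bucket 3: attenuation > high threshold
    (Boundary handling is an assumption of this sketch.)
    """
    low, high = thresholds
    total = 0.0
    for attenuation_db, minutes in samples:
        if attenuation_db <= low:
            total += weights[0] * minutes
        elif attenuation_db <= high:
            total += weights[1] * minutes
        else:
            total += weights[2] * minutes
    return total


# Made-up contact: 5 minutes at 50 dB, then 10 minutes at 62 dB
# (for example because a body is shadowing the signal).
contact = [(50, 5), (62, 10)]

before = weighted_exposure_minutes(contact, thresholds=(55, 60))
after = weighted_exposure_minutes(contact, thresholds=(55, 63))
print(before, after)  # 5.0 10.0
```

In this made-up example, the 62 dB portion counts for nothing with the 60dB threshold and for half its duration with 63dB. Anything above the second threshold still contributes zero as long as the third weight is 0.0, which is the case the third-bucket suggestion is aimed at.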