Exposure risk calculation algorithm #24

Stypox · 2020-07-04T21:53:13Z

For now I have added javadocs to risk calculation parameters and functions, containing references to Google and Apple's developer websites.
I used the AndroidX @Nullable and @NonNull annotations; is that OK? The JetBrains annotations showed up red, since as far as I know we are not using a JetBrains library.
I also added default values for minimumRiskScore and durationAtAttenuationThresholds in ExposureConfiguration, as described in Apple documentation.

Stypox · 2020-07-04T21:53:44Z

Writing the documentation helped me organize my ideas, and I noticed something strange: the ExposureConfiguration class contains a transmissionRiskScores field whose values have a user-defined meaning. I am not sure how the risk calculation algorithm is supposed to handle them if their meaning is up to the user.

BjoernPetersen · 2020-07-06T13:57:12Z

Using AndroidX nullability annotations is perfectly fine. I think that was actually the only file in this project where the JetBrains annotations were used.

As for the transmissionRiskScores: I'm not sure yet how the right transmissionRiskScore is selected for each encounter, but it seems we'll just multiply the scores in the end, so we don't need to understand the semantics of the individual values.

RiskScore = attenuationScore * daysSinceLastExposureScore * durationScore * transmissionRiskScore
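
As a minimal sketch of that product (the class and method names are made up for illustration and are not the ones used in this PR), assuming each of the four scores has already been looked up from its bucket:

```java
/** Hypothetical helper for illustration only; not the class from this PR. */
final class RiskScoreProduct {
    private RiskScoreProduct() {}

    /**
     * Multiplies the four per-category scores, following the formula above.
     * A score of 0 in any category therefore zeroes the whole result.
     */
    static int riskScore(int attenuationScore,
                         int daysSinceLastExposureScore,
                         int durationScore,
                         int transmissionRiskScore) {
        return attenuationScore
                * daysSinceLastExposureScore
                * durationScore
                * transmissionRiskScore;
    }
}
```

Because the scores are only multiplied, the code never needs to know what an individual transmissionRiskScore means, which is the point made above.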

Stypox · 2020-07-06T18:16:41Z

I added the risk calculation algorithm. It is surely not in the right place yet, and the semantics may be off, but it isn't difficult to move code if needed ;-)

theScrabi · 2020-07-07T06:08:45Z

How do we know the algorithm works correctly? We don't have test vectors, and we can't simply extract them from the original app like we did with Sk and RPIs for the CryptoModul.

Or can we? Would it be possible?

BjoernPetersen · 2020-07-07T14:13:12Z

Please change the ExposureConfiguration class to be immutable.

- Change the visibility of all fields to private
- Copy any arrays before returning them in public getters
- Don't modify an existing configuration in the ExposureConfigurationBuilder, but use mutable fields in the builder instead
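
A rough sketch of what these three points could look like together. The class name, the reduced set of fields, and the default values are placeholders for illustration, not the actual ExposureConfiguration from this PR:

```java
/** Hypothetical, trimmed-down configuration class; only two fields are shown. */
final class ImmutableConfigSketch {
    // All fields are private and final, so an instance cannot change after build().
    private final int minimumRiskScore;
    private final int[] attenuationScores;

    private ImmutableConfigSketch(int minimumRiskScore, int[] attenuationScores) {
        this.minimumRiskScore = minimumRiskScore;
        // Copy on the way in, so later changes to the caller's array have no effect.
        this.attenuationScores = attenuationScores.clone();
    }

    int getMinimumRiskScore() {
        return minimumRiskScore;
    }

    int[] getAttenuationScores() {
        // Copy on the way out, so callers cannot modify the internal array.
        return attenuationScores.clone();
    }

    /** The builder keeps its own mutable state and never touches a built instance. */
    static final class Builder {
        private int minimumRiskScore = 0;            // placeholder default, not the PR's value
        private int[] attenuationScores = new int[8]; // placeholder default, all zeros

        Builder setMinimumRiskScore(int minimumRiskScore) {
            this.minimumRiskScore = minimumRiskScore;
            return this;
        }

        Builder setAttenuationScores(int... scores) {
            this.attenuationScores = scores.clone();
            return this;
        }

        ImmutableConfigSketch build() {
            return new ImmutableConfigSketch(minimumRiskScore, attenuationScores);
        }
    }
}
```

With this shape, writing to the array returned by getAttenuationScores() only changes the caller's copy, and reusing a builder after build() cannot alter configurations that were already created.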

Stypox · 2020-07-07T19:28:10Z

@BjoernPetersen done! ;-)
I did not create public getters for score arrays, since those should in theory only be used inside the class, so I didn't need to copy any array.

Stypox · 2020-07-08T16:57:25Z

I found the ApplicationConfiguration used by CWA at this URL: https://svc90.main.px.t-online.de/version/v1/configuration/country/DE/app_config (obtained from coronawarnapp.service.diagnosiskey.DiagnosisKeyConstants.COUNTRY_APPCONFIG_DOWNLOAD_URL). I converted the downloaded binary file to an ApplicationConfiguration object by calling the autogenerated coronawarnapp.server.protocols.ApplicationConfigurationOuterClass.ApplicationConfiguration.parseFrom with the bytes of the file.

ApplicationConfiguration

app_version {
  android {
    latest {
      major: 1
      patch: 4
    }
    min {
      major: 1
      patch: 4
    }
  }
  ios {
    latest {
      minor: 8
      patch: 2
    }
    min {
      minor: 5
    }
  }
}
attenuation_duration {
  risk_score_normalization_divisor: 25
  thresholds {
    lower: 55
    upper: 63
  }
  weights {
    low: 1.0
    mid: 0.5
  }
}
exposure_config {
  attenuation {
    gt10_le15_dbm: LOWEST
    gt10_le15_dbm_value: 1
    gt15_le27_dbm: LOWEST
    gt15_le27_dbm_value: 1
    gt27_le33_dbm: LOWEST
    gt27_le33_dbm_value: 1
    gt33_le51_dbm: LOWEST
    gt33_le51_dbm_value: 1
    gt51_le63_dbm: LOWEST
    gt51_le63_dbm_value: 1
    gt63_le73_dbm: LOWEST
    gt63_le73_dbm_value: 1
    lt10_dbm: LOWEST
    lt10_dbm_value: 1
  }
  attenuation_weight: 50.0
  days_since_last_exposure {
    ge0_lt2_days: MEDIUM_HIGH
    ge0_lt2_days_value: 5
    ge10_lt12_days: MEDIUM_HIGH
    ge10_lt12_days_value: 5
    ge12_lt14_days: MEDIUM_HIGH
    ge12_lt14_days_value: 5
    ge14_days: MEDIUM_HIGH
    ge14_days_value: 5
    ge2_lt4_days: MEDIUM_HIGH
    ge2_lt4_days_value: 5
    ge4_lt6_days: MEDIUM_HIGH
    ge4_lt6_days_value: 5
    ge6_lt8_days: MEDIUM_HIGH
    ge6_lt8_days_value: 5
    ge8_lt10_days: MEDIUM_HIGH
    ge8_lt10_days_value: 5
  }
  days_weight: 20.0
  duration {
    gt10_le15_min: LOWEST
    gt10_le15_min_value: 1
    gt15_le20_min: LOWEST
    gt15_le20_min_value: 1
    gt20_le25_min: LOWEST
    gt20_le25_min_value: 1
    gt25_le30_min: LOWEST
    gt25_le30_min_value: 1
    gt30_min: LOWEST
    gt30_min_value: 1
  }
  duration_weight: 50.0
  transmission {
    app_defined1: LOWEST
    app_defined1_value: 1
    app_defined2: LOW
    app_defined2_value: 2
    app_defined3: LOW_MEDIUM
    app_defined3_value: 3
    app_defined4: MEDIUM
    app_defined4_value: 4
    app_defined5: MEDIUM_HIGH
    app_defined5_value: 5
    app_defined6: HIGH
    app_defined6_value: 6
    app_defined7: VERY_HIGH
    app_defined7_value: 7
    app_defined8: HIGHEST
    app_defined8_value: 8
  }
  transmission_weight: 50.0
}
min_risk_score: 11
risk_score_classes {
  risk_classes {
    label: "LOW"
    max: 15
    url: "https://www.coronawarn.app"
  }
  risk_classes {
    label: "HIGH"
    max: 72
    min: 15
    url: "https://www.coronawarn.app"
  }
}

From here an ExposureConfiguration object can be built (I copied the code in coronawarnapp.service.applicationconfiguration.ApplicationConfigurationService.mapRiskScoreToExposureConfiguration to do this):

ExposureConfiguration<
  minimumRiskScore: 11,
  attenuationScores: [0, 1, 1, 1, 1, 1, 1, 1],
  attenuationWeight: 50,
  daysSinceLastExposureScores: [5, 5, 5, 5, 5, 5, 5, 5],
  daysSinceLastExposureWeight: 50,
  durationScores: [0, 0, 0, 1, 1, 1, 1, 1],
  durationWeight: 50,
  transmissionRiskScores: [1, 2, 3, 4, 5, 6, 7, 8],
  transmissionRiskWeight: 50,
  durationAtAttenuationThresholds: [55, 63]
>

As you can see there is something wrong with the values: attenuationScores is 0 and then all 1s, daysSinceLastExposureScores is all 5, transmissionRiskScores is numbers 1 to 8. So either the CWA developers are not using the risk calculation algorithm to its full extent, or I am not requesting the data correctly (all other fields seem reasonable, though). What do you think?

Kotlin code to obtain the above results

        // The bytes of the `export.bin` file contained in the downloaded zip
        val exportBinary: ByteArray = TODO("load the bytes of export.bin")

        val appConfig: ApplicationConfigurationOuterClass.ApplicationConfiguration =
            ApplicationConfigurationOuterClass.ApplicationConfiguration.parseFrom(exportBinary)

        println(appConfig.toString())

        val config: ExposureConfiguration = ExposureConfiguration
            .ExposureConfigurationBuilder()
            .setTransmissionRiskScores(
                appConfig.exposureConfig.transmission.appDefined1Value,
                appConfig.exposureConfig.transmission.appDefined2Value,
                appConfig.exposureConfig.transmission.appDefined3Value,
                appConfig.exposureConfig.transmission.appDefined4Value,
                appConfig.exposureConfig.transmission.appDefined5Value,
                appConfig.exposureConfig.transmission.appDefined6Value,
                appConfig.exposureConfig.transmission.appDefined7Value,
                appConfig.exposureConfig.transmission.appDefined8Value
            )
            .setDurationScores(
                appConfig.exposureConfig.duration.eq0MinValue,
                appConfig.exposureConfig.duration.gt0Le5MinValue,
                appConfig.exposureConfig.duration.gt5Le10MinValue,
                appConfig.exposureConfig.duration.gt10Le15MinValue,
                appConfig.exposureConfig.duration.gt15Le20MinValue,
                appConfig.exposureConfig.duration.gt20Le25MinValue,
                appConfig.exposureConfig.duration.gt25Le30MinValue,
                appConfig.exposureConfig.duration.gt30MinValue
            )
            .setDaysSinceLastExposureScores(
                appConfig.exposureConfig.daysSinceLastExposure.ge14DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge12Lt14DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge10Lt12DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge8Lt10DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge6Lt8DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge4Lt6DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge2Lt4DaysValue,
                appConfig.exposureConfig.daysSinceLastExposure.ge0Lt2DaysValue
            )
            .setAttenuationScores(
                appConfig.exposureConfig.attenuation.gt73DbmValue,
                appConfig.exposureConfig.attenuation.gt63Le73DbmValue,
                appConfig.exposureConfig.attenuation.gt51Le63DbmValue,
                appConfig.exposureConfig.attenuation.gt33Le51DbmValue,
                appConfig.exposureConfig.attenuation.gt27Le33DbmValue,
                appConfig.exposureConfig.attenuation.gt15Le27DbmValue,
                appConfig.exposureConfig.attenuation.gt10Le15DbmValue,
                appConfig.exposureConfig.attenuation.lt10DbmValue
            )
            .setMinimumRiskScore(appConfig.minRiskScore)
            .setDurationAtAttenuationThresholds(
                appConfig.attenuationDuration.thresholds.lower,
                appConfig.attenuationDuration.thresholds.upper
            )
            .build()

        println(config);
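
The order of the arguments to setAttenuationScores above implies that index 0 holds the score for the gt73 bucket and index 7 the score for the lt10 bucket. A small Java sketch of the matching bucket lookup follows. The class and method names are hypothetical, and placing a value of exactly 10 in the last bucket is an assumption, since the field names (gt10_le15 and lt10) leave that case open:

```java
/** Hypothetical bucket lookup for illustration; not code from this PR or from CWA. */
final class AttenuationBuckets {
    private AttenuationBuckets() {}

    // Exclusive lower bounds of buckets 0..6, taken from the protobuf field names:
    // gt73, gt63_le73, gt51_le63, gt33_le51, gt27_le33, gt15_le27, gt10_le15.
    private static final int[] LOWER_BOUNDS = {73, 63, 51, 33, 27, 15, 10};

    /** Returns the index (0..7) of the bucket that the attenuation value falls into. */
    static int bucketIndex(int attenuation) {
        for (int i = 0; i < LOWER_BOUNDS.length; i++) {
            if (attenuation > LOWER_BOUNDS[i]) {
                return i;
            }
        }
        // Everything at or below 10 lands in the last bucket (assumption for exactly 10).
        return LOWER_BOUNDS.length;
    }

    /** Looks up the score for an attenuation value in an 8-element score array. */
    static int attenuationScore(int[] attenuationScores, int attenuation) {
        return attenuationScores[bucketIndex(attenuation)];
    }
}
```

With the CWA array [0, 1, 1, 1, 1, 1, 1, 1], only values above 73 get a score of 0; every other value gets 1.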

BjoernPetersen · 2020-07-09T12:44:12Z

I think those values seem about right. At least from a technical standpoint, the values in your ExposureConfiguration object match the data in the ApplicationConfiguration you provided. While it strikes me as odd that exposures that happened more than 14 days ago are scored the same as others, I suppose those won't happen if old contacts are regularly purged from the database.

All in all your guess about the CWA devs not fully utilizing the nuances and possibilities of the framework is probably correct, but as long as we can handle their config, that doesn't seem like a problem to me.
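
To make the effect of these values concrete, here is a small worked example using the numbers from the ExposureConfiguration above. It assumes that the four scores are multiplied, that transmission risk level n selects transmissionRiskScores[n - 1], and that a product below minimumRiskScore is discarded. The class and method names are made up for this illustration:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical worked example with the CWA values quoted in this thread. */
final class CwaConfigWorkedExample {
    private CwaConfigWorkedExample() {}

    static final int[] TRANSMISSION_RISK_SCORES = {1, 2, 3, 4, 5, 6, 7, 8};
    static final int DAYS_SINCE_LAST_EXPOSURE_SCORE = 5; // identical for every bucket
    static final int MINIMUM_RISK_SCORE = 11;

    /**
     * Lists the transmission risk levels (1..8) whose product reaches the minimum,
     * for a given attenuation score and duration score (each 0 or 1 in the CWA config).
     */
    static List<Integer> levelsReachingMinimum(int attenuationScore, int durationScore) {
        List<Integer> levels = new ArrayList<>();
        for (int i = 0; i < TRANSMISSION_RISK_SCORES.length; i++) {
            int score = attenuationScore
                    * DAYS_SINCE_LAST_EXPOSURE_SCORE
                    * durationScore
                    * TRANSMISSION_RISK_SCORES[i];
            if (score >= MINIMUM_RISK_SCORE) {
                levels.add(i + 1); // assumption: level n maps to index n - 1
            }
        }
        return levels;
    }
}
```

Under these assumptions the product is simply 5 times the transmission risk score whenever attenuation and duration both score 1, so levels 3 to 8 pass the minimum of 11 and levels 1 and 2 do not. If either attenuation or duration scores 0, nothing passes.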

Stypox · 2020-07-10T09:37:40Z

This is the ExposureConfiguration for Immuni (the Italian app), and it looks as strange as the CWA one, so I guess governments decided not to fine-tune risk score calculation. (obtained from https://get.immuni.gov.it/v1/settings?platform=android&build=1300000 )

"exposure_configuration": {
    "attenuation_thresholds": [ 50, 70 ],
    "attenuation_bucket_scores": [ 0, 0, 5, 5, 5, 5, 5, 5 ],
    "attenuation_weight": 1,
    "days_since_last_exposure_bucket_scores": [ 1, 1, 1, 1, 1, 1, 1, 1 ],
    "days_since_last_exposure_weight": 1,
    "duration_bucket_scores": [ 0, 0, 0, 0, 5, 5, 5, 5 ],
    "duration_weight": 1,
    "transmission_risk_bucket_scores": [ 1, 1, 1, 1, 1, 1, 1, 1 ],
    "transmission_risk_weight": 1,
    "minimum_risk_score": 1
},

Stypox · 2020-07-10T09:58:57Z

What I still don't understand is how GMS can calculate an ExposureSummary from exposure data even though it never asks the framework user to provide a definition for the various transmissionRiskScore values. Since those values are user-defined, I would expect some way to give GMS a function/lambda that determines the transmissionRiskLevel for an exposure; GMS could then use it to index into the transmissionRiskScores array and calculate the final risk score.
Instead, at least from what I see here (I keep referencing the Italian app because it seems clearer to me, with basically all GMS calls in one file), the transmissionRiskLevel is provided directly by GMS in an ExposureInformation object returned by the client.getExposureInformation(token) call. So apparently this value is calculated by the framework, not by its user, even though the documentation suggests the opposite. I can't find any documentation online about how this transmissionRiskLevel should be calculated. Does anyone know where to find it?

mh- · 2020-07-10T20:19:09Z

@Stypox TRL is an input as well as an output to the matching.
As an input it is attached to each Diagnosis Key. If one of the keys matches, the TRL of that key is used as an index into the 1st row of the ExposureConfiguration table.
So it is an input into the calculation of the Exposure Risk Value.
The TRL of the matching key is also returned as an output to the app, so that the app could use the information for its own additional risk calculation.
Does this answer your question?
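
A small Java sketch of that input/output role. All type and method names are hypothetical, and mapping level n to index n - 1 is an assumption:

```java
/** Hypothetical illustration of the TRL being both an input and an output of matching. */
final class TrlMatchSketch {
    private TrlMatchSketch() {}

    /** A downloaded Diagnosis Key; the uploader attached the TRL before upload. */
    static final class DiagnosisKey {
        final String keyId;
        final int transmissionRiskLevel; // input: 1..8

        DiagnosisKey(String keyId, int transmissionRiskLevel) {
            this.keyId = keyId;
            this.transmissionRiskLevel = transmissionRiskLevel;
        }
    }

    /** What a match hands back: the score used internally plus the TRL itself. */
    static final class MatchResult {
        final int transmissionRiskLevel; // output: echoed back to the app
        final int transmissionRiskScore; // used in the risk score calculation

        MatchResult(int transmissionRiskLevel, int transmissionRiskScore) {
            this.transmissionRiskLevel = transmissionRiskLevel;
            this.transmissionRiskScore = transmissionRiskScore;
        }
    }

    /** Called when a key matched: uses the key's TRL as an index into the score array. */
    static MatchResult onMatch(DiagnosisKey key, int[] transmissionRiskScores) {
        int level = key.transmissionRiskLevel;
        if (level < 1 || level > transmissionRiskScores.length) {
            throw new IllegalArgumentException("TRL out of range: " + level);
        }
        // Assumption: level n selects index n - 1.
        return new MatchResult(level, transmissionRiskScores[level - 1]);
    }
}
```

The app never passes a function to the framework here: the level travels with the key, and the score array in the ExposureConfiguration does the mapping.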

mh- · 2020-07-10T20:30:22Z

Note however that Google is modifying this, see https://developers.google.com/android/exposure-notifications/exposure-notifications-api#data-structures, google/exposure-notifications-server#663, and https://developers.google.com/android/exposure-notifications/exposure-key-file-format
TRL is deprecated as of v1.5, which means that the calculation will be done differently than explained above (which covers v1.0).
I have no idea however if / when RKI plans to switch to v1.5. Well, maybe I should simply ask...

BTW: Apple might also be modifying this, but it's not publicly visible - however there are strange artefacts:

haitrec · 2020-07-10T20:37:31Z

I just did some more reading. Hopefully the following information helps.

First, some terminology from the EN framework docs:

TransmissionRiskLevel(s):

- states defined in natural language
- example: "Confirmed test - High transmission risk level"
- this information is added to my keys before I upload them to the server

TemporaryExposureKey:

- has a field transmissionRiskLevel, filled before uploading, used after downloading the key

TransmissionRiskScore(s):

- an array of int values mapped to the different TransmissionRiskLevels
- used in the actual risk calculation
- the mapping from TransmissionRiskLevels is provided to the EN framework by the user/app (details for the German CWA below)
- the user-provided mapping from TransmissionRiskLevels is evaluated inside the EN framework
- each score has an alias, e.g. RISK_SCORE_LOWEST for transmissionRiskScores[0]

Now implementation details from the CWA code:

There exists an ApplicationConfigurationService class.
Inside, an ApplicationConfiguration is acquired from a web server (the official server, I guess) via asyncGetApplicationConfigurationFromServer() from the WebRequestBuilder.

In the code, the ApplicationConfiguration type is imported via:
import de.rki.coronawarnapp.server.protocols.ApplicationConfigurationOuterClass.ApplicationConfiguration
This (probably) means that it is the ApplicationConfiguration defined in the applicationConfiguration.proto file.

After retrieving this ApplicationConfiguration object, the ApplicationConfigurationService uses it to build an ExposureConfiguration object. That object is then used for RetrieveDiagnosisKeysTransactions. More precisely, during such transactions, the list of diagnosis keys is fetched from the server, then passed to the client wrapper via its asyncRetrieveExposureConfiguration(...) call, together with the ExposureConfiguration object created before. The client wrapper passes both to the EN framework.

TL;DR:

The application configuration with the mapping between risk levels and scores is fetched from the server in the ApplicationConfigurationService. An ExposureConfiguration is built out of this and later on passed to the EN framework.

mh- · 2020-07-10T20:53:12Z

@haitrec CWA does not use the TRL as recommended in the spec you mentioned. Instead, it uses a profile that maps a value (1..8) to each key, based on the key's age.

mh- · 2020-07-11T07:43:13Z

> ExposureConfiguration<
>   minimumRiskScore: 11,
>   attenuationScores: [0, 1, 1, 1, 1, 1, 1, 1],
>   attenuationWeight: 50,
>   daysSinceLastExposureScores: [5, 5, 5, 5, 5, 5, 5, 5],
>   daysSinceLastExposureWeight: 50,
>   durationScores: [0, 0, 0, 1, 1, 1, 1, 1],
>   durationWeight: 50,
>   transmissionRiskScores: [1, 2, 3, 4, 5, 6, 7, 8],
>   transmissionRiskWeight: 50,
>   durationAtAttenuationThresholds: [55, 63]
>
> As you can see there is something wrong with the values: attenuationScores is 0 and then all 1s, daysSinceLastExposureScores is all 5, transmissionRiskScores is numbers 1 to 8. So either the CWA developers are not using the risk calculation algorithm to its full extent, or I am not requesting the data correctly (all other fields seem reasonable, though). What do you think?

I think the values are correct, and the CWA developers simply implemented the RKI risk estimation concept.

This concept places importance on the Transmission Risk - therefore the (currently) 13 uploaded Diagnosis Keys get TRL assigned based on their age. This TRL is just mapped 1->1, 2->2, 3->3 etc using the values above.

Attenuation and duration are also important inputs, but they are handled inside the CWA app. This is explained here. So the app does not make the framework do the most important parts of the calculations for this, but does them itself.
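
As a rough sketch of what such an in-app calculation could look like: the attenuation_duration section of the ApplicationConfiguration above carries thresholds (55, 63), weights (low 1.0, mid 0.5) and a risk_score_normalization_divisor (25). The formula below is an assumption inferred from those field names, not something confirmed in this thread: the minutes spent in each attenuation range are summed with their weights, and the result is scaled by the framework's maximum risk score divided by the divisor. The high weight is not listed in the dump and is presumably 0. All names are hypothetical:

```java
/** Hypothetical sketch of an in-app weighting step; the formula itself is an assumption. */
final class InAppAttenuationWeighting {
    private InAppAttenuationWeighting() {}

    /** Weighted sum of the minutes spent in the low, mid and high attenuation ranges. */
    static double weightedMinutes(double[] weights, int[] minutesPerRange) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * minutesPerRange[i];
        }
        return sum;
    }

    /**
     * Scales the weighted minutes by the normalized maximum risk score
     * (maximumRiskScore / normalizationDivisor), as assumed in the lead-in.
     */
    static double inAppRiskScore(int maximumRiskScore,
                                 int normalizationDivisor,
                                 double[] weights,
                                 int[] minutesPerRange) {
        double normalized = (double) maximumRiskScore / normalizationDivisor;
        return normalized * weightedMinutes(weights, minutesPerRange);
    }
}
```

With the weights {1.0, 0.5, 0.0}, ten minutes in each range and a maximum risk score of 25, this gives 1.0 * (10 + 5 + 0) = 15 under the stated assumptions.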

I think v1.5 of the API, with the new "ExposureWindow" concept, goes in this direction: let the apps do the complete risk calculation however they want, and keep only the privacy-preserving parts (like hiding exact timestamps) in the framework.

Stypox · 2020-07-13T14:07:48Z

Thank you @mh- and @haitrec for your explanations, now my ideas are clearer :-D
I guess we can't implement v1.5 until we find an app that uses it, since we wouldn't know the details. The approach used in this PR should work fine for now and stay flexible for later changes, since the risk calculation algorithm takes as parameters only the values contained in an ExposureInformation, and every field has a getter in case an app wants to calculate things manually. I think we can leave the ExposureWindow implementation for later (when we have more information) and focus only on v1 for now.
In the latest commit I added a test for the algorithm with the only "test data" I could find (i.e. the example in Apple's documentation), and the test succeeds. Apart from the fact that more tests are needed, this PR is ready in my opinion.

BjoernPetersen · 2020-07-21 (review)

While this certainly isn't perfect yet, partially due to lack of info from Google/Apple, I think this is good enough for now. I'll merge this now so we can start developing the SDK parts that rely on it, we can improve on it later.

@Stypox Thank you for the contribution, especially on such a hard-to-get-your-head-around topic.
Also thanks to everyone who provided the very valuable input in this thread!

ljl-covid · 2020-07-21T14:56:11Z

Since @Stypox already mentioned the Italian Immuni app, I thought it may be worth pointing out that Ireland's contact tracing app that uses the Google/Apple API was also released as open source and donated to the Linux Foundation, with code available on GitHub, with, AIUI, separate code to communicate with the APIs.

Stypox force-pushed the risk branch 2 times, most recently from 45b23d0 to 16a9c3e · July 4, 2020 22:23

Stypox added 4 commits · July 7, 2020 21:28

991a6dd Add javadocs to risk calculation parameters and functions
        Containing references to Google and Apple's developer websites
d8991dc Add functions to get intermediary risk scores for values
6b5a383 Add getRiskScore() to calculate the final risk score for a set of values
8bfd5f7 Make ExposureConfiguration instances immutable

Stypox force-pushed the risk branch from 7785bfb to 8bfd5f7 · July 7, 2020 19:29

theScrabi mentioned this pull request · Jul 10, 2020
[DISCUSSION] Solution without closed source dependencys corona-warn-app/cwa-app-android#75 (closed)

36b288f Add test for risk score algorithm
        taken from https://developer.apple.com/documentation/exposurenotification/enexposureconfiguration

BjoernPetersen approved these changes · Jul 21, 2020

BjoernPetersen merged commit 340e8db into CoraLibre:master · Jul 21, 2020

theScrabi mentioned this pull request · Aug 14, 2020
Implementation of exposure risk calculating algorithm #20 (closed)
