This repository has been archived by the owner on Dec 30, 2022. It is now read-only.

Immuni's Traffic Analysis Mitigation

Introduction

This document details the measures Immuni adopts to mitigate the risk that an attacker may infer information about a user by analysing the encrypted traffic between the user's Mobile Client and the Backend Services. It assumes that you have already read High-Level Description, Technology, and Privacy-Preserving Analytics. Capitalised terms are defined in Technology's glossary.

Our focus is on protecting the interaction between the Mobile Client and two Backend Services: the Exposure Ingestion Service and the Analytics Service. Without appropriate countermeasures, interactions with the former could reveal that a user tested positive for SARS-CoV-2. Similarly, interactions with the latter could reveal that a user was warned about a Risky Exposure.

Exposure Ingestion Service

First, we describe the genuine interactions between the Mobile Client and the Exposure Ingestion Service, including details on how this genuine traffic is configured to hinder traffic analysis. Then, we focus on how dummy traffic is generated to further limit the information an attacker may infer by studying the interactions between the Mobile Client and the Exposure Ingestion Service.

Genuine TEK upload sequences

When a user tests positive for SARS-CoV-2, they can choose to upload their TEKs so that the users at risk can be notified. Within the same upload, they will also send to the server their Province of Domicile and any Epidemiological Info from the previous 14 days.

The operation takes place with the help of an authorised Healthcare Operator. The user navigates to a specific section in the App and dictates the OTP they find there to the Healthcare Operator. The Healthcare Operator enters the OTP into the HIS, unlocking the upload of the data for the user.

Then, the user generally goes through the following steps:

  1. OTP validation. The user taps a button, which causes the Mobile Client to contact the Exposure Ingestion Service and validate that the OTP has been authorised by a Healthcare Operator.
  2. Data upload. After seeing a recap of the data that are about to be sent, the user gives their confirmation and the Mobile Client contacts the Exposure Ingestion Service again, beginning the upload.

To be completed successfully, each step requires the Mobile Client to attempt at least one server request.

In reality, there are several reasons why either of these two steps may fail and require repeating. For example, the user could make a mistake, such as attempting to validate the OTP before it is authorised by the Healthcare Operator. Therefore, in some cases, we may witness more than two requests being sent to the Exposure Ingestion Service before the user’s TEKs are uploaded. The user may also quit the process after attempting to validate the OTP. For example, they may have decided that they no longer want to share their data. In that case, the process ends after only one request.

We define the TEK upload sequence as the sequence of any number of server requests attempted in order to complete the OTP validation and data upload steps for the same Mobile Client. A genuine TEK upload sequence always starts with an attempt at validating an OTP. It may end right there or include one or more additional requests.

To make it more difficult for an attacker to infer information from the analysis of the encrypted traffic between the Mobile Client and the Exposure Ingestion Service, and to make it easier to generate credible dummy traffic, the OTP validation and data upload requests are made indistinguishable:

  • Both requests are made to the same IP address, resolved from the same domain
  • Both requests have packets of the exact same size

Let us explain how the App makes sure that both types of requests have the same size.

Starting with the data upload requests:

  • nSUM Exposure Detection Summaries are selected from those generated by the Mobile Client over the previous 14 days, prioritising the least recent ones. If fewer than nSUM Exposure Detection Summaries are available from the previous 14 days, all of them will be selected.
  • nINF Exposure Info entries are selected from those belonging to the selected Exposure Detection Summaries, prioritising those with the highest Total Risk Score first, and then the least recent ones. If fewer than nINF Exposure Info entries belonging to the selected Exposure Detection Summaries are available, all of them will be selected.
  • bUN padding bytes are added to make the whole packet—headers included—reach a fixed size of bEIS bytes. bUN is computed every time a new request is to be attempted by subtracting from bEIS the size in bytes of the packet prior to any padding.
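The selection rules above can be sketched as follows. This is a minimal illustration rather than the App's actual code: the class and field names (`ExposureDetectionSummary`, `ExposureInfo`, `day`, `total_risk_score`) are hypothetical, and the input is assumed to contain only entries from the previous 14 days.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class ExposureInfo:  # hypothetical shape, for illustration only
    day: date
    total_risk_score: int


@dataclass
class ExposureDetectionSummary:  # hypothetical shape, for illustration only
    day: date
    exposure_info: List[ExposureInfo] = field(default_factory=list)


def select_for_upload(
    summaries: List[ExposureDetectionSummary], n_sum: int, n_inf: int
) -> Tuple[List[ExposureDetectionSummary], List[ExposureInfo]]:
    """Pick the summaries and Exposure Info entries for one data upload."""
    # Least recent summaries first; slicing keeps all of them when fewer
    # than n_sum are available.
    chosen = sorted(summaries, key=lambda s: s.day)[:n_sum]
    # Only entries belonging to the selected summaries are candidates.
    candidates = [info for s in chosen for info in s.exposure_info]
    # Highest Total Risk Score first, then least recent.
    candidates.sort(key=lambda i: (-i.total_risk_score, i.day))
    return chosen, candidates[:n_inf]
```

Capping both lists gives the unpadded request an upper bound on its size, which is what allows a fixed bEIS to be chosen.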

Concerning the OTP validation requests, every time that one is to be attempted, bVN padding bytes are added to the payload to reach a total packet size of bEIS bytes.
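As a rough sketch of the padding arithmetic, the function below serialises a request and pads it to a fixed size. It assumes a JSON payload with a dedicated `padding` field, a simplified header serialisation, and a made-up value for bEIS; the header name `Dummy-Flag` used in the usage example is also hypothetical. Here the size "prior to any padding" is measured with the `padding` field present but empty.

```python
import json

B_EIS = 4096  # hypothetical value; the real bEIS is a configurable parameter


def pad_request(headers: dict, body: dict, target_size: int = B_EIS) -> bytes:
    """Serialise a request and pad it to exactly target_size bytes."""

    def encode(padding: str) -> bytes:
        header_block = "".join(f"{k}: {v}\r\n" for k, v in headers.items())
        payload = json.dumps({**body, "padding": padding}, separators=(",", ":"))
        return (header_block + "\r\n" + payload).encode("utf-8")

    unpadded = encode("")
    missing = target_size - len(unpadded)  # plays the role of bUN or bVN
    if missing < 0:
        raise ValueError("request already exceeds the target size")
    # ASCII padding adds exactly one byte per character.
    return encode("0" * missing)
```

For example, `pad_request({"Dummy-Flag": "0"}, {"otp": "ABCDEFGHIJ"})` and a much larger data upload body both serialise to `B_EIS` bytes, so the two request types cannot be told apart by size.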

Please note that bEIS should be considered only a reference value. The actual size of the packet while in transit may differ slightly due to factors that the App does not control fully. However, this difference does not depend on whether the request is for an OTP validation or a data upload. Therefore, the fact that the App cannot guarantee the exact size of these packets while in transit does not hinder the mitigation of risks related to an attacker analysing the encrypted traffic between the Mobile Client and the Exposure Ingestion Service.

Dummy TEK upload sequences

Only users who tested positive for SARS-CoV-2 can initiate a genuine TEK upload sequence. This is the only case in which a Mobile Client generates genuine traffic towards the Exposure Ingestion Service. Hence, without proper countermeasures, an attacker observing such traffic could infer that the user tested positive for SARS-CoV-2.

To prevent this from happening, the Mobile Client generates dummy traffic towards the Exposure Ingestion Service on a regular basis. A dedicated header allows the App to indicate to the Exposure Ingestion Service that a certain request is a dummy one. The Exposure Ingestion Service immediately discards the payload upon receiving a dummy request. The Exposure Ingestion Service waits for a random amount of time (picked from an exponential distribution) before it responds to a dummy request, so as to simulate the response time of a genuine request.
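The server-side behaviour described above could look roughly like this. It is a sketch under stated assumptions, not the Exposure Ingestion Service's actual code: the header name `Dummy-Flag`, the status code, and the mean delay are all hypothetical, and `process` and `sleep` are injected so the logic can be exercised without a real backend.

```python
import random
from typing import Callable, Mapping

DUMMY_HEADER = "Dummy-Flag"  # hypothetical header name
MEAN_DUMMY_DELAY_S = 0.2     # hypothetical mean of the exponential delay


def handle_request(
    headers: Mapping[str, str],
    payload: bytes,
    process: Callable[[bytes], None],
    sleep: Callable[[float], None],
    rng: random.Random = random.Random(),
) -> int:
    """Return an HTTP status code for a genuine or dummy request."""
    if headers.get(DUMMY_HEADER) == "1":
        # Dummy request: the payload is never inspected or stored. The
        # random wait stands in for the processing time of a genuine request.
        sleep(rng.expovariate(1.0 / MEAN_DUMMY_DELAY_S))
        return 204
    process(payload)
    return 204
```

In a real service, `sleep` would be `time.sleep` or its asynchronous equivalent, and the mean delay would be tuned to match the observed latency of genuine requests.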

Let us break down the issue of generating effective dummy traffic into the following subproblems:

  • Individual request simulation. We must make individual dummy requests indistinguishable from genuine OTP validation and data upload requests while in transit (by which stage they are encrypted).
  • TEK upload sequence simulation. We must simulate the TEK upload sequence so that it resembles the genuine one closely enough that an attacker cannot discern whether or not a sequence is genuine merely by analysing the sequence itself.
  • Traffic pattern simulation. We must also trigger the simulation of the TEK upload sequence with the proper frequency and within an appropriate overall pattern of network traffic, so that an attacker cannot discern whether or not a sequence is genuine merely by analysing the pattern of TEK upload sequences and other traffic occurring over time.

Individual request simulation

To make the dummy requests indistinguishable from genuine OTP validation or data upload requests while in transit (by which stage they are encrypted), we adopt the following measures:

  • All requests, whether genuine or dummy, are made to the same IP address, resolved from the same domain
  • The dedicated header is set to the appropriate value for each request, whether dummy or genuine, and both values have the same size
  • The size of the payload of dummy requests mimics that of genuine ones

To implement this last measure, we add bTD random padding bytes to the payload of dummy requests so as to ensure that their packet size is bEIS bytes. This is the size of any genuine OTP validation or data upload packets, as previously described.

TEK upload sequence simulation

Simulating a TEK upload sequence is more difficult than mimicking other interactions between the Mobile Client and the Backend Services. This is because two different requests are involved (OTP validation and data upload). Additionally, the sequence of requests and timings of a genuine TEK sequence depends on the interaction between the user and a Healthcare Operator, which results in a sequence that may include any number of requests with delays of variable length in between.

For our simulation to be effective, we must ensure the following:

  • Most simulated sequences could be genuine. This means that, if an attacker analyses such sequences, they cannot tell that the sequences are not genuine simply based on their publicly observable characteristics. The model powering the simulated sequences must produce credible sequences in almost all cases.
  • Most genuine sequences could be simulated. This means that an attacker cannot be sure that a certain sequence is genuine simply based on its publicly observable characteristics. Most of the possible genuine sequences must be explainable through the model powering the simulated sequences.
  • The distribution of different types of simulated sequences follows that of genuine ones. If a somewhat common genuine sequence is almost never simulated, an attacker observing such a sequence would be able to infer that it is most likely a genuine one.

The diagram in figure 1 represents a simulated TEK upload sequence. The steps involved are as follows:

  • Start. The simulated TEK upload sequence starts.
  • Send request. The Mobile Client sends a dummy request. Whether that request simulates an OTP validation or a data upload is immaterial, as the two cases are indistinguishable based on their publicly observable characteristics (as described earlier). This step ends when the request receives a response from the server or times out.
  • Stop. The simulated TEK upload sequence ends.
  • Wait. A delay is added to make the simulated TEK upload sequence more realistic by mimicking user-introduced delays.

Figure 1. The steps involved in a simulated TEK upload sequence.

The model uses the following parameters:

  • dTTR is the value that determines the delay between a server response or timeout and the subsequent request attempt. It is picked from an exponential distribution with mean E(dTTR), a parameter we can use for tuning purposes.
  • pT(i), with i≥0, is the probability that the simulated TEK upload sequence continues, attempting to send a new request, after the (i+1)th request has been completed or has timed out.
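A minimal sketch of the Start, Send request, Wait, Stop loop driven by these two parameters is shown below. The function and argument names are hypothetical. Sending and waiting are injected as callables so the control flow can be read (and tested) in isolation. When more requests are sent than the array has entries, the last entry is reused, following the rule given for pT in the Configurability section.

```python
import random
from typing import Callable, Sequence


def simulate_tek_upload_sequence(
    p_t: Sequence[float],
    mean_delay_s: float,
    send_dummy_request: Callable[[], None],
    wait: Callable[[float], None],
    rng: random.Random,
) -> int:
    """Run one simulated sequence and return the number of requests sent."""
    sent = 0
    while True:
        # "Send request": returns on server response or timeout.
        send_dummy_request()
        sent += 1
        # p_t[i] applies after the (i+1)th request; reuse the last entry
        # once the array is exhausted.
        p_continue = p_t[min(sent - 1, len(p_t) - 1)]
        if rng.random() >= p_continue:
            return sent  # "Stop"
        # "Wait": dTTR drawn from an exponential distribution with mean E(dTTR).
        wait(rng.expovariate(1.0 / mean_delay_s))
```

On a device, `wait` would have to be a cancellable timer rather than a blocking sleep, since the simulation may need to be aborted when the user starts a genuine sequence.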

Such a model allows the Mobile Client to replicate a wide array of genuine TEK upload sequences. This ensures that most simulated TEK upload sequences could be genuine and most genuine TEK upload sequences could be simulated.

It is also crucial that the most common genuine TEK upload sequences are also common when simulated, and vice versa. The model supports this too.

With a genuine TEK upload sequence, a likely pattern is that a data upload soon follows a correct OTP validation. Another realistic, albeit less likely, outcome is that, after an initial error when validating the OTP, it is validated correctly during the second attempt, and a correct data upload follows. pT(i) helps to model such a pattern. For example, we could set pT(0) to 0.95 and pT(i) to 0.1 for i≥1. This configuration makes it likely that the simulated TEK upload sequence will have two requests before stopping, but it also accounts for the possibility that more requests will be simulated, mimicking a genuine fail-retry cycle.
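To see what such a configuration implies, the sketch below computes the probability that a simulated sequence contains exactly n requests. It assumes pT(0) = 0.95 and pT(i) = 0.1 for every i ≥ 1, encoded as a two-element array whose last entry is reused; the function name is hypothetical.

```python
from typing import Dict, Sequence


def sequence_length_distribution(p_t: Sequence[float], max_len: int) -> Dict[int, float]:
    """P(sequence has exactly n requests) for n = 1..max_len."""
    dist = {}
    reach = 1.0  # probability that the nth request is sent at all
    for n in range(1, max_len + 1):
        p_continue = p_t[min(n - 1, len(p_t) - 1)]
        dist[n] = reach * (1.0 - p_continue)
        reach *= p_continue
    return dist


dist = sequence_length_distribution([0.95, 0.1], 4)
```

Under these assumptions, roughly 5% of simulated sequences stop after one request, 85.5% after two, and 8.55% after three, which matches the intent of making two-request sequences the common case while still producing occasional retries.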

Traffic pattern simulation

Let us start by defining two types of sessions:

  • Foreground sessions. These sessions are initiated manually by a user when opening the App.
  • Background sessions. The operating system initiates these sessions following a schedule set by the operating system itself (in the case of iOS) or by the App (in the case of Android). With Immuni, they are leveraged primarily to perform the Exposure Detection (including downloading new TEK Chunks) without having to wait for the user to open the App.

Genuine TEK upload sequences can only occur in foreground sessions, as they must be initiated by the user. The more complex the patterns of traffic that may occur during foreground sessions, the more difficult it becomes to credibly simulate foreground sessions, and, therefore, effectively mitigate the risk that an attacker might discern simulated TEK upload sequences from genuine ones by analysing network traffic. This is why we choose to keep the requests occurring during foreground sessions to a minimum.

If any requests took place predictably during a foreground session in addition to those strictly necessary to perform the TEK upload sequence, any TEK upload sequence taking place without these other expected requests would be identifiable as non-genuine by an attacker. Moreover, if most simulated TEK upload sequences could be identified as non-genuine, the attacker would also know with a high degree of probability that the other TEK upload sequences must be genuine. It is crucial to avoid such a scenario.

For example, if the Mobile Client were to fetch the Configuration Settings every time a user opens the App, this request would have to be simulated as well. Failure to do so would give an attacker a heuristic to distinguish genuine TEK upload sequences from simulated ones: ‘When simulated, TEK upload sequences are not preceded by a call to the Configuration Settings endpoint’.

For the above-mentioned reasons, no requests are performed during foreground sessions besides those related to genuine TEK upload sequences (which we must simulate). There are a few rare exceptions. One such exception is when a user reads the terms of use or the privacy notice, which we assume will occur very infrequently. Noticing that neither of these two requests has occurred shortly before or after a TEK upload sequence does not provide any useful information to an attacker. Instead, in almost all cases, the Mobile Clients leverage background sessions to perform all the necessary server requests other than those belonging to the TEK upload sequence (e.g., Configuration Settings and FAQ fetching).

As briefly mentioned above, there is a difference between the way background sessions are scheduled on iOS and on Android. On iOS, the App has no control over the schedule—the operating system triggers background sessions once every few hours. On Android, the App has full control over the background session schedule. Due to this difference, we handle the simulation of TEK upload sequences differently on iOS and on Android.

iOS

According to our experimental observation (the behaviour is not documented), iOS background sessions are spaced in time by at least a couple of hours. It appears extremely unlikely for a background session to be followed shortly after by another background session. Moreover, when the user starts a foreground session, this appears to introduce a delay to the subsequent background session.

This behaviour provides an attacker with a heuristic to identify at least some foreground sessions as such: ‘Any session that started less than two hours after another is likely to be a foreground session’. Because an attacker can sometimes distinguish background from foreground sessions in this way, simulating TEK upload sequences during background sessions is not an optimal choice: the attacker could infer that such sequences are simulated, since no genuine TEK upload sequence ever takes place during a background session. Therefore, TEK upload sequences are simulated during foreground sessions only.

To avoid introducing patterns that an attacker could exploit, the scheduling of simulated TEK upload sequences takes place in a probabilistic way. When the App is first launched, it generates a value dTT,iOS (a duration, measured in seconds) from an exponential distribution with mean E(dTT,iOS) (a parameter we can use for tuning purposes), and it stores the current date and time TT (a timestamp, measured in seconds). The App is given an opportunity window to simulate a TEK upload sequence, starting at time TT+dTT,iOS and ending at time TT+dTT,iOS+WT,iOS, where WT,iOS is a fixed parameter representing the duration of the opportunity window (measured in seconds).

Whenever the user starts a foreground session by opening the App, the App checks if the current time falls within the opportunity window. There are three possible outcomes:

  • The current time is before the start of the opportunity window. No simulation occurs.
  • The current time is beyond the opportunity window. The App picks another value dTT,iOS and updates TT to the current date and time. No simulation occurs.
  • The current time falls within the opportunity window. In this case, the App simulates the TEK upload sequence. As soon as the simulation starts, a new dTT,iOS is picked and TT is updated to the current date and time.

When a background session starts, if the current time is beyond the opportunity window, the App picks another value dTT,iOS and updates TT to the current date and time.
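The opportunity-window rules can be summarised in a small state holder. This is a sketch with hypothetical names, written in Python for brevity rather than in the iOS App's own language. The random draw is injected as `draw_delay`, which in practice would sample an exponential distribution with mean E(dTT,iOS).

```python
from typing import Callable


class IosDummyScheduler:
    """Tracks TT and dTT,iOS and decides when a simulation may run."""

    def __init__(self, draw_delay: Callable[[], float], window_s: float, now: float):
        self.draw_delay = draw_delay  # samples dTT,iOS in seconds
        self.window_s = window_s      # WT,iOS
        self._reschedule(now)

    def _reschedule(self, now: float) -> None:
        self.t_t = now                 # TT
        self.d_tt = self.draw_delay()  # dTT,iOS

    def on_foreground_session(self, now: float) -> bool:
        """Return True when a simulated sequence should run in this session."""
        start = self.t_t + self.d_tt
        if now < start:
            return False  # window not open yet: nothing changes
        in_window = now <= start + self.window_s
        # Inside or beyond the window, a new dTT,iOS is picked and TT updated.
        self._reschedule(now)
        return in_window

    def on_background_session(self, now: float) -> None:
        # Background sessions never simulate; they only refresh an expired window.
        if now > self.t_t + self.d_tt + self.window_s:
            self._reschedule(now)
```

A caller might construct it with `draw_delay=lambda: random.expovariate(1.0 / mean_seconds)` and persist `t_t` and `d_tt` between launches.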

In the case that a user almost never opens the App, it may be unlikely that simulated TEK upload sequences are performed for that user. However, this is not a problem. With rare exceptions, no interactions occur between the Mobile Client and the Backend Services during foreground sessions. Therefore, in the case of a TEK upload sequence, an attacker would have no way of knowing the frequency with which a certain user opens the App simply based on analysing that user’s Mobile Client’s traffic. It follows that, when they observe a TEK upload sequence for that Mobile Client, such a piece of information is insufficient for them to determine whether the user actually tested positive for SARS-CoV-2 or whether the sequence was simulated.

If the user initiates a genuine TEK upload sequence when a simulated one is ongoing, an attacker may infer that some of the requests are genuine. To mitigate this risk, rather than starting the simulated TEK upload sequence immediately, the App waits for a random time interval dTTS,iOS from the beginning of the foreground session. This delay is generated from an exponential distribution with mean E(dTTS,iOS), a parameter we can use for tuning purposes.

dTTS,iOS must be a random variable. If it were a fixed parameter, in some cases an attacker might infer that an observed TEK upload sequence is genuine. For example, if dTTS,iOS was always equal to 10 seconds and the attacker observed a request to fetch the terms of use occurring 12 seconds before a TEK upload sequence starts, they would know that the sequence is likely genuine.

If a user navigates to the screen dedicated to initiating the TEK upload sequence, any TEK upload sequence simulations already running are aborted. Any scheduled to occur during the same foreground session are also cancelled. This is to prevent too many requests being sent within the same foreground session, from which an attacker might otherwise conclude that at least one is likely genuine.

Note that this solution does not prevent gainful traffic analysis in the unlikely case that a user starts a TEK upload sequence while a server request of a simulated sequence is occurring. For this to happen, a few conditions need to be met. First, a simulated sequence must start within the same foreground session during which the user performs their genuine sequence. Second, the simulated sequence must not have been completed by the time the user initiates the genuine sequence. Third, when the genuine sequence starts, a request from the simulated one must be ongoing. This last condition is especially unlikely to be met, because most requests last only seconds or tenths of a second, and any request of a simulated sequence that is not already underway is cancelled the moment the user reaches the screen dedicated to initiating a genuine TEK upload sequence.

While it is possible to mitigate this risk, we deem it too remote to justify further complicating the business logic of the iOS App.

Android

As mentioned earlier, Android makes it possible for an App to choose when a background session should start. Moreover, unlike with the iOS App, there is no interference between foreground sessions and background sessions. Hence, on Android, we can simulate TEK upload sequences during background sessions scheduled specifically for this use, rather than having to wait for the user-initiated foreground sessions. Furthermore, since no server requests are performed during a foreground session other than those related to the TEK upload sequence (with few, rare exceptions), an attacker would not be able to determine that the sequence occurred in the background and, therefore, could not be genuine.

Similarly to iOS, on Android we use TT and dTT,And to determine when the Mobile Client should simulate the TEK upload sequence (that is, when it should schedule the next dedicated background session). These values are set at the first launch of the Android App. On Android, we are simulating the TEK upload sequence in background sessions that occur precisely on a schedule. Therefore, we do not need a parameter defining the duration of an opportunity window, such as WT,iOS for the iOS App. Also, the mean of the variable dTT,And, E(dTT,And), may differ from E(dTT,iOS).

As soon as the dedicated background session starts, the first request of the simulated TEK upload sequence is fired.

The variables TT and dTT,And are updated and the next session scheduled after a simulation of a TEK upload sequence ends. If a previously scheduled background session fails to start at TT+dTT,And, the variables are updated at the earliest opportunity.

In the unlikely event that the planned simulated TEK upload sequence should occur while a foreground session is ongoing, the simulated sequence is cancelled and rescheduled. If a foreground session starts while the simulated sequence is underway, the simulation is immediately aborted.
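The core of the Android scheduling can be sketched in the same spirit. The names are hypothetical, the code is Python rather than the Android App's own language, and two cases from the text are deliberately left out: aborting a simulation that is already underway when a foreground session starts, and recovering from a dedicated session that failed to start.

```python
from typing import Callable


class AndroidDummyScheduler:
    """Tracks TT and dTT,And and schedules the next dedicated session."""

    def __init__(self, draw_delay: Callable[[], float], clock: Callable[[], float]):
        self.draw_delay = draw_delay  # samples dTT,And in seconds
        self.clock = clock            # returns the current time in seconds
        self._reschedule()

    def _reschedule(self) -> None:
        self.t_t = self.clock()
        self.d_tt = self.draw_delay()
        self.next_run = self.t_t + self.d_tt  # when the next session should start

    def on_dedicated_session(
        self, foreground_active: bool, run_sequence: Callable[[], None]
    ) -> bool:
        """Called when the dedicated background session starts."""
        if foreground_active:
            # Cancelled and rescheduled: no simulation while the user is in the App.
            self._reschedule()
            return False
        run_sequence()      # the first dummy request is fired immediately
        self._reschedule()  # variables are updated once the sequence has ended
        return True
```

Because the session is scheduled for an exact time, no opportunity window is needed, which is why this sketch has no counterpart to WT,iOS.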

Analytics Service

We begin by describing the precautions in place to minimise the information an attacker might gain by analysing the genuine traffic between the Mobile Client and the Analytics Service. Then, we describe how we generate dummy traffic to further hinder an attacker.

Genuine analytics uploads

The Mobile Client sends to the Analytics Service data composed of Operational Info and the user’s Province of Domicile. Among these data, the most sensitive is a field of Operational Info: whether the user was notified of a Risky Exposure after the last Exposure Detection. Let us define these data as Operational Info with Exposure when the user was notified of a Risky Exposure after the last Exposure Detection, and as Operational Info without Exposure when they were not.

While all Mobile Clients upload Operational Info without Exposure, only those of users notified of a Risky Exposure upload Operational Info with Exposure. Therefore, if the two types of upload were indistinguishable while in transit (by which stage they are encrypted), an attacker detecting that an upload of Operational Info has taken place would gain no sensitive information about the user.

To hinder an attacker from inferring whether the Mobile Client is uploading Operational Info with Exposure or Operational Info without Exposure, we take the following precautions:

  • Timing and sequencing. The timing of the upload and its sequencing relative to other requests do not depend on the type of Operational Info. In both cases, the upload is attempted straight after the Exposure Detection is performed.
  • Packet size. Neither the request’s headers nor the size of the payload depends on the type of Operational Info. The App uses an integer (1 byte: 0 or 1) rather than a boolean (4 or 5 bytes: true or false) for the field of the JSON-encoded payload determining whether the Operational Info is with Exposure or without Exposure. In the case that no Risky Exposure occurred, a dummy date is entered in the field dedicated to the date when the last Risky Exposure took place. We use a fixed dummy date, as the algorithms we employ for HTTPS encryption encrypt the same payload into a different sequence of bytes for each request, preventing an attacker from identifying in the packet a certain sequence of bytes associated with the dummy date and, therefore, inferring that no Risky Exposure has been detected. For all other fields, we also use integers rather than booleans, or, when using strings, we ensure that all possible values are of equal length. Besides further helping to protect the user’s privacy, this makes it easier to generate credible dummy traffic (more on this below).
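The effect of these encoding choices can be checked with a short sketch. The field names, the dummy date, and the assumption that the Province of Domicile is a two-letter code are all hypothetical; the point is only that both variants serialise to the same number of bytes.

```python
import json
from typing import Optional

DUMMY_DATE = "2020-01-01"  # hypothetical fixed dummy date


def encode_operational_info(
    province: str,
    exposure_notified: bool,
    last_risky_exposure_on: Optional[str] = None,
) -> bytes:
    """Serialise Operational Info so its size does not depend on exposure."""
    payload = {
        "province": province,  # assumed to be a fixed-length code
        # An integer flag is always one character; JSON true/false differ in length.
        "exposure_notification": 1 if exposure_notified else 0,
        # ISO dates have a fixed length, so the dummy date is the same size.
        "last_risky_exposure_on": last_risky_exposure_on if exposure_notified else DUMMY_DATE,
    }
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


with_exposure = encode_operational_info("MI", True, "2020-06-15")
without_exposure = encode_operational_info("MI", False)
```

Here `len(with_exposure) == len(without_exposure)` holds, whereas a JSON boolean would have made the two payloads differ by one byte.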

Dummy analytics uploads

The measures described above make it virtually impossible for an attacker analysing one encrypted request by the Mobile Client to the Analytics Service to infer whether the request is for an upload of Operational Info with Exposure or Operational Info without Exposure.

However, the rate limit imposed on analytics uploads from the same Mobile Client (see Privacy-Preserving Analytics) exposes the user to a potential attack. The Mobile Client will not attempt more than one upload of Operational Info with Exposure and one of Operational Info without Exposure during the same calendar month. Therefore, an attacker could count the number of upload requests between the Mobile Client and the Analytics Service. If they observed more than one attempt at uploading analytics data during the same calendar month, they would be sure that one of them was the consequence of a Risky Exposure.

To mitigate this issue, the Mobile Client sends dummy requests that look just like any genuine analytics request while in transit (by which stage they are encrypted). However, they do not contain any analytics data and are used solely to mitigate the risk that an attacker may identify genuine requests.

A deterministic approach, whereby the amount of dummy uploads on a monthly basis is fixed, would not solve the issue. For example, suppose the Mobile Client were to consistently send three dummy requests per calendar month. An attacker who then observed five requests sent in a given month would be certain that the user was notified of a Risky Exposure, due to the per-Mobile-Client rate limit described above.

Instead, we use a probabilistic approach. The model for the upload of dummy analytics allows for the sending of any number of requests within a month. This makes it difficult for an attacker to tell whether or not a Mobile Client uploaded Operational Info with Exposure by simply counting the number of analytics requests sent. To ensure the precaution is effective, these requests’ packets are the same size as those of the genuine ones.

When the App is first launched, it picks a duration dTOD (measured in seconds) from an exponential distribution with mean E(dTOD), a parameter we can use for tuning purposes. It also stores the current date and time TOD (a timestamp, also measured in seconds).

Subsequently, the following takes place after every Exposure Detection (note that at most one Exposure Detection occurs each time the App activates in the background):

  • If the App verifies that the current time is between TOD+dTOD and TOD+dTOD+86400, it attempts to send a dummy request, updates TOD to the current date and time, and picks a new value for dTOD.
  • If, instead, the App recognises that the current time is beyond TOD+dTOD+86400, it updates TOD to the current date and time, picks a new dTOD, but does not attempt to send a dummy request. This can happen if the device was not connected to the Internet between TOD+dTOD and TOD+dTOD+86400, or if the App did not activate in the background during the same period.

In the unlikely event that both a dummy request and a genuine request are scheduled to be sent after the same Exposure Detection, only the genuine upload is attempted. Both TOD and dTOD are updated, as would be the case if the upload of the dummy data were actually attempted. This is important because two dummy uploads cannot be attempted after the same Exposure Detection, and the Mobile Client performs only one Exposure Detection each time it activates in the background. Hence, if an attacker were to observe that two uploads had been attempted at roughly the same time, they would know that at least one of them was genuine.
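The decision taken after each Exposure Detection can be sketched as follows. The class and method names are hypothetical, and `draw_delay` stands in for sampling dTOD from an exponential distribution with mean E(dTOD).

```python
from typing import Callable

WINDOW_S = 86400  # one day, as in the rules above


class DummyAnalyticsScheduler:
    """Tracks TOD and dTOD and picks the upload to attempt, if any."""

    def __init__(self, draw_delay: Callable[[], float], now: float):
        self.draw_delay = draw_delay
        self._reschedule(now)

    def _reschedule(self, now: float) -> None:
        self.t_od = now                 # TOD
        self.d_tod = self.draw_delay()  # dTOD

    def after_exposure_detection(self, now: float, genuine_upload_due: bool) -> str:
        """Return 'genuine', 'dummy' or 'none'."""
        start = self.t_od + self.d_tod
        dummy_due = start <= now <= start + WINDOW_S
        expired = now > start + WINDOW_S
        if dummy_due or expired:
            # Updated whether or not a dummy request is actually attempted.
            self._reschedule(now)
        if genuine_upload_due:
            return "genuine"  # never two uploads after the same detection
        return "dummy" if dummy_due else "none"
```

Returning a single outcome per call makes the "at most one upload per Exposure Detection" property explicit.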

When uploading data to the Analytics Service, the Mobile Client sets a dedicated header in the request. This lets the Analytics Service know whether the payload needs to be discarded (if it was a dummy one) or not (if it was a genuine one). The Analytics Service waits for a random amount of time (picked from an exponential distribution) before it responds to a dummy request, to simulate the response time of a genuine request.

To make the dummy traffic indistinguishable from genuine analytics traffic while in transit (by which stage it is encrypted), we adopt the following measures:

  • All requests, whether genuine or dummy, are made to the same IP address, resolved from the same domain.
  • The request header used to indicate whether the data are genuine or dummy is always set and its length is the same in both cases.
  • The size of the payload of dummy requests is the same as that of genuine ones. This is easy to achieve, because genuine analytics payloads have a fixed size, as explained earlier.

Configurability

All of the parameters that allow for the tuning of the traffic-analysis mitigation measures described above can be changed from the server side. This is achieved by modifying the Configuration Settings served by the App Configuration Service.

These parameters are the following:

  • E(dTOD). The mean (measured in seconds) of the exponential distribution determining how frequently dummy analytics uploads occur.
  • nSUM. The maximum number of Exposure Detection Summaries a single data upload request can have.
  • nINF. The maximum number of Exposure Info entries a single data upload request can have.
  • bEIS. The packet size (measured in bytes) of any request to the Exposure Ingestion Service.
  • E(dTTR). The mean (measured in seconds) of the exponential distribution determining the length of the delay between one request’s completion or timeout and the next request attempt in a simulated TEK upload sequence.
  • pT. An array of the probabilities (float, ranging between 0 and 1) of the Mobile Client attempting a new request within a simulated TEK upload sequence after a request has been sent and a response has been received or a timeout has occurred. pT(i) represents the probability of the Mobile Client continuing after the (i+1)th request. If N is the length of the array, pT(N-1) is used by the Mobile Client to determine whether or not to make any attempt following the Nth.
  • E(dTT,iOS). The mean (measured in seconds) of the exponential distribution determining how frequently simulated TEK upload sequences occur on iOS.
  • WT,iOS. The duration (in seconds) of the opportunity window to simulate TEK upload sequences on iOS.
  • E(dTTS,iOS). The mean (measured in seconds) of the exponential distribution determining the length of the delay between the start of a foreground session and the start of a simulated TEK upload sequence scheduled to occur within that session on iOS.
  • E(dTT,And). The mean (measured in seconds) of the exponential distribution determining how frequently simulated TEK upload sequences occur on Android.