PseudoDeviceAddress linking due to insecure random #110

jimmo · 2020-12-17T03:11:00Z

I missed this originally in the note in #109 (comment) which suggested that the unit test would be updated imminently, but #108 adds a new random source. FYI @c19x (author) and @adamfowleruk (merge)

The default source (enabled in BLESensorConfiguration.java with a comment that would strongly suggest that an implementor uses it) uses java.util.Random() and Math.random(). As as the case in #90, an insecure random number generator cannot be used here as it leads to predictable PseudoDeviceAddresses.

I am raising this issue here because this needs to be fixed in case implementors such as COVIDSafe pick up this change.

The comment in PseudoDeviceAddress says:

In short, an attacker will need to establish the seed of Math.random() based on
limited observations of truncated values where any number of values and bits could
have been skipped between observations and from initialisation. While cracking the
seed of Random is trivial when the value index is known (i.e. it simply confirms
Random value sequence is deterministic, given the seed, as per Java documentation),
it is significantly more challenging when the position data is unknown, and information
has been discarded (truncated) in the process, especially when the position is in
essence selected by a highly random process.

This analysis is not correct. See below.

I do not understand the issue with SecureRandom -- a static shared instance of SecureRandom is good enough for java.util.UUID's randomUUID() method -- however I do not doubt that anything could be possible with old enough phones. It's a shame that there isn't a simple way to just use non-blocking /dev/urandom directly. It's worth re-iterating though that supporting Android 5 and 6 is a liability for other security/privacy reasons, however assuming SecureRandom is off the table, here are some suggestions.

Your problem here is not actually a requirement for an entropy source. You just need an internal state for a random generator that is large enough to resist brute force attacks, and a cryptographically secure way of advancing it. The problem with java.util.Random's LCG is that neither of these things are true.

I am hesitant to make a more specific suggestion as I am not a cryptographer, please consult a professional cryptographer before implementing anything. However any stream cipher would likely achieve this goal. You need a once-off entropy source to generate the IV/key and then you have as much random data as you need. See also the HOTP algorithm used for authentication tokens which also solves a similar problem (generating a secure random sequence from a hidden seed) -- https://en.wikipedia.org/wiki/HMAC-based_One-time_Password_algorithm.

FYI you asked about BLE RPA generation, this is typically implemented on the HCI Controller, which will have hardware random capabilities. There's an HCI command for the host to request random data from the HCI Controller. (Obviously this isn't useful here).

Additionally, as mentioned in #109 the testRandom test case is not a suitable or useful way of verifying this implementation and is just likely to lead to very occasional flakes. A slightly-modified counter would pass this test.

Analysis of the RandomSource approach from #108 ..

The goal of an attacker is to be able to answer the question "did both of these PseudoDeviceAddresses come from the same device". So for example, if you could calculate the internal seed of Math.random() corresponding to each address, then you could see whether one could be reached from the other in some bounded number of steps.

The algorithm implemented goes:

Advance Math.random() by up to 128 steps (based on time).
Use Math.random() to seed a new temporary LCG.
Use Math.random() again to advance that LCG by 256-1280 steps.
Use additional entropy to advance that LCG again by up to 128 steps.
Use that LCG to generate a 6 byte address.

The first step doesn't achieve anything -- if I have two seeds from Math.random() it just means I might have to advance them further to see if they are reachable from each other.

As in #90, step 5 gives us the internal seed of the temporary LCG at the time used to generate the address.

Currently the additional entropy isn't used, so step 4 doesn't apply. (But I'll address that below)

We don't know how many steps to walk the internal seed backwards from step 3, but we can try all possible 1024 values. We convert the internal seed of the LCG at each point back to the double that would have been returned from Math.random() in step 2, and verify that this double is actually a valid one for this particular LCG multiplier/addend. This gives us a candidate for the internal seed of Math.random(). Because of the way step 2 is implemented, approximately 10 bits of information is lost, so there will be multiple candidates. However, we can then see what value for "skip" the next Math.random() value would have given us, and see if it matches the skip value we are currently testing.

As before, this is not a hypothetical attack -- here is a proof of concept implemented in C. https://gist.github.com/jimmo/59d363a0077fc7dcdd7615ca5835d1f5 It takes a few minutes to run.

source: global_seed=7 local_seed=154095278931565 skip=1131
Observed address: 2a:32:dd:c2:dc:e8
addr_seed: 254707077139638
found: global_seed=251709487352358 local_seed=154095278931565 skip=1131
found: global_seed=7 local_seed=154095278931565 skip=1131

It generates an address using a "hidden" internal Math.random seed based on the algorithm above, then from that address works backwards to calculate two possible guesses for the global seed (the second guess is correct). With two addresses you can now verify that these two seeds are nearby in sequence.

The additional entropy in step 4 does help a little bit, however what we now end up with is a couple of hundred possible global seeds for a given address. We can just verify that none of them reach any of the seeds from the second address, and with very high probability we can still say they are from the same phone.

There are many ways in which you could actually use Math.random() in a relatively more secure way, but please just use a cryptographic technique instead.

The text was updated successfully, but these errors were encountered:

vteague · 2020-12-17T03:34:21Z

+1 on please not just re-doing the invention of your own. Please understand that if this is about to be rolled out to millions of Australians, you have a responsibility to get it right, not merely to make up something you can't see a problem with.

adamfowleruk · 2020-12-17T12:04:41Z

Hello @jimmo and @vteague - The code committed is to the develop branch, not the master branch, and I can assure you will not be released to downstream projects until we have a workable, secure and reliable alternative that works across the devices we support.

We've committed these changes to our develop branch with a range of random data generation options - not just reverting to our old one - precisely so we can test these alternatives wider amongst more than just a single developer, and elicit feedback from the community. So thanks for engaging! You just beat me to putting a page on our website asking for feedback on the issues we've found, and the data behind them.

If its ok with you both, rather than repeat what I shall publish in that webpage here, I'll go and finish that page over the next day or so and then paste the link here. I'll then open a 'help wanted' issue on GitHub for this item, as I was going to do, so as not to confuse the discussion.

I shall of course leave your reported issue open until the chosen approach is decided upon. As ever, please do continue to provide advice, feedback, and ideas! ❤️

Related to theheraldproject#107. Potential workaround for Exynos gatt server implementation - Existing gattServer instances on devices with the Exynos base chipset do not remove adverts from the same source app - Affects Exynos version of some Samsung handsets, but not the more prevalent SnapDragon versions - Only causes an issue in testing if Bluetooth is turned on/off 23 times - a rare occurrence in normal usage - Without the fix Herald would still allow an affected device to interact with other devices (iOS and Android) thanks to the 'calling card' / write characteristic - Does not cause an issue for detection unless write characteristic/calling card is disabled (Not done in the Herald code base or library) - Issue theheraldproject#110 also occurs on Exynos based handsets only - Committing to develop to allow third party ISV testing today Signed-off-by: Adam Fowler <adamfowleruk@gmail.com>

adamfowleruk · 2020-12-21T17:52:55Z

@vteague @jimmo Hello both. I've now added a community helps page to the website:-

https://vmware.github.io/herald/community

I've also created a help wanted issue for this item. New issue link: #118

As mentioned before, I'll leave this item open until the project finds a suitable approach as per #118.

Thanks for getting involved!

jimmo · 2021-05-15T14:23:13Z

The code committed is to the develop branch, not the master branch, and I can assure you will not be released to downstream projects until we have a workable, secure and reliable alternative that works across the devices we support.

Please see abopengov/contact-tracing-Android#10

adamfowleruk · 2021-05-17T08:49:04Z

@jimmo Can you please provide exploit code on Android in Java if you have it? I'm aware there was a C version but this of course isn't the same platform or API we're using. Important to be able to reproduce the predictability so I can create a unit test if it's reproducible on one of our supported platforms. Thanks! (Happy to pair on this on zoom sometime if that's easier)

jimmo · 2021-05-17T11:36:32Z

Can you please provide exploit code on Android in Java if you have it?

I only have it in C. Here is the updated version that I used to test the ABTT addresses.

https://gist.github.com/jimmo/ed3da3b0796ebb2fa79a51f54eb74e76

It finds that two addresses from the same phone were 43 steps of the Math.random sequence apart . They happen to be from addresses that were observed 15 minutes apart, which is why this number is low, but any small (i.e <= 6 digit) number would tell you with high confidence they came from the same phone within a few days.

Starting address: 6c:8d:98:ba:83:50
Candidate local_seed=93693386181415 scrambled=93679409474890
  > Candidate global_seed=15178996699414 local_seed=149016202772077 skip=1114
  > Candidate global_seed=44944486057719 local_seed=149016202772077 skip=1114
Candidate local_seed=159950508064846 scrambled=159964980831779
Candidate local_seed=226207629948277 scrambled=226197934002968
  > Candidate global_seed=261435794813553 local_seed=5638652188269 skip=1192
    > Found match after 43 steps.
  > Candidate global_seed=104089758379344 local_seed=5638652188269 skip=1192
Candidate local_seed=10989775121052 scrambled=10974166289649

Important to be able to reproduce the predictability so I can create a unit test if it's reproducible on one of our supported platforms.

Not sure I follow. The issue here is that you have a predictable sequence generator with 48 bits of state, and it leaks (approximately) 48 bits of state every 15 minutes.

There's no unit test to write, unless you want to write one that immediately fails. But you already have both a detailed explanation and C code showing that it fails. And showing that it no longer fails if you change the algorithm doesn't tell you anything else, because this exploit is tuned exactly to the current implementation -- any minor change will render it useless.

Related -- there's no way to write a unit test that shows that whatever implementation you choose is secure in general. All you can do is write a test to show that it's insecure in a very specific way, which is not useful. As you recently noticed with aaeb44d it's just a recipe for false positives and flakey tests.

adamfowleruk · 2021-05-31T08:22:44Z

A secure approach that also supports the older devices this project requires has now been contributed as part of #190. I'm therefore closing this issue.

adamfowleruk self-assigned this Dec 17, 2020

adamfowleruk added the investigate Investigate the potential issue to confirm label Dec 17, 2020

adamfowleruk added this to the v1.2 Wider ecosystem adoption milestone Dec 17, 2020

adamfowleruk mentioned this issue Dec 17, 2020

Potential workaround for Exynos gatt server implementation #112

Merged

adamfowleruk added duplicate This issue or pull request already exists and removed investigate Investigate the potential issue to confirm labels Dec 22, 2020

adamfowleruk modified the milestones: v1.2 Wider ecosystem adoption, v1.3 Better data analysis Jan 28, 2021

adamfowleruk modified the milestones: v1.3 Better data analysis, v2.0 Move to LFPH May 2, 2021

jimmo mentioned this issue May 15, 2021

Herald: Insecure random leads to device tracking abopengov/contact-tracing-Android#10

Open

c19x mentioned this issue May 24, 2021

RandomSource update - Validated #190

Merged

adamfowleruk closed this as completed May 31, 2021

adamfowleruk mentioned this issue May 31, 2021

Create Security and Privacy guides on Website theheraldproject/theheraldproject.github.io#75

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PseudoDeviceAddress linking due to insecure random #110

PseudoDeviceAddress linking due to insecure random #110

jimmo commented Dec 17, 2020

vteague commented Dec 17, 2020

adamfowleruk commented Dec 17, 2020

adamfowleruk commented Dec 21, 2020

jimmo commented May 15, 2021

adamfowleruk commented May 17, 2021 •

edited

Loading

jimmo commented May 17, 2021

adamfowleruk commented May 31, 2021

PseudoDeviceAddress linking due to insecure random #110

PseudoDeviceAddress linking due to insecure random #110

Comments

jimmo commented Dec 17, 2020

vteague commented Dec 17, 2020

adamfowleruk commented Dec 17, 2020

adamfowleruk commented Dec 21, 2020

jimmo commented May 15, 2021

adamfowleruk commented May 17, 2021 • edited Loading

jimmo commented May 17, 2021

adamfowleruk commented May 31, 2021

adamfowleruk commented May 17, 2021 •

edited

Loading