Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't trust reported last seen times #2178

Merged
merged 18 commits into from
Jun 1, 2021

Conversation

jvff
Copy link
Contributor

@jvff jvff commented May 21, 2021

Motivation

Zebra would previously trust the last_seen time reported on a gossiped peer. However, if that time is in the future, it could take precedence over all other peers in the reconnection queue. This could happen intentionally by a malicious peer or unintentionally due to clock skew. The reporting issue (#1871) has more information.

Solution

The proposed solution is to apply an offset to the reported last_seen times by a peer if any of them are in the future. To do this, the offset is calculated by subtracting the most advanced reported time (in the remote peer's clock) by the current time (in the local clock).

The code in this pull request has:

  • Documentation Comments
  • Unit Tests and Property Tests

Review

@teor2345 wrote the proposed solution in the original issue.

Related Issues

Closes #1871

Follow Up Work

@jvff jvff requested a review from teor2345 May 21, 2021 01:43
@zfnd-bot zfnd-bot bot assigned jvff May 21, 2021
@jvff jvff force-pushed the dont-trust-reported-last-seen-times branch from fab6329 to c33660e Compare May 21, 2021 02:27
@jvff jvff force-pushed the dont-trust-reported-last-seen-times branch 2 times, most recently from 19e2605 to 73b0df4 Compare May 21, 2021 23:28
@jvff jvff marked this pull request as ready for review May 21, 2021 23:42
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few significant changes:

  1. If any times are in the future, let's adjust all times, so we handle honest clock skewed peers correctly
  2. Let's test peers that are exactly at the limit
  3. Let's test using random limits, not just the current time when the test is running

I also have some general feedback about formatting and refactoring.

zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set/tests/prop.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set/tests/prop.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set/tests/vectors.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comment tweaks, nothing major

zebra-network/src/meta_addr.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set/tests/vectors.rs Outdated Show resolved Hide resolved
@jvff jvff force-pushed the dont-trust-reported-last-seen-times branch 3 times, most recently from f37fb4e to c1e3fe5 Compare May 25, 2021 01:46
@jvff jvff enabled auto-merge (rebase) May 25, 2021 02:16
@jvff jvff force-pushed the dont-trust-reported-last-seen-times branch from c1e3fe5 to 50ed84b Compare May 25, 2021 23:45
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests are failing because I added leap seconds to the Arbitrary methods in a recent PR: #2195

I've made some suggestions to fix that. (Zebra shouldn't crash if it's running during a leap second. But it's ok for the tests to maybe fail, because leap seconds are rare.)

Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, the core code is really solid.

We need to do a few other fixes due to existing MetaAddr bugs. And I'd like to simplify the error handling in this PR.

My Leap Seconds PR

This PR failed due to #2195, which adds arbitrary leap seconds. (Sorry about that, I didn't expect it to make any difference.) So I made a fix in jvff#1

Existing MetaAddr Bugs

I also added a bunch of other properties we should test in jvff#1, but we need to merge #2203 to fix some MetaAddr serialization bugs first. Feel free to rebase and re-enable those tests.

Simplifying Error Handling in this PR

I'd like to re-design how we do error handling. We're handling errors in a few different places. So it would be easy to miss some, particularly as validation gets more complicated. I'd like to use idiomatic Rust error-handling, where each function returns a Result.

So the functions in this PR would look like:

struct MetaAddrValidationError;

fn MetaAddr::offset_last_seen_by(&self, offset: Duration) -> Result<MetaAddr, MetaAddrValidationError>

fn limit_last_seen_times(addrs: &Vec<MetaAddr>, last_seen_limit: DateTime<Utc>) -> Result<Vec<MetaAddr>, MetaAddrValidationError>

fn validate_addrs(addrs: impl IntoIterator<Item = MetaAddr>, last_seen_limit: DateTime<Utc>) -> Result<Vec<MetaAddr>, MetaAddrValidationError>

That way, we can propagate errors up to update_fanout using the try operator ?, and handle them in one place. (By not sending the addresses to the address book.)

I think we'll end up with simpler code that way. We can use checked functions for all our calculations, and use ? to return any errors.

(We might need to impl From<None> or From<NoneError> for MetaAddrValidationError, let me know how you go.)

Let me know what you think - we can do some pair programming on these changes if you'd like.

zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set/tests/vectors.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
zebra-network/src/peer_set/candidate_set.rs Outdated Show resolved Hide resolved
jvff added 3 commits May 31, 2021 13:49
Returning `impl IntoIterator` means that the caller will always be
forced to call `.into_iter()`, and returning `impl Iterator` still
allows them to call `.into_iter()` because it becomes the identity
function.
Will be used when limiting the reported last seen times for recived
gossiped addresses.
jvff and others added 14 commits May 31, 2021 13:57
Due to clock skew, the peers could end up at the front of the
reconnection queue or far at the back. The solution to this is to offset
the reported times by the difference between the most recent reported
sight (in the remote clock) and the current time (in the local clock).
- Make the security impact clearer and in a separate section.
- Instead of listing an assumption as almost a side-note, describe it
  clearly inside a `Panics` section.

Co-authored-by: teor <teor@riseup.net>
Times in the past don't have any security implications, so there's no
point in trying to apply the offset to them as well.
If any of the times gossiped by a peer are in the future, apply the
necessary offset to all the times gossiped by that peer. This ensures
that all gossiped peers from a malicious peer are moved further back in
the queue.

Co-authored-by: teor <teor@riseup.net>
Make it clear why all peers have the time offset applied to them.

Co-authored-by: teor <teor@riseup.net>
If an overflow occurs, the reported `last_seen` times are either very
wrong or malicious, so reject all addresses gossiped by that peer.
Focus on what can go wrong, and not on the specific causes.

Co-authored-by: teor <teor@riseup.net>
The `limit_last_seen_times` can now safely handle an empty list.
Use some mock gossiped peers that all have `last_seen` times in the
future and check that they all have a specific offset applied to them.
Use some mock gossiped peers that all have `last_seen` times in the
past and check that they don't have any changes to the `last_seen` times
applied by the `validate_addrs` function.
Use some mock gossiped peers where some have `last_seen` times in the
past and some have times in the future. Check that all the peers have
an offset applied to them by the `validate_addrs` function.

This tests if the offset is applied to all peers that a malicious peer
gossiped to us.
If the most recent `last_seen` time reported by a peer is exactly the
limit, the offset doesn't need to be applied because no times are in the
future.
If the calculation to apply the compensation offset overflows or
underflows, the reported times are too distant apart, and could be sent
on purpose by a malicious peer, so all addresses from that peer should
be rejected.
Given a generated list of gossiped peers, ensure that after running the
`validate_addrs` function none of the resulting peers have a `last_seen`
time that's after the specified limit.
@jvff jvff force-pushed the dont-trust-reported-last-seen-times branch from b19043c to 15a8ff0 Compare May 31, 2021 14:51
@jvff jvff requested a review from teor2345 June 1, 2021 00:58
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great.

I'll just commit some typo fixes, and then squash the PR.

Comment on lines +241 to +244
/// Set the last time we interacted with this peer.
pub(crate) fn set_last_seen(&mut self, last_seen: DateTime32) {
self.last_seen = last_seen;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love simple code

Comment on lines +386 to +394
// No overflow is possible, so apply offset to all addresses
for addr in addrs {
let old_last_seen = addr.get_last_seen().timestamp();
let new_last_seen = old_last_seen - offset;

addr.set_last_seen(new_last_seen.into());
}
} else {
// An overflow will occur, so reject all gossiped peers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor typo fix:

Suggested change
// No overflow is possible, so apply offset to all addresses
for addr in addrs {
let old_last_seen = addr.get_last_seen().timestamp();
let new_last_seen = old_last_seen - offset;
addr.set_last_seen(new_last_seen.into());
}
} else {
// An overflow will occur, so reject all gossiped peers
// No underflow is possible, so apply offset to all addresses
for addr in addrs {
let old_last_seen = addr.get_last_seen().timestamp();
let new_last_seen = old_last_seen - offset;
addr.set_last_seen(new_last_seen.into());
}
} else {
// An underflow will occur, so reject all gossiped peers

/// Ensure all reported `last_seen` times are less than or equal to `last_seen_limit`.
///
/// This will consider all addresses as invalid if trying to offset their
/// `last_seen` times to be before the limit causes an overflow.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// `last_seen` times to be before the limit causes an overflow.
/// `last_seen` times to be before the limit causes an underflow.

Comment on lines +379 to +393
// If any time is in the future, adjust all times, to compensate for clock skew on honest peers
if newest_reported_seen_timestamp > last_seen_limit.timestamp() {
let offset = newest_reported_seen_timestamp - last_seen_limit.timestamp();

// Apply offset to oldest timestamp to check for underflow
let oldest_resulting_timestamp = oldest_reported_seen_timestamp as i64 - offset as i64;
if oldest_resulting_timestamp >= 0 {
// No overflow is possible, so apply offset to all addresses
for addr in addrs {
let old_last_seen = addr.get_last_seen().timestamp();
let new_last_seen = old_last_seen - offset;

addr.set_last_seen(new_last_seen.into());
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code is about as simple as it gets.

If we used chrono::DateTime, we would need check for other errors. (Or justify that they can't possibly happen.)

Good call here!

Comment on lines +11 to +21
fn no_last_seen_times_are_in_the_future(
gossiped_peers in vec(MetaAddr::gossiped_strategy(), 1..10),
last_seen_limit in any::<DateTime32>(),
) {
zebra_test::init();

let validated_peers = validate_addrs(gossiped_peers, last_seen_limit);

for peer in validated_peers {
prop_assert![peer.get_last_seen() <= last_seen_limit];
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is about as good as a proptest ever gets.

I'm glad we made the DateTime32 change.

Comment on lines +371 to +377
let (oldest_reported_seen_timestamp, newest_reported_seen_timestamp) =
addrs
.iter()
.fold((u32::MAX, u32::MIN), |(oldest, newest), addr| {
let last_seen = addr.get_last_seen().timestamp();
(oldest.min(last_seen), newest.max(last_seen))
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example of implementing min and max using iterators:

    let timestamps: Vec<_> = addrs
        .iter()
        .map(|addr| addr.get_last_seen().timestamp())
        .collect();
    let oldest_reported_seen_timestamp = timestamps.iter().min().cloned().unwrap_or(u32::MAX);
    let newest_reported_seen_timestamp = timestamps.iter().max().cloned().unwrap_or(u32::MIN);

In this case the code is about the same length, so let's stick with your version.

By the way, the fold/unwrap_or constants are never actually used. They participate in the calculations, but they never get applied to any data, because the array is empty. But I think they're better than using the Options in later code.

@jvff jvff merged commit 34702f2 into ZcashFoundation:main Jun 1, 2021
@teor2345
Copy link
Contributor

teor2345 commented Jun 1, 2021

Ha, caught by the auto-merge. I'll just commit those comment changes to main.

@jvff jvff deleted the dont-trust-reported-last-seen-times branch June 10, 2021 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Security: Zebra should stop believing far-future last_seen times from peers
2 participants