
chore(deps): Upgrade to Rust 1.50.0 #6428

Merged 17 commits into master on Feb 18, 2021

Conversation

jszwedko (Member) commented Feb 12, 2021

I tried to break up the commits to make the changes easier to follow.

The bulk of the changes came from the new clippy::unnecessary_wraps rule. I was a bit on the fence about that rule, but I've come around to the idea that it's better to push wrapping up to the callers to make it clearer that a function only ever returns one of the Option or Result enum variants. EDIT: I'm back on the fence on this lint again. I think it hurts readability in many cases. I'm curious to get other opinions, though. I separated the commits so hopefully it is easy to back out just that lint.
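
A minimal sketch of what that change looks like (hypothetical function names, not code from this PR):

// Before: clippy::unnecessary_wraps fires because this function can only ever return Ok.
fn parse_limit(input: &str) -> Result<usize, String> {
    Ok(input.len())
}

// After: return the value directly and wrap it only at the call site that
// actually needs a Result.
fn parse_limit_direct(input: &str) -> usize {
    input.len()
}

fn load_config(input: &str) -> Result<usize, String> {
    Ok(parse_limit_direct(input))
}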

I also ignored some of the lints on the remap parsing code as I expect that to go away, but I'll have to confirm.

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
I believe this is the equivalent compare_exchange based on the table in
the docs:

Original Success  Failure
Relaxed  Relaxed  Relaxed
Acquire  Acquire  Acquire
Release  Release  Relaxed
AcqRel   AcqRel   Acquire
SeqCst   SeqCst   SeqCst

https://doc.rust-lang.org/std/sync/atomic/struct.AtomicUsize.html#migrating-to-compare_exchange-and-compare_exchange_weak

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
@jszwedko jszwedko requested review from a team, lukesteensen, leebenson, eeyun, StephenWakely and juchiast and removed request for a team February 12, 2021 15:11
@@ -69,13 +69,13 @@ fn gauge_add(gauge: &AtomicUsize, add: isize, emitter: impl Fn(usize)) {
 emitter(new_value);
 // Try to update gauge to new value and releasing writes to gauge metric in the process.
 // Otherwise acquire new writes to gauge metric.
-let latest = gauge.compare_and_swap(value, new_value, Ordering::AcqRel);
-if value == latest {
+value = match gauge.compare_exchange(value, new_value, Ordering::AcqRel, Ordering::Acquire)
jszwedko (Member Author) commented:
@bruceg I'd appreciate your 👀 here. compare_and_swap was deprecated. I've swapped it for the equivalent compare_exchange call, but there is also a compare_exchange_weak call that can purportedly generate more efficient assembly. The difference is:

Unlike AtomicUsize::compare_exchange, this function is allowed to spuriously fail even when the comparison succeeds, which can result in more efficient code on some platforms. The return value is a result indicating whether the new value was written and containing the previous value.

The nuance being that it might sometimes fail even if it could have succeeded.
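
For reference, a minimal sketch of the mechanical migration (illustrative only, not the actual gauge code), following the AcqRel row of the table above:

use std::sync::atomic::{AtomicUsize, Ordering};

fn try_swap(gauge: &AtomicUsize, current: usize, new_value: usize) -> bool {
    // Deprecated form: a single ordering covered both outcomes.
    // gauge.compare_and_swap(current, new_value, Ordering::AcqRel) == current

    // Replacement: success and failure orderings are explicit. Per the table
    // above, AcqRel maps to (AcqRel, Acquire).
    gauge
        .compare_exchange(current, new_value, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}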

blt (Contributor) commented:

compare_exchange_weak will play much better on ARM and we should aim to use it. compare_exchange emits instructions that require strong ordering of memory. A strong ordering means atomic operations are implicitly acquire/release and probably Total Store Order (every CPU agrees about the order in which memory operations took place), whereas weak ordering allows CPUs to disagree, which is cool because some of them can shut off if need be or do other stuff. (ARM is weak but data-dependent, which means every CPU eventually agrees on the results of operations and an A->B dereference guarantees that A is at least as fresh as B. When systems don't support data dependency, things get bonkers.)

For x86, with its implicit strong ordering, we'll never see additional fail loops, even as an omniscient observer of the system. On ARM it's possible that we will because the CPU our instructions run on is lagging getting updates from other CPUs but it's an acceptable trade-off except in specialized circumstances.

I also suggest using AcqRel ordering for success here, like you've done, and Relaxed for failure. On ARM this will mean more potential spins but overall more efficient results.
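
A minimal sketch of that pattern (hypothetical names, for illustration only):

use std::sync::atomic::{AtomicUsize, Ordering};

fn store_max(value: &AtomicUsize, candidate: usize) {
    let mut current = value.load(Ordering::Relaxed);
    while candidate > current {
        // compare_exchange_weak may fail spuriously, so it always sits in a
        // retry loop: AcqRel on success releases our write, while Relaxed on
        // failure is enough because we only need the observed value to retry.
        match value.compare_exchange_weak(current, candidate, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => break,
            Err(observed) => current = observed,
        }
    }
}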

blt (Contributor) commented:
Went ahead and made myself #6477 to tackle this.

jszwedko (Member Author) commented Feb 17, 2021:
Thanks for the detailed response! I think what you said makes sense and agree with the AcqRel for success and Relaxed for failure. I'll leave that to your PR since this one maintains the current state.

On ARM it's possible that we will because the CPU our instructions run on is lagging getting updates from other CPUs but it's an acceptable trade-off except in specialized circumstances.

To deepen my understanding, under what circumstances would it be unacceptable?

I did note that the Rust PR deprecating compare_and_swap only introduced a single use of compare_exchange_weak:

rust-lang/rust#79261

The PR comment seems to indicate that they were mostly just aiming to satisfy the deprecation rather than to improve performance, though.

blt (Contributor) commented Feb 17, 2021:
Thanks for the detailed response! I think what you said makes sense and agree with the AcqRel for success and Relaxed for failure. I'll leave that to your PR since this one maintains the current state.

Cool. I'm good with that.

On ARM it's possible that we will because the CPU our instructions run on is lagging getting updates from other CPUs but it's an acceptable trade-off except in specialized circumstances.

To deepen my understanding, under what circumstances would it be unacceptable?

Solid question. Consider a situation where you execute some expensive function per loop, maybe it has a non-trivial side-effect or is expensive in CPU terms but you can't migrate it out of the loop. You might be better off paying the higher synchronization cost here, depending on what your goals are.
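
Illustrative sketch only (hypothetical names) of the shape where spurious failures start to cost you:

use std::sync::atomic::{AtomicU64, Ordering};

fn update_with_expensive_step(state: &AtomicU64, recompute: impl Fn(u64) -> u64) {
    let mut old = state.load(Ordering::Relaxed);
    loop {
        // If recompute is costly or has side effects, every spurious failure of
        // the weak CAS repeats it; paying for the strong compare_exchange (or a
        // heavier lock) can then be the better trade-off.
        let new = recompute(old);
        match state.compare_exchange_weak(old, new, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => break,
            Err(observed) => old = observed,
        }
    }
}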

The PR comment seems to indicate that they were mostly just aiming to satisfy the deprecation rather than to improve performance, though.

Quite right. Huh. Well, I was going off my own background understanding; ARM'll get better instructions emitted if we use compare_exchange_weak. If you take this program

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

pub fn mul(val: &Arc<AtomicU64>, factor: u64) {
    let mut old = val.load(Ordering::Relaxed);
    loop {
        let new = old * factor;
        match val.compare_exchange_weak(old, new, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => break,
            Err(x) => old = x,
        }
    }
}

pub fn main() {
    let val = Arc::new(AtomicU64::new(4));
    mul(&val, 8);
    println!("{:?}", val);
}

and generate x86, you'll see a lock cmpxchg instruction pair whether you use compare_exchange_weak or compare_exchange. On ARM, when I compile for arm-unknown-linux-gnueabihf at opt-level 3, the _weak variant of the program clobbers fewer registers, has fewer exclusive loads, and does less conditional work. I actually find opt-level 3 hard to read -- I think the compiler noticed all my values are powers of 2 and did clever things from that -- and fortunately when I compile at opt-level 0 it embeds the assembly of compare_exchange_weak and compare_exchange. The strong version has more callee-saved registers than I can account for and calls further into helper functions that aren't inlined. The compiler source calls out to intrinsics, which makes sense, but without digging in further and brushing up on ARM -- which I need to do eventually -- I think the best I can offer is that the strong variant emits more instructions to force coordination with the other CPUs. An ARM expert could pinpoint exactly where in the assembly for the above this happens, and if someone out there reading this knows, I'd love to get taught. :)

jszwedko (Member Author) commented:
Solid question. Consider a situation where you execute some expensive function per loop, maybe it has a non-trivial side-effect or is expensive in CPU terms but you can't migrate it out of the loop. You might be better off paying the higher synchronization cost here, depending on what your goals are.

👍 Yeah that makes sense as a trade-off. Thanks again for the thorough explanation!

@jszwedko jszwedko removed the request for review from a team February 12, 2021 15:16
@@ -23,6 +23,7 @@ impl SplitFn {
 }
 
 impl Function for SplitFn {
+    #[allow(clippy::collapsible_match)] // I expect this file to be going away shortly
jszwedko (Member Author) commented:
@JeanMertz @FungusHumungus is this accurate? It seems like src/mapping is largely vestigial at this point. Do you think we'd drop it as part of #6353?

Contributor commented:
Yeah, feel free to ignore this. I'll drop it once #6353 makes it into master (later today, as the CI is expected to be green within the hour).

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
@jszwedko jszwedko requested a review from a team February 12, 2021 18:01
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
lukesteensen (Member) left a comment:
This looks good to me! I'd say I'm in favor of keeping the unnecessary_wraps changes. It seems good to simplify the functions and not imply some kind of conditional that doesn't actually exist. Wrapping the return value at the call site that needs it to be wrapped seems slightly more intentional.

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
blt (Contributor) left a comment:
This looks good to me. I echo the support for unnecessary_wraps. My comment here is not blocking and I'd be happy to see a different PR that audits our codebase's atomics. Something I could take on or pair up with you to do.

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Comment on lines -450 to +451
-type Item = Spanned<'input, usize>;
+type Item = SpannedResult<'input, usize>;
A Member commented:
Are these changes needed for the 1.50 upgrade, or did they too sneak in from a VRL merge?

jszwedko (Member Author) commented:
These ones are intentional. Clippy was complaining about unnecessary_wraps, so I renamed the existing Spanned type to SpannedResult and added a new Spanned type that represents the Ok value of SpannedResult. That let me refer to Spanned directly in the function return types where Clippy was complaining about the unnecessary wrapping. For example: https://github.com/timberio/vector/pull/6428/files/f279014f63a058abe29e0afb8dbc1f1dffc2fb32#diff-3eaef579e2d938c6926fa17614f80aab7455adeeffd8907567b34fee18a07e94R575-R582
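
Roughly, as an illustrative sketch rather than the lexer's exact definitions:

// Sketch only; the real lexer types carry a lifetime parameter.
type Spanned<T> = (usize, T, usize); // (start, value, end)
type SpannedResult<T> = Result<Spanned<T>, LexError>;

struct LexError;

// A helper that can never fail now returns Spanned directly; callers wrap it
// in Ok(..) only where a SpannedResult is actually required.
fn number(start: usize, value: usize, end: usize) -> Spanned<usize> {
    (start, value, end)
}

fn next_token(start: usize, value: usize, end: usize) -> SpannedResult<usize> {
    Ok(number(start, value, end))
}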

@jszwedko jszwedko merged commit 5f6e07e into master Feb 18, 2021
@jszwedko jszwedko deleted the upgrade-1.50.0 branch February 18, 2021 19:41