Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::thread::LocalKeyState: Add state Initializing #43550

Closed
wants to merge 5 commits into from

Conversation

joshlf
Copy link
Contributor

@joshlf joshlf commented Jul 29, 2017

Closes #43491.

I know that the issue hasn't been discussed or approved yet, but I figured I'd make this PR to provide something concrete for the discussion.

Open question: This is my first serious PR, and I'm not sure what to do about my rustfmt automatically re-formatting existing code that I didn't change. Somebody's guidance here would be appreciated.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 30, 2017
@alexcrichton
Copy link
Member

Thanks for the PR! Could the formatting related changes be removed from this PR for now?

@joshlf
Copy link
Contributor Author

joshlf commented Jul 30, 2017

@alexcrichton I'm not sure how to do that other than to remove them manually - is there a standard rustfmt configuration that I can set that will cause rustfmt to format code in libstd according to the style guidelines (and thus won't make any changes to existing code)? If not, I can just do it manually.

@carols10cents carols10cents added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jul 31, 2017
@alexcrichton
Copy link
Member

@joshlf can you disable auto-rustfmt in your editor, redo the changes, and resubmit them?

@joshlf
Copy link
Contributor Author

joshlf commented Aug 2, 2017

@alexcrichton Done.

@joshlf
Copy link
Contributor Author

joshlf commented Aug 2, 2017

I think it'd be good to modify AccessError to distinguish between its two causes - a key being in the Initializing state, and a key being in the Destroyed state. Thoughts on what a good API for that would look like? On the one hand, is_initializing and/or is_destroyed feels a bit unwieldy, but so does adding a new enum just for this (e.g., a state method returns enum AccessErrorState { Initializing, Destroyed }).

@joshlf
Copy link
Contributor Author

joshlf commented Aug 4, 2017

OK, should be ready to go. Removing the WIP tag.

@joshlf joshlf changed the title [WIP] std::thread::LocalKeyState: Add state Initializing std::thread::LocalKeyState: Add state Initializing Aug 4, 2017
@arielb1
Copy link
Contributor

arielb1 commented Aug 8, 2017

[00:45:59] thread 'main' panicked at 'Some tests failed', /checkout/src/tools/compiletest/src/main.rs:323:21
[00:45:59] ------------------------------------------
[00:45:59] error[E0004]: non-exhaustive patterns: `Initializing` not covered
[00:45:59]   --> /checkout/src/test/run-pass/tls-init-on-init.rs:40:19
[00:45:59]    |
[00:45:59] 40 |             match FOO.state() {
[00:45:59]    |                   ^^^^^^^^^^^ pattern `Initializing` not covered
[00:45:59] 

@joshlf
Copy link
Contributor Author

joshlf commented Aug 8, 2017

@eddyb Can you take a quick look at this and let me know if you think anything here conflicts with your PR? I don't think it does, but you know your PR better than I do, and I suspect yours will get merged first so I'll end up rebasing on top of it.

I suspect that all I'll need to do is mark some functions and methods unsafe as you have.

@eddyb
Copy link
Member

eddyb commented Aug 8, 2017

This is a lot of code. But I think all you have to check for if: if you take &'static self in std::thread::local::fast, you'll have to make that method take &self instead (a lifetime error will force you to do so anyway), and mark it unsafe.
Then, mark any wrappers created by the macro also unsafe if they call one of those unsafe methods, and finally, mark the fields in LocalKey unsafe where necessary.

@joshlf
Copy link
Contributor Author

joshlf commented Aug 8, 2017

OK sounds good; thanks!

@joshlf joshlf force-pushed the local-key-state branch 2 times, most recently from 1ca7c0a to 06834a9 Compare August 9, 2017 00:47
@joshlf
Copy link
Contributor Author

joshlf commented Aug 9, 2017

Looks like one of the CI tests failed due to network errors. Does anybody have the ability to tell Travis to retry? Looks like I don't.

remote: Counting objects: 660392, done.
remote: Compressing objects: 100% (123917/123917), done.
error: RPC failed; curl 56 SSLRead() return error -36
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
The command "eval git fetch origin +refs/pull/43550/merge: " failed 3 times.

@bors
Copy link
Contributor

bors commented Aug 12, 2017

☔ The latest upstream changes (presumably #43746) made this pull request unmergeable. Please resolve the merge conflicts.

@arielb1
Copy link
Contributor

arielb1 commented Aug 15, 2017

@JoshiF

Just do a trivial commit --amend (changing only the date) and push again. Though now that you have to rebase, it would be better to rebase and that will have another go at travis.

@joshlf
Copy link
Contributor Author

joshlf commented Aug 16, 2017

@eddyb Do you have any advice for how to get the lifetimes to work in this PR (in particular, in the __thread_local_inner macro)? The issue is that in the original code, there was only a single function that needed to be passed to LocalKey::new, so the #[thread_local] __KEY could be in the scope of that function's body. In this PR, there are multiple functions, so the key needs to be outside so it can be accessed from all of the functions.

The current version of the code won't compile because __KEY does not live long enough for method calls on it to return 'static references.

@eddyb
Copy link
Member

eddyb commented Aug 16, 2017

@joshlf The one function I had to deal with had the same problem, btw - look at how I changed fast::Key::get - specifically, unsafe and doesn't take 'static.

@bors
Copy link
Contributor

bors commented Aug 16, 2017

☔ The latest upstream changes (presumably #43710) made this pull request unmergeable. Please resolve the merge conflicts.

@eddyb
Copy link
Member

eddyb commented Aug 16, 2017

@joshlf Btw you should use git pull --rebase to not have merge commits in the PR.

@joshlf
Copy link
Contributor Author

joshlf commented Aug 16, 2017

@eddyb Do you know if there's a way to git pull --rebase retroactively?

@eddyb
Copy link
Member

eddyb commented Aug 16, 2017

@joshlf You can run it now and it should kill merge commits.

@joshlf joshlf force-pushed the local-key-state branch 3 times, most recently from 19d4391 to a96d10d Compare August 16, 2017 18:36
@joshlf
Copy link
Contributor Author

joshlf commented Sep 6, 2017

Hey @alexcrichton , can I ask you a question about some code that git blame says you wrote? In particular, this destructor for the OS implementation of TLS:

unsafe extern fn destroy_value<T: 'static>(ptr: *mut u8) {
        // The OS TLS ensures that this key contains a NULL value when this
        // destructor starts to run. We set it back to a sentinel value of 1 to
        // ensure that any future calls to `get` for this thread will return
        // `None`.
        //
        // Note that to prevent an infinite loop we reset it back to null right
        // before we return from the destructor ourselves.
        let ptr = Box::from_raw(ptr as *mut Value<T>);
        let key = ptr.key;
        key.os.set(1 as *mut u8);
        drop(ptr);
        key.os.set(ptr::null_mut());
    }

When you set the pointer back to null before returning, doesn't that make it so that a future access of the key after the destructor has run will cause the key to be re-initialized because a null pointer looks like an uninitialized key rather than a destroyed one? This could happen if the destructor of another TLS key accesses this key, and that destructor is called after this key's destructor.

@eddyb
Copy link
Member

eddyb commented Sep 6, 2017

@joshlf It's tri-state, IIRC they go like this (conceptually, the actual types might differ):

  • uninitialized: Some(None)
  • initialized: Some(Some(x))
  • destruction started: None

@joshlf
Copy link
Contributor Author

joshlf commented Sep 6, 2017

@eddyb I believe you're right, but my concern is what happens after the destruction has finished (after which, as the comment that I quoted says, the pointer is set back from 1 to 0). The implementation of get makes it seem that a call to get that takes place after the destructor has returned (and thus the pointer is again set to 0) would think that the key was simply not yet initialized, and would thus initialize it again:

        pub unsafe fn get(&'static self) -> Option<&'static UnsafeCell<Option<T>>> {
            let ptr = self.os.get() as *mut Value<T>;
            if !ptr.is_null() {
                if ptr as usize == 1 {
                    return None
                }
                return Some(&(*ptr).value);
            }

            // If the lookup returned null, we haven't initialized our own
            // local copy, so do that now.
            let ptr: Box<Value<T>> = box Value {
                key: self,
                value: UnsafeCell::new(None),
            };
            let ptr = Box::into_raw(ptr);
            self.os.set(ptr as *mut u8);
            Some(&(*ptr).value)
        }

@alexcrichton
Copy link
Member

@joshlf

When you set the pointer back to null before returning, doesn't that make it so that a future access of the key after the destructor has run will cause the key to be re-initialized because a null pointer looks like an uninitialized key rather than a destroyed one?

Indeed!

We don't have much of an option here, though, unless we track this data elsewhere. If we leave it as 1 then the OS will continue to call the destructor function rather than returning.

@joshlf
Copy link
Contributor Author

joshlf commented Sep 6, 2017

OK, in that case I think I'll clarify that in the documentation, and add a TODO in case somebody comes along and figures out how to fix it.

EDIT: I see now that this is already documented in the doc comment on LocalKey.

@joshlf joshlf force-pushed the local-key-state branch 5 times, most recently from 1a14f08 to 1dbbfd6 Compare September 7, 2017 02:34
@joshlf
Copy link
Contributor Author

joshlf commented Sep 7, 2017

OK, it's ready for review. Just to recap:

  • The new version explicitly does not attempt to move into state Initializing before performing any allocation (although, as written, it happens to do this for the fast implementation), and there is a warning about using TLS for global allocators in the documentation.
  • This version uses two callbacks: get and register_dtor. The first returns a &UnsafeCell<LocalKeyValue<T>>, and the second registers the destructor for the fast implementation (it's a no-op in the OS implementation).
  • The performance of this implementation matches the performance of the implementation on master (using the benchmark I provided above).

@bors
Copy link
Contributor

bors commented Sep 7, 2017

☔ The latest upstream changes (presumably #43931) made this pull request unmergeable. Please resolve the merge conflicts.

if let &LocalKeyValue::Valid(ref inner) = &*slot.get() {
// Do this in a separate if branch (rather than just part of the
// match statement in the else block) to increase the performance
// of the fast path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this measurably help the codegen?

I'd naively expect not, butI'm curious if you have benchmarks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm actually not sure. I tried benchmarking it, but I'm not confident that my xargo command is testing what I think it is because I ran it with a version of the code that had a compiler error, and it still ran properly, so I'm not sure what artifact xargo is picking up to link as libstd. Here's what I'm running:

RUST_BACKTRACE=1 RUSTFLAGS='-C opt-level=3' XARGO_RUST_SRC=/path/to/rust/src/ xargo run --release

Given this command, it looks like there's not much of a difference, but I'm skeptical for the reasons mentioned. If you want to give it a try yourself (since I assume you're better at this whole build pipeline thing), it would be give me more confidence.


// Register a destructor for the value. register_dtor should be called
// after the key has been transitioned into the Valid state.
register_dtor: unsafe fn(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this still happen on the first call to get? If initialization fails that means we'll just run the destructor on a noop value, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid adding an extra branch in the hot path, and that would require branching in get to see whether the destructor had been registered already.

// Set inner to Destroyed before we drop tmp so that a
// recursive call to get (called from the destructor) will
// be able to detect that the value is already being dropped.
ptr::write((*ptr).inner.get(), __LocalKeyValue::Destroyed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mem::replace can be used here instead of read/write?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

let ptr = self.os.get() as *mut Value<T>;
if !ptr.is_null() {
if ptr as usize == 1 {
return None
// The destructor was already called (and set self.os to 1).
return &*(&self.dummy_destroyed as *const _);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come get no longer takes 'static self to require these casts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, honestly I think I wasn't sure what to do there so I just copied what mod fast does, but of course that doesn't make sense because mod os uses a truly static value. I'll change it back.

@alexcrichton
Copy link
Member

Hm ok so looking into the perf impact here, I'm getting more and more wary about this change...

Right now the try_with function can't be inlined presumably because it's recursive, so there's an unconditional function call on calls to with where there isn't today. Using just a bare match and fixing the recursion produces an almost optimal set of IR, except that now all calls to with are unconditionally flagged as "this can panic" because a 'being initialized' stat can get encountered, triggering the panic.

Today, however, calls to with only panic if the closure panics (in the optimal fast case), and it appears everything is otherwise inlined.

In other words, I think that fundamentally adding an "initializing" state is going to come at a perf hit to all thread locals. With that in mind, I wonder if there are perhaps other methods to tackle this for your use case? I've always imagined that thread_local! is at a much higher level of abstraction that a global allocator because thread_local! itself will allocate internaly.

@joshlf
Copy link
Contributor Author

joshlf commented Sep 7, 2017

Today, however, calls to with only panic if the closure panics (in the optimal fast case), and it appears everything is otherwise inlined.

Can't with already panic (in today's nightly) if the key is in state Destroyed? The current source of with:

    pub fn with<F, R>(&'static self, f: F) -> R
                      where F: FnOnce(&T) -> R {
        self.try_with(f).expect("cannot access a TLS value during or \
                                 after it is destroyed")
    }

With that in mind, I wonder if there are perhaps other methods to tackle this for your use case? I've always imagined that thread_local! is at a much higher level of abstraction that a global allocator because thread_local! itself will allocate internally.

I already decided to give up on making this work for allocation (see this comment). However, this is still useful for other recursive initialization dependencies.

@alexcrichton
Copy link
Member

I commented awhile ago with this example which generates this IR for a FOO.with(|a| *a) where a is just a Cell<u32>. There aren't any panics there.

In general yes, with can panic, but it requires destructors, I believe, which aren't always set in some cases.

I already decided to give up on making this work for allocation (see this comment). However, this is still useful for other recursive initialization dependencies.

In that case I'm somewhat inclined to close this? This PR isn't needed for you use case, is technically a regression in behavior, and is currently a regression in performance.

@joshlf
Copy link
Contributor Author

joshlf commented Sep 7, 2017

OK, fair enough. I'll open a PR to clarify that TLS constructors cannot depend recursively on the values that they construct, but other than the case of allocators (and perhaps some weird dependency injection scenarios), this should only arise within a given crate, in which case it's more reasonable to ask authors to just architect their stuff differently.

@joshlf
Copy link
Contributor Author

joshlf commented Sep 7, 2017

PR added: #44396

@alexcrichton
Copy link
Member

Ok, closing in favor of #44396 in that case

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants