
Remove futures_cpupool in favor of tokio #7

Closed
wants to merge 3 commits

Conversation

jonhoo

@jonhoo jonhoo commented Sep 30, 2018

Previously, a dedicated cpupool was spun up to run hashing and verification in a non-blocking fashion. This is unfortunate given that there is usually already a threadpool running: the Tokio threadpool. This patch changes the code to run non-blocking hashing and verification on tokio's threadpool instead, to avoid configuring and creating a separate one. It uses tokio_threadpool::blocking to ensure that it does not hold up other Futures while doing so.

Note that this removes the ability to configure the number of threads used to perform non-blocking compute, and instead places that responsibility on whoever sets up the tokio::Runtime. This is probably better, but it does change the API. Furthermore, the non_blocking methods now only work under tokio (actix uses tokio, by the way), and will not work in non-tokio asynchronous deployments (although those should be rare).
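
For context, the tokio 0.1-era pattern this PR relies on looks roughly like the sketch below. This is not the PR's actual code: `expensive_hash` is a hypothetical stand-in for argonautica's hasher, and building it requires the old `futures` 0.1 and `tokio-threadpool` 0.1 crates.

```rust
use futures::future::poll_fn; // futures 0.1
use futures::Future;
use tokio_threadpool::{blocking, BlockingError};

// Hypothetical stand-in for the CPU-heavy argon2 hash.
fn expensive_hash(password: &str) -> String {
    format!("hashed:{}", password)
}

// Run the hash on tokio's own threadpool without stalling other futures.
// `blocking` returns NotReady until a blocking slot is available, then runs
// the closure on the current worker thread while another thread takes over
// that worker's task queue.
fn hash_non_blocking(
    password: String,
) -> impl Future<Item = String, Error = BlockingError> {
    poll_fn(move || blocking(|| expensive_hash(&password)))
}
```

The returned future only makes progress when polled from inside a tokio threadpool; polling it elsewhere is exactly the failure mode discussed later in this thread.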

@michiel-de-muynck

I'm not the author of this crate, but I do have some concerns over this PR:

  • It makes argonautica unconditionally depend on Tokio. That's a very heavy dependency, pulling in all kinds of code (tokio-tcp, tokio-udp, tokio-fs, ...) that is completely unrelated to hashing a password. futures-cpupool, on the other hand, is very light. A feature gate, or making the Tokio dependency optional, would be better.

  • non-tokio asynchronous deployments [...] should be rare

    I don't know if I agree with that. Command-line tools or desktop password managers may want to hash large numbers of passwords in parallel without pulling in Tokio and all of its networking code. WebAssembly is on the rise, and Tokio doesn't work on wasm. Granted, futures-cpupool doesn't either, but I still wanted to mention wasm as another example that using futures does not have to imply using Tokio.

  • How bad is it to have two cpu-pools running anyway? Tokio's tokio_threadpool::blocking can spin up as many as 100 threads by default, so how bad are a few more? In fact, it might actually be better to have this dedicated cpu-pool running alongside tokio's threadpool, because this cpu-pool has (by default) exactly as many threads as there are logical cores, which makes sense when the task they are doing is cpu-bound (hashing passwords). On the other hand, tokio_threadpool::blocking can be used for all kinds of different purposes, both cpu-bound and non-cpu-bound (e.g. working asynchronously with blocking APIs). You don't want 100 threads hashing passwords on a quad-core CPU.

@jonhoo
Author

jonhoo commented Oct 2, 2018

I agree! It would make sense to move the futures/non-blocking stuff under a futures feature so that tokio doesn't get pulled in each time. It could be that rustasync/team#56 leads to a "generic" async spawn we could depend on as well, but that's a long way off, if it happens at all.
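
The feature gate suggested here could look roughly like this in Cargo.toml. This is a sketch, not from the PR; the feature name and version numbers are illustrative:

```toml
[features]
default = []
# Hypothetical feature gating the non_blocking API behind tokio.
non-blocking = ["futures", "tokio-threadpool"]

[dependencies]
futures = { version = "0.1", optional = true }
tokio-threadpool = { version = "0.1", optional = true }
```

The `non_blocking` methods would then be wrapped in `#[cfg(feature = "non-blocking")]`, so plain synchronous users pay nothing for tokio.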

Command-line tools or desktop password managers may want to hash large amounts of passwords in parallel without wanting to pull in Tokio and all of its networking code.

If all they want to do is hash in parallel, then I think they should do so using rayon. For that particular use-case, there is no reason to use futures at all. The big advantage of non-blocking is that you can do other things at the same time (primarily I/O), but if the application doesn't want to do other things (like continue to run an event loop), then they may as well just run the hashes in parallel, which rayon provides super-simple mechanisms for.
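
To illustrate the point: with rayon the whole batch collapses to a one-line `par_iter().map(...).collect()`. The same idea using nothing but today's standard library looks like the sketch below, where `slow_hash` is a hypothetical stand-in for a CPU-bound password hash:

```rust
use std::thread;

// Hypothetical stand-in for a CPU-bound password hash.
fn slow_hash(password: &str) -> String {
    password.chars().rev().collect()
}

// Hash a batch of passwords in parallel, one scoped thread per password.
// With rayon this whole function becomes roughly:
//     passwords.par_iter().map(|p| slow_hash(p)).collect()
fn hash_all(passwords: &[&str]) -> Vec<String> {
    thread::scope(|s| {
        let handles: Vec<_> = passwords
            .iter()
            .map(|p| s.spawn(move || slow_hash(p)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    assert_eq!(hash_all(&["hunter2"]), vec!["2retnuh"]);
}
```

No event loop, no futures: if there is nothing else to overlap with, plain parallelism is all the application needs.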

How bad is it to have two cpu-pools running anyway?

It depends on how you count "badness". The user now needs to configure two threadpools instead of one, and also has to figure out how many threads to give each one. If you instead have just one pool, you can let that pool figure it out on its own. It also means we avoid making the user run two threadpools that behave differently, and that may have to be debugged separately if something misbehaves.

because this cpu-pool has (by default) exactly as many threads as the number of logical cores, which makes sense when the task they are doing is cpu-bound (hashing passwords)

I'm not sure I fully buy this argument. The whole point of non-blocking is that you are also doing other things, and you need to make progress on those too. If you gave all the cores to compute-heavy tasks, then you wouldn't make any progress on those other asynchronous tasks. If indeed all you want to do is hash in parallel, then rayon would be the way to go.

You don't want to have 100 threads hashing passwords on a quad core CPU.

This is true. You can set the number of threads in a tokio pool that are allowed to be blocking at any given time, and in general those should all be compute-bound. But, as you observe, blocking can also be used to wrap otherwise-blocking APIs that do I/O or the like, in which case you would now either not be using all your cores for hashing, or you'd potentially be using too many threads for hashing.
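
The knob being discussed is the threadpool builder's blocking-thread cap. A sketch against the old tokio-threadpool 0.1 API (the numbers are illustrative, and this requires the deprecated `tokio-threadpool` and `futures` 0.1 crates):

```rust
use futures::Future; // futures 0.1, for .wait()
use tokio_threadpool::Builder;

fn main() {
    // pool_size: worker threads for async tasks.
    // max_blocking: how many blocking() sections may run concurrently;
    // calls beyond the cap simply wait their turn.
    let pool = Builder::new()
        .pool_size(4)
        .max_blocking(4)
        .build();

    // ... spawn hashing futures onto `pool` here ...

    pool.shutdown_on_idle().wait().unwrap();
}
```

Capping max_blocking near the core count keeps a flood of hash requests from spawning the default 100 blocking threads, at the cost of also throttling any I/O-ish blocking() users sharing the pool.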

@bcmyers
Owner

bcmyers commented Oct 2, 2018

Hey, author here. First, let me say it’s so awesome, jonhoo, that you chose to work on my crate on your livestream! I’m a big fan of your videos and have learned a ton from them. Really appreciate what you’re doing. Second, I’d like to apologize for not engaging sooner, but I have an excuse :). I moved into a new apartment yesterday, so my time has been fully taken up with the move. I probably won’t be able to fully engage on this until the end of the week, as the focus right now is unpacking, buying missing furniture, and all the other tasks that come with moving into a new place. In any case, thanks so much for the PR, jonhoo, and for your engagement, Migi! I look forward to plugging in soon.

@jonhoo
Author

jonhoo commented Oct 2, 2018

No worries at all! We all have lives, and they take precedence :) Glad you enjoyed the streams!

@michiel-de-muynck

Yeah, no worries @bcmyers. Life comes first! And also, 2 days is not a long time; you certainly don't need to apologize for that.

Actually, I have another small nit to pick about the PR. I'm sorry @jonhoo if this comes across as me trying to bring you down or something; that is absolutely not the intention. I enjoyed your livestream a lot! But I did just spend the past few hours thinking about threadpools and tokio's blocking(), and noticed that blocking() doesn't actually panic when tokio isn't running (the doc comments in the PR say that it does). Instead, it returns a BlockingError.

I tried testing this in the code (the examples example_non_blocking.rs and example_actix_web.rs weren't actually updated to start a tokio threadpool), but I couldn't get argonautica to build (the build script threw a file-not-found error; I think it's because I'm on Windows and something is probably wrong with my C/C++ build setup).

@bcmyers
Owner

bcmyers commented Oct 2, 2018

@migi - Still can’t fully engage (am writing from my mobile), but I have a quick thought. Did you run “git submodule init”? If not, that might be your issue, as just cloning the repo won’t bring in the C code submodule; you need to initialize it.

@jonhoo
Author

jonhoo commented Oct 2, 2018

@migi Don't worry about it -- not taking it personally :) The goal for both of us is to make the software better!

You're right that we won't actually panic as the doc comment currently states. Instead, it will return ErrorKind::PoolTerminated, which is probably the right thing to do :) We could improve the context of that error to say that the pool may simply not be running, though.

@carllerche

carllerche commented Oct 3, 2018

I don't see why depending on tokio here is required. It should be possible to only depend on the sub crates that are needed (making the dependency lighter).

Edit: If the goal is spawning, tokio-executor provides the spawn function and DefaultExecutor type.

We got rid of tokio::spawn, so we now only need tokio-threadpool!
@jonhoo
Author

jonhoo commented Oct 3, 2018

Ah, good point! Fixed in 61b447c

@mehcode

mehcode commented Dec 27, 2018

@bcmyers Any chance of getting this merged in?

@mehcode

mehcode commented Dec 27, 2018

@jonhoo I get PoolTerminated when trying to use this. I'm using actix-web with actix-web-async-await.

@bcmyers
Owner

bcmyers commented Mar 5, 2019

My apologies everyone for letting this PR fall by the wayside. @mehcode, I was also having problems getting it to work with actix-web, which is why I didn't merge it originally. I just never had the time to dig in. I'm going to close this now as I think it'll be better to wait for all the new async stuff to land.

@bcmyers bcmyers closed this Mar 5, 2019
5 participants