
[Improvements] - Use Concurrent::FixedThreadPool to control compressors threads count #376

Merged · 3 commits into rails:master · Sep 6, 2016

Conversation

lucasallan
Contributor

Some improvements over #339

@hansottowirtz
Contributor

Works for me!

@schneems schneems merged commit 8133018 into rails:master Sep 6, 2016
@schneems
Member

schneems commented Sep 6, 2016

Thanks!

@lucasallan lucasallan deleted the thread-pool branch September 6, 2016 19:07
@matthewd
Member

matthewd commented Sep 7, 2016

This is still capable of hanging 😕

@lucasallan
Contributor Author

@matthewd I believe that's a different issue; with this PR we're just limiting the number of threads we create, thereby reducing context switching between threads.

For the hanging threads, we can add a timeout and use it via Concurrent::Future#wait_or_cancel(timeout).

I can implement it, we just need to sort out the details.

Thoughts? @schneems @matthewd
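The timeout idea above could be sketched with the Ruby stdlib alone; `run_with_timeout` is a hypothetical helper name for illustration, not Sprockets or concurrent-ruby API (`Concurrent::Future#wait_or_cancel` would play the equivalent role in the real code):

```ruby
# Hypothetical stdlib-only sketch of a per-task timeout.
def run_with_timeout(timeout)
  result = nil
  worker = Thread.new { result = yield }
  if worker.join(timeout)   # join returns the thread on success, nil on timeout
    result
  else
    worker.kill             # abandon the hung task
    raise "task exceeded #{timeout}s"
  end
end

puts run_with_timeout(1) { 21 * 2 }   # => 42
```

A hung compressor would then raise with a timeout message instead of stalling the whole build silently.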

@matthewd
Member

matthewd commented Sep 7, 2016

I don't know of any existing problem that could cause hangs (though pathological inputs can certainly cause issues inside individual transformation steps, that's not really a Sprockets problem)... and I don't see any benefit to introducing a timeout.

I was specifically claiming that this PR introduced a possible deadlock -- I've just checked and discovered concurrent-ruby offers a stronger ordering guarantee than I was expecting, so my concern scales back to the fact we're doing half as much concurrent work as we should be: we're only allowing N tasks to run concurrently, but half of them are doing nothing but waiting for others.

Do we need to shut down the executor when we're finished? At a glance it looks like we do.

@schneems
Member

schneems commented Sep 7, 2016

Do we need to revert this?

On the timeout front, there is one pathological case I see frequently enough at Heroku to warrant attention: when there is an asset that cannot be built, the build gets stuck. It's impossible to figure out which asset it is, since logging only happens after an asset is written and there are no timeouts. I would like to address this eventually, but I believe it's unrelated to the issue at hand.

@lucasallan
Contributor Author

If these threads are not being used for CPU-bound operations (which seems to be the case), then yes, we should revert it.

Otherwise, for CPU-bound operations, I think we should keep a local thread pool sized to the number of CPUs.

@lucasallan
Contributor Author

Line 187 is an IO-bound operation, so it should use the global thread pool.
Line 200 is a mix of IO-bound and CPU-bound work (compression), so it should use the FixedThreadPool.

Thoughts?

If you agree with that, I'll go ahead and open a PR with the fix (I already have it ready).
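The CPU-bound side of that split can be sketched with the stdlib alone; `TinyFixedPool` is a hypothetical name, standing in for `Concurrent::FixedThreadPool`, and `Etc.nprocessors` sizes it to the CPU count. It also includes the explicit shutdown matthewd asked about:

```ruby
require "etc"

# Hypothetical minimal fixed-size pool, stdlib only.
class TinyFixedPool
  def initialize(size = Etc.nprocessors)
    @queue   = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        while (job = @queue.pop)   # a nil sentinel ends the loop
          job.call
        end
      end
    end
  end

  def post(&job)
    @queue << job
  end

  def shutdown
    @workers.size.times { @queue << nil }   # one sentinel per worker
    @workers.each(&:join)
  end
end

pool    = TinyFixedPool.new(2)
results = Queue.new
4.times { |i| pool.post { results << i * i } }
pool.shutdown
p results.size   # => 4
```

The point of the explicit `shutdown` is that pool threads otherwise outlive the compile step, which is the leak concern raised above.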

@matthewd
Member

matthewd commented Sep 8, 2016

My concern is more about "spending" a CPU on the nothing-bound operation on line 201.

Promises might allow us to do better there.

@lucasallan
Contributor Author

It might occur with or without these changes. I think we should first use the global thread pool for the IO operation and keep using the FixedThreadPool for the CPU-bound work, as proposed above.

After that, we can discuss the deadlock case. Thoughts?

@schneems
Member

schneems commented Sep 8, 2016

Here's what @matthewd is describing, in a few more words:

Here's the logic we previously had: we spin up a new thread to write a file, and immediately spin up another thread to compress that file. If the file hasn't been written yet, the second thread must wait on the first. In the case where we have essentially unlimited threads this is fine; having one thread wait on another adds little overhead beyond the creation of the thread.

Here's the logic we now have: create N threads. We pull out a thread to write a file, leaving N - 1 threads available for work. Immediately we pull out another thread to compress that file, leaving N - 2. If the file hasn't been written yet, the second thread must wait on the first. We may therefore have a large number of threads doing nothing but waiting on others, each occupying a pool slot that could otherwise be writing another file. With N = 4, we would have 2 threads writing files while 2 threads sit and do nothing for a time.
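The wasted slot can be demonstrated with plain stdlib threads (hypothetical names; the comments map each thread to a pool slot):

```ruby
# Sketch of the inefficiency: the compressor occupies a second "slot"
# only to block until the writer slot finishes.
done_writing = Queue.new

writer = Thread.new do      # slot 1: real work
  # ... stand-in for asset.write_to(target) ...
  done_writing << :written
end

compressor = Thread.new do  # slot 2: blocks, doing nothing, until slot 1 is done
  done_writing.pop          # waits on the writer
  "compressed"              # stand-in for gzip.compress(target)
end

writer.join
puts compressor.value       # => compressed
```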

He's proposing this as a fix to the inefficiency:

Concurrent::Promise.new { write_the_file }.then { compress_the_thing }.execute

@hansottowirtz
Contributor

Can't we just:

Concurrent::Future.execute(executor: executor) {
  asset.write_to target
  gzip.compress target
}
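Fleshed out with the stdlib (hypothetical helper and file names; `Zlib::GzipWriter` stands in for the gzip exporter), the combined task runs both steps sequentially on a single thread, so each asset costs exactly one pool slot:

```ruby
require "zlib"
require "tmpdir"

# Hypothetical combined write-then-compress task, stdlib only.
def write_and_gzip(path, contents)
  File.write(path, contents)                               # stand-in for asset.write_to(target)
  Zlib::GzipWriter.open("#{path}.gz") { |gz| gz.write(contents) }
  "#{path}.gz"
end

Dir.mktmpdir do |dir|
  task = Thread.new { write_and_gzip(File.join(dir, "app.js"), "alert(1);") }
  puts File.basename(task.value)   # => app.js.gz
end
```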

@lucasallan
Contributor Author

Got it. I'll work on a proposal and get back to you. How does that sound?

@lucasallan lucasallan restored the thread-pool branch September 8, 2016 21:18
@lucasallan lucasallan mentioned this pull request Sep 8, 2016
@schneems
Member

It doesn't look like calling Promise#wait! does what either of us thought it would (ruby-concurrency/concurrent-ruby#575). Also, I'm working in parallel on another PR that happens to touch this code for exporting assets.

@schneems
Member

Wrong PR

@wjordan wjordan mentioned this pull request Apr 12, 2017