
[3.x] Max workers per queue for auto-scaling (simple FIFO) #670

Closed
wants to merge 19 commits

Conversation

@LasseRafn
Contributor

LasseRafn commented Sep 20, 2019

This will allow you to set a maxQueueProcesses config option on the supervisor, which limits the number of processes for each queue in the supervisor.

Think of this as a simple way to, for example, implement FIFO: set the option to 1 but keep the supervisor's max processes high. The supervisor would then never scale beyond one process per queue, but the total number of running processes could be as high as queues * maxQueueProcesses (3 queues * a maxQueueProcesses of 1 = 3 processes in the supervisor).
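
For illustration, a supervisor config for that three-queue example could look roughly like this (queue names and numbers are made up; maxQueueProcesses is the key added by this PR, the rest is a normal Horizon 3.x supervisor config):

// config/horizon.php (sketch)
'environments' => [
    'production' => [
        'supervisor-1' => [
            'connection' => 'redis',
            'queue' => ['queue-a', 'queue-b', 'queue-c'],
            'balance' => 'auto',
            'processes' => 15,        // the supervisor may still scale up to 15 processes in total
            'maxQueueProcesses' => 1, // ...but never more than one process per queue (new, opt-in)
            'tries' => 3,
        ],
    ],
],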

It is especially useful for a multi-tenant setup where you'd have maybe a queue per tenant, and want a total of X processes, but still only want N processes for each tenant/queue.

I can't think of any use case besides (naive) FIFO for setting maxQueueProcesses to 1.

It is implemented as opt-in, so it shouldn't break anything for anyone who doesn't add the config key.

This does not work with balance => false.

A test has been added as well.

(Naming and implementation might be in need of a change!)

@taylorotwell
Member

Can you tell me more about how you manage a queue per tenant? How do you define them in your Horizon configuration?

@LasseRafn
Contributor Author

Sure thing, @taylorotwell!

As of right now it's fairly hacky; I'm still implementing this. I made a command that I run instead of php artisan horizon, currently named php artisan run-horizon, which replicates the horizon command but first calls this:

// Register one transactions queue per tenant on the production supervisor.
config()->set(
    'horizon.environments.production.transactions.queue',
    Tenant::all()->map(function ($tenant) {
        return 'transactions:' . $tenant->id;
    })->values()->toArray()
);

(modified for the sake of example)

So what it does is configure a queue in the supervisor per tenant, and then run Horizon as usual. When a new tenant is created, I would call horizon:terminate, and my process supervisor would call run-horizon again, at which point the newly created tenant would also get a queue.

Again, this feels hacky, and I have yet to test it with our ~700 tenants. I'm still not sure this is the right approach, but it's the only way I've found so far 🤔
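
Roughly, the wrapper command looks something like this (a simplified sketch; the Tenant model, its namespace, and the command name are specific to my app and not final):

<?php

namespace App\Console\Commands;

use App\Tenant;
use Illuminate\Console\Command;

class RunHorizon extends Command
{
    protected $signature = 'run-horizon';

    public function handle()
    {
        // Register one transactions queue per tenant before Horizon boots.
        config()->set(
            'horizon.environments.production.transactions.queue',
            Tenant::all()->map(function ($tenant) {
                return 'transactions:' . $tenant->id;
            })->values()->toArray()
        );

        // Then hand off to the normal horizon command.
        $this->call('horizon');
    }
}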

@driesvints changed the title from "Max workers per queue for auto-scaling (simple FIFO)" to "[3.x] Max workers per queue for auto-scaling (simple FIFO)" on Sep 24, 2019
@driesvints
Member

driesvints commented Sep 27, 2019

@LasseRafn I fixed the tests on the 3.x branch so you might want to merge that in and also fix the StyleCI issues.

@taylorotwell
Member

See @driesvints' comment about tests.

@rauanmayemir

@LasseRafn This is really exciting. I have a strong need for 'exclusive' FIFO and have just started looking into the Horizon source code; can I help you with this?

Is there ever a case where you'd need to set maxQueueProcesses to something other than 1?

I was thinking about something like 'balance' => 'fifo' and probably setting a wildcard queue like ['transactions:*'], but your runtime config hack would work, too. 😁
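
Something like this, purely as a sketch of the config shape (neither a fifo balance mode nor queue wildcards exist in Horizon today; both are hypothetical):

'transactions' => [
    'connection' => 'redis',
    'queue' => ['transactions:*'], // hypothetical wildcard: one queue per tenant
    'balance' => 'fifo',           // hypothetical new balance mode
    'processes' => 10,
],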

@LasseRafn
Contributor Author

@driesvints thanks! :) I'll check

@rauanmayemir I think that's a good point. I've been thinking about this too, and figured I'd rather just keep it flexible. BUT since balance also has to be set to auto (and maybe simple too?) but never false, maybe having it be fifo is clearer and avoids confusion.

I would LOVE a wildcard, but I have no clue if that works? 😬

One concern I have is that in a load-balanced environment, two or more supervisors would be set up. I wonder if that would mean my hard limit of one worker would suddenly be one PER SUPERVISOR. I have yet to test this.

@rauanmayemir

It's definitely going to be a per-supervisor cap, but it's a start.

A globally unique cap could be achieved with some adjustments to SupervisorRepository; it would potentially allow splitting tenants between masters.

@LasseRafn
Contributor Author

Yeah, there'd need to be a limit for that; otherwise it kind of defeats the purpose 😬

Also, apologies for the many commits. StyleCI and I weren't good friends.

@driesvints
Member

@LasseRafn maybe squash your commits

@LasseRafn
Contributor Author

good idea 👍

I'm currently working on a change which will:

  1. Redo the implementation to add the new fifo balancing type.
  2. Check all active supervisors to see whether they have active workers running for the queue, and if so, not spin up another worker.

Point 2 should combat my worry about load-balanced setups, and I'm thinking it could maybe be used to "fix" #470.

Right now I'm just testing the implementation, and I'll share my thoughts and code asap. The gist of it is:

When we check how many processes to assign to a queue in an individual supervisor: IF the balancing method is fifo, check all active supervisors for their active processes, and if the queue already has processes elsewhere, assign zero to this supervisor.
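
Roughly like this (names are placeholders, nothing is final; I'm assuming the records returned by SupervisorRepository::all() expose a per-queue process count, as Horizon stores in Redis):

// Sketch: should a fifo-balanced supervisor be allowed to start a process
// for $queue, given what the other supervisors are already running?
protected function queueHasProcessesElsewhere($queue, $currentSupervisor, $supervisors)
{
    foreach ($supervisors as $supervisor) {
        if ($supervisor->name === $currentSupervisor) {
            continue; // ignore the supervisor we're currently scaling
        }

        // Assumed shape: per-queue process counts on each supervisor record.
        $processes = (array) $supervisor->processes;

        if (($processes[$queue] ?? 0) > 0) {
            return true; // another supervisor already works this queue: assign zero here
        }
    }

    return false;
}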

My worry is the performance implications of checking all supervisors every time we decide how many workers to assign. Also, adding SupervisorRepository calls into the AutoScaler seems a bit like mixing concerns?

The logic above could maybe solve #470 with a flag (irrelevant to this PR or FIFO!) that is basically "max workers globally" (or whatever the naming ends up being), so that if it is set to 10 workers, that number is never exceeded no matter the number of supervisors? Anyway, that's another discussion and, again, irrelevant to this PR.

I'll show my solution once it's working and stable on my machine.

@taylorotwell
Member

This has some failing tests. But I think I will hold off on this for now. The queue logic around Horizon is already complicated, and I don't want to add another layer of complication that I have to maintain.
