
Feature req: Partitioning actions to execute on separate nodes #3096

Open
vaishalipavashe opened this issue Dec 5, 2016 · 15 comments

@vaishalipavashe

If actions could be partitioned like sensors, we could run actions and sensors on a separate instance, possibly in a separate network zone, where they perform actions on nodes in that zone and report results back to the main instance in a less secure zone where other services like the WebUI and Slack integration are available.

@Kami
Member

Kami commented Dec 5, 2016

Thanks - yes, this is a good feature request and something we have already discussed in the past (runner affinity would come in especially handy for Python runner actions, etc.).

We don't have an ETA for it right now, but I do think it's something we will look into in the future.

@codyaray
Contributor

My use case is an action-chain workflow where task1 writes a file and task2 uses that file. Our StackStorm cluster uses 4 nodes, so task2 can't see the files created by task1 when the tasks run on different nodes. More concretely, task1 calls openssl to write out a key and CSR, and task2 emails those files to the requester.

A couple of workarounds are to (a) use a shared filesystem or (b) pass everything as variables instead of files. Right now I'm trying to rework the workflow for (b) but haven't finished it yet.
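For anyone hitting the same thing, a minimal sketch of workaround (b): keep the key and CSR in shell variables instead of files, so no shared filesystem is needed between action runners. The subject name is illustrative.

```shell
# Generate an RSA key and a CSR entirely in shell variables (no files on disk).
# The CSR value can then be passed between tasks as an action parameter.
KEY="$(openssl genrsa 2048 2>/dev/null)"
CSR="$(printf '%s\n' "$KEY" | openssl req -new -key /dev/stdin -subj '/CN=test.example.com' 2>/dev/null)"
# First line of the CSR PEM block:
printf '%s\n' "$CSR" | head -n 1
```

The `$CSR` value can then be published as a task result or parameter instead of a file path.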

@Kami
Member

Kami commented Dec 27, 2016

There is another option, (c), that we ourselves use sometimes - use the datastore when possible :)

Although yeah, if it makes sense, b) is actually preferred since passing things around as parameters usually makes things more re-usable and easier to test.
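For reference, option (c) in action-chain terms might look roughly like the sketch below. This is illustrative only: it assumes the `st2.kv.set` action from the "st2" pack, a hypothetical `mypack.send_email` action, and illustrative key names and Jinja expressions.

```yaml
# Hypothetical action-chain sketch: task1 generates a CSR, stores it in the
# datastore, and a later task reads it back via the st2kv prefix.
chain:
  - name: generate_csr
    ref: core.local
    parameters:
      cmd: "openssl req -new -key /path/to/key.pem -subj '/CN=example.com'"
    on-success: store_csr
  - name: store_csr
    ref: st2.kv.set          # assumes the kv actions from the "st2" pack
    parameters:
      key: request_csr
      value: "{{ generate_csr.result.stdout }}"
    on-success: email_csr
  - name: email_csr
    ref: mypack.send_email   # hypothetical action
    parameters:
      body: "{{ st2kv.system.request_csr }}"
```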

@rehmanzile

👍 Need it!

@nmaludy
Member

nmaludy commented Aug 13, 2017

Wanted to add my use case for this and maybe some other features that may follow as a result.

Use Case 1 : Multiple Environments Same Datacenter

In a given datacenter we may have N environments present. Currently we are forced to deploy in one of two ways:

  1. A shared deployment where all actionrunner instances have access to all machines in every environment because we don't know which actionrunner will receive the tasks.

  2. Individual/isolated deployments. For some environments where we are not allowed to share infrastructure (usually for security reasons) we have another full StackStorm deployment where the actions are copied.

We would like to minimize the security impact of option 1 by only requiring a small number of instances (maybe 1-2) to have access to specific environments. This might also help convince security that this type of infrastructure would be OK to serve some of our more sensitive areas.

Use Case 2 : Multiple Datacenters

When deploying in a hybrid cloud or multi-datacenter fashion, we currently have the same two options as above.

  1. Have a single StackStorm deployment in an arbitrary datacenter, then perform all actions in other datacenters over the WAN. For datacenters that are geographically close this can be OK, but it's not ideal. For datacenters that are across the country or overseas, this incurs a pretty big performance hit by running SSH, WinRM, and HTTP calls over the WAN.

  2. Have an isolated StackStorm deployment in every datacenter. This solves the latency and throughput problem by having resources and workers local to the datacenter where actions need to occur. However, it adds a great deal of complexity in terms of workflow design and message-routing logic. Now all workflows need to start by determining where the API call(s) and resources reside, then branch and either run the workflow locally or invoke that workflow on the StackStorm instance in another datacenter. This compounds when part of the workflow needs to run in Datacenter A and part in Datacenter B, then wait for both of those results, then run a final part back in Datacenter A again.

Ideal World

Personally, I would like to manage only one StackStorm instance/cluster and spread components out as needed to various datacenters and environments. I know this comes with its own set of challenges, both for StackStorm and for the components underneath StackStorm and Mistral (RabbitMQ, MongoDB, PostgreSQL, etc.).

Maybe if multiple StackStorm instances could somehow "cooperate" with each other in a more seamless fashion, that might help a little. However, there is still the drawback that things like the datastore would need to be synchronized across deployments.

@jlejeune

Need it too!

I'm thinking of a solution based on a centralized StackStorm instance, with several "workers" deployed wherever we want, as long as they can reach the st2api, RabbitMQ, and Postgres servers.

My use case is almost the same as use case 2 from @nmaludy.

I need to deploy some "workers" in a specific datacenter with specific network access, and I don't want to deploy all StackStorm services there.

@LindsayHill
Contributor

Related note - the single-sensor-per-container mode lets you have a form of affinity for sensors. https://docs.stackstorm.com/reference/ha.html#st2sensorcontainer
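Sensor partitioning is configured per node in st2.conf. A hedged sketch, based on the HA docs linked above (the node name and provider value are illustrative; check the docs for the exact syntax supported by your version):

```ini
; Assign a stable name to this sensor node and pick a partition provider,
; so only the sensors mapped to this node run here.
[sensorcontainer]
sensor_node_name = sensors-zone-a
partition_provider = name:kvstore
```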

@djha-skin

Need it also.

I have a StackStorm high-availability setup, with StackStorm installed on two virtual machine nodes.

My use case is that I want to install a StackStorm pack on both HA nodes. The problem is that when I run pack install, it puts an execution request on RabbitMQ through the API. One of the two HA nodes then picks it up and runs it - not necessarily the node I want it to run on. This installs the pack (and therefore the pack's files) on one of the nodes, but not necessarily on both.

I want to be able to put an action on the queue and force StackStorm to run it on a particular node, so that I can force the pack to install correctly on both.

@LindsayHill
Contributor

Pack install has a few other things to cover. I think "want st2 pack install to work in an HA environment" should be treated as a separate issue from "I want to be able to run an action on a specific action runner for networking or security reasons".

@troshlyak

We are also exploring this option (node affinity based action execution) and would love to have it.

Our use case is similar to the ones mentioned above - we have multiple datacenters, and we would ideally want to manage a single StackStorm instance with deployments in the different datacenters, where we can run actions specific to a given datacenter (for latency/performance gains and availability, not security concerns).

I've tried achieving this by "merging" only the MongoDB service between the different instances. I thought this would give us common configuration (pack config, kv datastore, etc.) and history between the instances, while still allowing us to target a specific instance by using the API of a specific node when running actions/workflows (as RabbitMQ is "local" to the datacenter instance). In practice, though, we've noticed that even if we fire an action through the API of instance A, it can eventually be picked up and executed by runners in instance B, which would mean the initial request for action execution traverses MongoDB. Maybe someone with more in-depth knowledge of the internal communication between StackStorm's components can explain why this happens.

@LindsayHill
Contributor

and would love to have it

I don't know of anyone else working on this, or anyone planning to do so. PRs welcome - that's the only way it's going to get implemented. It's probably better to open an issue with a proposed design first, before doing too much work. There are a few different ways this could be done, and I'm sure people will have opinions on how to do it.

I've tried achieving this by "merging" only the mongodb service between the different instances, where I thought that this would give us common configuration (pack config/kv datastore etc.) and history between the instances,

I'm fairly sure the internal system design never considered having a combined DB but separate instances of everything else, especially RabbitMQ. I'm not surprised by the results you got.

@punkrokk
Member

punkrokk commented May 2, 2019 via email

@LindsayHill
Contributor

Set something up. Write up some proposals, figure out a suitable forum for discussion. If that means a call, figure out a time that works for interested parties.

@minsis
Contributor

minsis commented Mar 10, 2021

It's been a few years on this topic. Kind of curious whether any ideas or designs were proposed somewhere?

@aliasboink

Hello, it's been a few more years on this. Is this in development by anyone and/or has anything been posted regarding it?
