Assumptions about queue message for a server are broken in pods #19957

Open
carbonin opened this issue Mar 12, 2020 · 4 comments

@carbonin
Member

Currently, when running ManageIQ in pods from https://github.com/ManageIQ/manageiq-pods, workers associated with the same "server" don't actually share a process space or filesystem because they run in different containers.

This upends the assumption that process- or filesystem-level shared resources will be accessible to all workers on the same server. We rely on that assumption when we queue messages targeted at a specific server.

An example of this is the AnsibleRunnerWorkflow class. This job queues each of its states to avoid tying up a worker while the ansible playbook runs, but those queue messages are pinned to a specific server, not to a particular worker. This means that when running workers as separate containers, we could clone the playbook repo in one container and then try to run a playbook from that repo in a separate one. In the course of a single playbook run, at least three messages are sent over the queue between cloning the repo and cleaning it up. It's very unlikely that all of those messages get picked up by the same worker that queued the first one, so it's fair to consider this feature completely broken.
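
For context, the pinning happens because each state transition is re-queued with the originating server's GUID. Roughly, it looks like the sketch below (simplified and from memory, not the exact code in AnsibleRunnerWorkflow; the option names follow MiqQueue.put):

```ruby
# Simplified sketch of how a workflow state gets re-queued; the real method
# lives in the Job state machine code and may differ in detail.
def queue_signal(*args, deliver_on: nil)
  MiqQueue.put(
    :class_name  => self.class.name,
    :instance_id => id,
    :method_name => "signal",
    :args        => args,
    :role        => "embedded_ansible",
    :deliver_on  => deliver_on,
    :server_guid => MiqServer.my_server.guid # pins the message to this server,
                                             # not to the worker that queued it
  )
end
```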

I'm sure there are other examples of this kind of behavior, but I don't know where they might be.

It seems like this might be an issue for smartstate if we are mounting something or trying to copy data to/from the local disk. Maybe V2V could also run into issues?

The best idea I currently have for this is to add a worker guid to queue messages when we need the message to be processed by a particular worker instance, and to use deliver_on so the worker waits before consuming the message (think thread context switching). I don't like the idea of adding to the queue, though, so this may not be the best way forward.
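
As a sketch of that idea (hypothetical: MiqQueue has no worker guid column today and it would need to be added; :deliver_on does exist, and current_worker_guid is just a placeholder for however the worker would identify itself):

```ruby
# Hypothetical sketch of pinning a queue message to a worker rather than a
# server. :worker_guid is not a real MiqQueue column today; :deliver_on is.
def queue_signal(*args, deliver_on: nil)
  MiqQueue.put(
    :class_name  => self.class.name,
    :instance_id => id,
    :method_name => "signal",
    :args        => args,
    :worker_guid => current_worker_guid,                  # placeholder for the worker's own guid
    :deliver_on  => deliver_on || 10.seconds.from_now.utc # give that worker time to
                                                          # come back around to the queue
  )
end
```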

From a pods-first viewpoint, I would like to spin up Job instances for this kind of work and then poll for the status of the job. It would be that separate pod's responsibility to take care of most of what we're doing in the workflow class. That way we don't have to worry about separate workers trying to monitor a process. I'm not sure what we would implement on an appliance to fulfill the same use case, though.
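
For the polling side, something like the sketch below is what I have in mind, assuming we'd go through the kubeclient gem; the job name, namespace, and connection details are made up for illustration:

```ruby
require 'kubeclient'

# Sketch only: connect to the batch API and poll a Job until it finishes.
# SSL options are omitted and the names here are illustrative.
token = File.read("/var/run/secrets/kubernetes.io/serviceaccount/token")
batch = Kubeclient::Client.new("https://kubernetes.default.svc/apis/batch", "v1",
                               auth_options: {bearer_token: token})

job = batch.get_job("ansible-playbook-run-123", "manageiq")
until job.status.succeeded.to_i > 0 || job.status.failed.to_i > 0
  sleep(10)
  job = batch.get_job("ansible-playbook-run-123", "manageiq")
end
```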

I made a quick fix to the AnsibleRunnerWorkflow case in #19956, but this bug report is really about the larger issue so we can get input about other features that might be affected.

cc @Fryguy @chessbyte @NickLaMuro

@carbonin carbonin added the bug label Mar 12, 2020
@NickLaMuro
Member

From a pods-first viewpoint, I would like to spin up Job instances for this kind of work

So I think I like this approach in general; I'm just wondering if it makes sense to proactively look at more ways to increase the boot speed for Job pods, since a roughly 10-15 second penalty to boot this into a new process every time just to spin up the required work seems like it wouldn't be scalable long term.

Though, speaking specifically about using this approach for EmbeddedAnsible, only a very small subset of our app is really required to kick off an ansible-runner job (a bit more, I guess, if we are talking about having to run the GitWorktree code and load a few other database records).

Anyway, just the first thing that popped into my head after reading that.

@Fryguy
Member

Fryguy commented Mar 12, 2020

I think the assumption is that ansible-runner itself would be the thing launched as a Kube Job.

In my mind we need a MonitoredProcess abstraction, which can then have different backends for pods (Kube Jobs) and appliances (Processes), and callers can just go through that abstraction. I haven't fully thought it through, though, and I'm not sure how we'd handle things like the SmartState cases.
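
Purely as illustration, that abstraction could look something like this (none of these classes exist; the podified? check and the backends are stand-ins):

```ruby
# Purely illustrative sketch; none of these classes exist in ManageIQ today and
# the podified? check is a stand-in for the real environment detection.
class MonitoredProcess
  def self.backend_class
    podified? ? KubeJobBackend : ProcessBackend
  end

  def self.run(cmd)
    backend_class.new(cmd).tap(&:start)
  end

  def self.podified?
    ENV.key?("KUBERNETES_SERVICE_HOST") # stand-in for the real check
  end
end

# Appliance backend: spawn and monitor a local process.
class ProcessBackend
  def initialize(cmd)
    @cmd = cmd
  end

  def start
    @pid = Process.spawn(*@cmd)
    Process.detach(@pid)
  end

  def running?
    Process.kill(0, @pid)
    true
  rescue Errno::ESRCH
    false
  end
end

# Pods backend: create a Kube Job and watch it via the Kubernetes API
# (left unimplemented in this sketch).
class KubeJobBackend
  def initialize(cmd)
    @cmd = cmd
  end

  def start
    raise NotImplementedError, "create a batch/v1 Job that runs #{@cmd.join(' ')}"
  end

  def running?
    raise NotImplementedError, "poll the Job's status"
  end
end
```

Callers on either platform would then just do something like MonitoredProcess.run(["ansible-runner", "run", ...]) and poll the returned object.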

@NickLaMuro
Member

I think the assumption is that ansible-runner itself would be the thing launched as a Kube Job.

I feel like that is still missing a step, since you need the repo checked out on the same container/volume/whatever, and that isn't part of the CLI call to ansible-runner.

That said, I'm unsure of the capabilities of "Kube Jobs" and whether they allow passing a filesystem of sorts along to the job being executed, so my point might be moot if that is the case.
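
For what it's worth, a Kube Job's pod template can declare volumes and init containers, so one way to get the repo onto the same filesystem as the runner would be an init container that clones it into an emptyDir volume shared with the ansible-runner container. A rough sketch, with image names, repo URL, and paths made up for illustration:

```ruby
# Rough sketch of a batch/v1 Job spec expressed as a Ruby hash: an init
# container clones the repo into an emptyDir volume that the runner container
# mounts. Image names, repo URL, and paths are illustrative only.
job = {
  :apiVersion => "batch/v1",
  :kind       => "Job",
  :metadata   => {:name => "ansible-playbook-run", :namespace => "manageiq"},
  :spec       => {
    :template => {
      :spec => {
        :restartPolicy  => "Never",
        :volumes        => [{:name => "runner-dir", :emptyDir => {}}],
        :initContainers => [{
          :name         => "clone-repo",
          :image        => "example/git:latest",
          :command      => ["git", "clone", "https://example.com/playbooks.git", "/runner/project"],
          :volumeMounts => [{:name => "runner-dir", :mountPath => "/runner"}]
        }],
        :containers => [{
          :name         => "runner",
          :image        => "example/ansible-runner:latest",
          :command      => ["ansible-runner", "run", "/runner", "-p", "playbook.yml"],
          :volumeMounts => [{:name => "runner-dir", :mountPath => "/runner"}]
        }]
      }
    }
  }
}
# e.g. batch_client.create_job(Kubeclient::Resource.new(job))
```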

@carbonin
Member Author

Yeah, at the very start of development of ansible_runner we asked for a container image that would handle checking out the repo and all of that, but I don't think that ever happened. I would imagine that's what we would build for this.
