Container sequences #2551

alexec · 2020-03-30T23:31:28Z

Summary

It should be possible to run multiple steps within the same pod ~~using ephemeral containers~~.

Motivation

Avoids the need to pass artifacts around.

Proposal

TODO

Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

tvalasek · 2020-04-14T00:59:34Z

Would this also avoid the need to pass output parameters around?

simster7 · 2020-04-14T01:10:09Z

Yes! That's the main advantage

simster7 · 2020-04-15T17:02:11Z

The main advantage of this feature would be to avoid passing artifacts using an external provider between different tasks in a Workflow, when the intermediary artifacts can be discarded after use.

To achieve this, we would make use of ephemeral containers in K8s. The idea is that the controller would create and remove ephemeral containers in a single pod, allowing them all to use the same filesystem.

I envision something like a steps template:

- name: sequence
  sequence:
    - - name: create-artifact
        template: gen-data
    - - name: consume-artifact
        template: process-data

- name: gen-data
  container:
    ...
  outputs:
    artifacts:
      file: ...

- name: process-data
  inputs:
    artifacts:
      file: ...
  container:
    ...

Ideally, users would simply be able to rename steps to sequence in order to leverage this feature. The controller would only need the existing inputs/outputs already found in templates to achieve this.

NOTE: This feature is still only an idea: we're about to start creating a PoC to see just how viable it is. Nothing is set in stone (not even the name sequence) and I expect this to change as we learn more about the limitations/features of this. All feedback is welcome at this time!

ddseapy · 2020-04-15T21:47:27Z

Seems like a great idea and very useful. Just a couple thoughts:

Ephemeral containers are in alpha and have a lot of downsides (https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages)
The docs suggest the use case is generally things like debugging running containers. I wonder if there will be any gotchas with using a lot of them for heavier tasks like data processing.

For Argo on production clusters, this capability might not be exercised for a while. That said, the benefits might outweigh the risks for certain use cases.

simster7 · 2020-04-19T18:05:36Z

You are very much correct @ddseapy. We are definitely treating this as an experimental feature

simster7 · 2020-06-02T14:40:39Z

An update on this: K8s places some limitations on this feature. Mainly, individual ephemeral containers in a Pod cannot be replaced or modified; only the entire list of ephemeral containers can be replaced in one operation. Given that, we don't think this feature as described is currently feasible.

However, I'll investigate if we can take advantage of this feature for other purposes, such as a streamlined "Retry" node that performs its retries on the same Pod, saving the need to create new ones and download artifacts every time.

alexec · 2020-06-29T23:35:49Z

@simster7 could you please close this issue, since this feature is not possible, and open a new issue for "in-place retries" so that the 👍 count reflects the popularity of that issue?

alexec · 2020-07-13T19:10:45Z

@simster7 bump!

simster7 · 2020-07-14T18:26:26Z

Closing this as it is currently implausible. Related: #3475

alexec · 2021-01-18T19:44:39Z

Sequenced Containers the Tekton Way

Similar to how Tekton does it:

https://github.com/tektoncd/pipeline/tree/master/cmd/entrypoint

How this works:

The pod has a volume shared with all containers.
An init container copies a binary to that volume.
That binary replaces the original command, running the original command as a sub-process.
Before it starts the sub-process, it waits for a specific file to appear. This file is created by another container when it believes that container is ready to start.
When the sub-process completes, another file is written with the outputs.
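The steps above can be sketched roughly as follows. This is only an illustration of the wait-then-run pattern, not Tekton's actual entrypoint code: the function name `run_when_ready`, its parameters, and the format of the output file are all assumptions made for this example.

```python
import subprocess
import time
from pathlib import Path


def run_when_ready(command, wait_file, done_file, poll_seconds=0.1, timeout=None):
    """Wait for wait_file to appear, run command, then write done_file.

    Hypothetical helper: the names and the file format are illustrative only.
    """
    deadline = None if timeout is None else time.monotonic() + timeout

    # Block until another container signals that this step may start.
    while not Path(wait_file).exists():
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError(f"{wait_file} did not appear")
        time.sleep(poll_seconds)

    # Run the original command as a sub-process and capture its stdout.
    result = subprocess.run(command, capture_output=True, text=True)

    # Write to a temporary file and rename it, so a container waiting on
    # done_file never observes a partially written file.
    done = Path(done_file)
    tmp = done.with_name(done.name + ".tmp")
    tmp.write_text(f"{result.returncode}\n{result.stdout}")
    tmp.replace(done)
    return result.returncode
```

In this sketch, step N would wait on the file written by step N-1, so the containers run one after another even though they all start together with the pod.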

How could workflows use this?

Simpler and more powerful executor:

As the binary runs in the same process namespace as the sub-process, it can easily copy inputs and capture outputs without any of the magic that container runtime executors need to use. Specifically, this would work very well with runAsNonRoot.

This also removes the need for a wait container. This would reduce costs.

See #4186

Many steps within a pod:

This model would allow the wait process to read the state of the workflow from the shared volume. It could effectively execute an entire workflow within a single pod.

However, this has some scaling issues. We could not run a 1000 step workflow like this. Because each container must be spun up to wait, there will be many cases where we're consuming resources, but doing no useful work.
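A back-of-the-envelope model can make the scaling concern concrete. It assumes a strictly sequential workflow, steps of equal duration, every container starting with the pod, and each container exiting as soon as its own step finishes. None of these assumptions come from the proposal itself, and `useful_fraction` is a hypothetical name.

```python
def useful_fraction(steps, step_seconds=1.0):
    """Fraction of total container-alive time spent doing real work.

    Assumes (for illustration only) that all containers start together and
    that container i waits for the i-1 earlier steps before running its own.
    """
    # Container i (1-based) is alive for i * step_seconds in total.
    alive = sum(i * step_seconds for i in range(1, steps + 1))
    # Each container does exactly one step of useful work.
    useful = steps * step_seconds
    return useful / alive


# Under these assumptions the fraction is 2 / (steps + 1).
print(useful_fraction(3))     # 0.5
print(useful_fraction(1000))  # roughly 0.002
```

A waiting container uses little CPU, but it still holds whatever resource requests it was scheduled with, which is where the cost comes from in this model.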

See #2551

There are some really interesting challenges around how pods would report status back to the workflow here. We'd need to multiplex it, so we might want to address this at the same time as #3961.

Signed-off-by: Alex Collins <alex_collins@intuit.com>

alexec added the type/feature Feature request label Mar 30, 2020

alexec added this to the v2.9 milestone Mar 30, 2020

simster7 self-assigned this Mar 30, 2020

alexec modified the milestones: v2.9, v2.10 May 28, 2020

alexec removed this from the v2.10 milestone Jun 26, 2020

simster7 removed their assignment Jul 10, 2020

simster7 self-assigned this Jul 13, 2020

simster7 mentioned this issue Jul 14, 2020

In-place retries using ephemeral containers #3475

Open

simster7 closed this as completed Jul 14, 2020

alexec reopened this Jan 18, 2021

alexec mentioned this issue Jan 18, 2021

Sequenced containers within a single pod #4897

Closed

alexec added epic/scaling labels Jan 18, 2021

alexec mentioned this issue Jan 18, 2021

Volume-based artifact passing system #1349

Open

alexec assigned alexec and unassigned simster7 Feb 1, 2021

This was referenced Mar 2, 2021

feat(controller): Container set template. Closes #2551 #5099

Merged

container set next steps: enhanced depends logic #5281

Open

container set next steps: container replicas #5282

Closed

rushtehrani mentioned this issue Mar 4, 2021

Support for Passing Data among Argo Jobs onepanelio/onepanel#612

Open

alexec closed this as completed in #5099 Mar 4, 2021

alexec added a commit that referenced this issue Mar 4, 2021

feat(controller): Container set template. Closes #2551 (#5099)

8729587

Signed-off-by: Alex Collins <alex_collins@intuit.com>

This was referenced Mar 8, 2021

v2.12.10 cherry-pick #5326

Closed

v3.0.0-rc5 cherry-pick #5327

Closed

alexec mentioned this issue Jul 22, 2021

invalid container names (i.e. not "main") are allowed and causes bugs #6405

Closed

agilgur5 added the area/templates/container-set label Oct 16, 2024
