Ensure single source of truth for generated values. #10267
Comments
@isubasinghe can we have a holistic approach to solve the POD_NAME issue for all scenarios? Can you propose some solutions?
@sarabala1979 we really need some data store. ConfigMaps can be used, but they obviously have the 1MB limit imposed by etcd. Even dealing with the ConfigMap limit is still a better option than what we are doing now, because there are quite a lot of bugs relating to the node names alone. Then there is also the mutex issue, which is ultimately the same problem. Either we make MySQL or Postgres mandatory and rely on that to store everything, or we use an embedded store like Badger. Given the way Badger works and the way our workflows work, I do not foresee any scaling problems running it as an embedded datastore. I personally like the Badger option because it doesn't require any architectural changes or new requirements, and I am fairly experienced with these key-value datastores, having used RocksDB in the past when building a custom database. That is my conclusion: use Badger or Postgres/MySQL and store everything we need in it. I do not think patching these bugs as they appear is a good solution; we need to extinguish this problem at the source.
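For reference, a minimal sketch of what the embedded-Badger option could look like. The key layout and values here are hypothetical and only illustrate writing a generated value once and reading it back instead of re-deriving it each time:

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v3"
)

func main() {
	// Open an embedded Badger store; no external service is required.
	db, err := badger.Open(badger.DefaultOptions("/tmp/argo-badger"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hypothetical key layout: record a generated pod name exactly once.
	key := []byte("workflow/my-wf/node/my-wf-1234567890/podname")

	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set(key, []byte("my-wf-template-1234567890"))
	}); err != nil {
		log.Fatal(err)
	}

	// Later reads return the stored value instead of re-generating it.
	if err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err
		}
		return item.Value(func(val []byte) error {
			fmt.Printf("pod name: %s\n", val)
			return nil
		})
	}); err != nil {
		log.Fatal(err)
	}
}
```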
Also pinging @terrytangyuan and @juliev0 so that they can have a read.
Actually, correct me if I am wrong about Postgres/MySQL not being mandatory; if it is mandatory, it makes more sense to just use that.
Can you summarize the mutex issues? Is the node id/name mismatch causing the problem, or is it something else?
It's only required when persistence/workflow archiving is enabled.
Could you clarify what you meant by this? Any context?
We should stay away from any external dependencies like a DB unless really necessary, and leverage cloud-native solutions whenever possible.
So, this is related to re-generating information over and over, right? Is there a reason we prefer a ConfigMap over using the Workflow |
@sarabala1979 this gives an example of why mutex release fails: https://github.com/argoproj/argo-workflows/blob/49865b1783b481ba7600441999559821d1a03a18/docs/proposals/mutex-improvements.md I do not think the fix is that easy: the information available on acquire is different from the information available on release, which is why two functions exist for getting the key. The two functions happen to generate different keys, so you end up with the bug. We can probably use a ConfigMap for this particular issue, because the keys aren't going to be large and will be released after the workflow completes. Additionally, a workflow being paused because it is waiting for space in the ConfigMap is probably a better scenario than the mutex not really working. Also see the linked doc, @terrytangyuan; it shows an example of the mutex failing and should be easy to understand.
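To make the mismatch concrete, here is a purely hypothetical sketch; the function names and key formats are illustrative, not the actual Argo code. It only shows how deriving the key from different inputs on acquire and release can produce keys that never match:

```go
package main

import "fmt"

// Hypothetical: on acquire we have the workflow and the node, so the key
// includes the node ID.
func holderKeyOnAcquire(namespace, workflowName, nodeID string) string {
	return fmt.Sprintf("%s/%s/%s", namespace, workflowName, nodeID)
}

// Hypothetical: on release only the workflow is readily available, so the
// key is derived differently and no longer matches the acquire-time key.
func holderKeyOnRelease(namespace, workflowName string) string {
	return fmt.Sprintf("%s/%s", namespace, workflowName)
}

func main() {
	acquire := holderKeyOnAcquire("argo", "my-wf", "my-wf-1234567890")
	release := holderKeyOnRelease("argo", "my-wf")
	// The two keys differ, so the release never finds the holder recorded
	// at acquire time and the mutex is never freed.
	fmt.Println(acquire == release) // false
}
```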
@terrytangyuan I do know a workaround for etcd that would allow us to store items of any size in it, though it would require additional work on my end. We could essentially break a single entry into multiple keys and use etcd directly; this is safe because etcd supports transactional writes/reads. It obviously increases the maintenance cost on our end.
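A rough sketch of that workaround, assuming direct access to etcd through the official clientv3 package. The key scheme and chunk size are made up for illustration:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const chunkSize = 512 * 1024 // illustrative chunk size

// putChunked splits a large value into several keys and writes them in a
// single transaction, so readers never observe a partially written entry.
// Note: a transaction is itself bounded by etcd's max request size and op
// count, so truly huge values would still need multiple transactions plus
// a final marker key.
func putChunked(ctx context.Context, cli *clientv3.Client, key string, value []byte) error {
	var ops []clientv3.Op
	for i := 0; len(value) > 0; i++ {
		n := chunkSize
		if n > len(value) {
			n = len(value)
		}
		ops = append(ops, clientv3.OpPut(fmt.Sprintf("%s/chunk-%06d", key, i), string(value[:n])))
		value = value[n:]
	}
	_, err := cli.Txn(ctx).Then(ops...).Commit()
	return err
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Hypothetical key: store a ~1MB status blob split across two keys.
	if err := putChunked(context.Background(), cli, "workflows/my-wf/status", make([]byte, 1024*1024)); err != nil {
		log.Fatal(err)
	}
}
```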
@juliev0 Yeah, I believe this is correct: we generate information over and over to avoid dealing with the ConfigMap limitation (1MB).
As a concluding remark: I think we can improve the current re-generation of facts and reduce the number of these types of bugs. However, I do not think we can reduce them to absolute zero; for example, the TypeScript and Go code each have their own implementation of node names. We can eliminate this class of bug from the project entirely if we store these values somewhere, and to me that is the superior solution, despite it having downsides of its own (especially in the context of multi-zone deployments of Kubernetes). It's better to tackle the root problem, because I think it will bring greater stability to the project; fixing these sorts of bugs is frankly quite demoralizing (for me at least). And I am not sure the advantages brought by being stateless are worth the costs of being stateless; as evidence I will point to this issue: #8684
I agree with this. For pod names, we should store them in The pod name can be fixed when the This has been the cause of a number of (admittedly minor) annoying issues.
@JPZ13 do you want to create a separate issue for pod names? |
@alexec We're going to have @isubasinghe handle it holistically once he gets back from vacation |
Adding my comment here from the other issue:
On another note, I suggest using labels to store pod <-> workflow metadata instead of trying to generate a name that breaks so many scenarios in and out of Argo. Labels would be a cleaner approach and would still be searchable by anyone using Argo, both via the CLI and the UI.
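As an illustration of the label-based approach, a workflow's pods could be looked up with a label selector rather than by reconstructing a name. The label key below is assumed to be the one Argo applies to workflow pods, and the namespace and workflow name are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// In-cluster config; assumes this runs inside a pod with RBAC to list pods.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Find pods by workflow label instead of reconstructing the pod name.
	pods, err := clientset.CoreV1().Pods("argo").List(context.Background(), metav1.ListOptions{
		LabelSelector: "workflows.argoproj.io/workflow=my-wf",
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.Name)
	}
}
```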
@aaron-arellano could you post the pod name to host name mappings that you're seeing? I've also sent you an email and copied @isubasinghe. If you would prefer to keep those values private, you can share them with us via email. This will help clue us in to concatenation issues.
So here is an example of the error I saw with the v2 pod name. You can see in the attached log file that we call the Kubernetes Python library function read_namespaced_pod. In the workflow itself we call the code below:
```python
# HOSTNAME is set by Kubernetes to the (possibly truncated) pod name;
# self.v1 is presumably a kubernetes.client.CoreV1Api instance
current_pod_name = os.environ.get("HOSTNAME", None)
current_pod_namespace = os.environ.get("POD_NAMESPACE", None)
if current_pod_name and current_pod_namespace:
    return self.v1.read_namespaced_pod(
        namespace=current_pod_namespace, name=current_pod_name
    )
```
HOSTNAME, as you know, comes from the default Kubernetes container environment, which sets a container's hostname to the name of the pod hosting it, ref <https://kubernetes.io/docs/concepts/containers/container-environment/#container-information>. With pod name v1 everything works fine, but with the v2 name this fails with the logs attached. The actual pod name was aip-toleration-integration-test-zlntf-nvidia-tesla-v100-4115184276, yet when getting HOSTNAME from within the v2-name pod we see that the last 3 chars of the real pod name are not there. This consistently happens when running read_namespaced_pod in a workflow pod generated under the v2 name feature. Again, when I revert POD_NAMES in the workflow-controller and argo-server to v1, this scenario works fine.
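As a possible mitigation while the naming itself is being fixed, here is a minimal Go sketch, from the controller's point of view, of exposing the exact pod name and namespace to the user container via the Kubernetes downward API instead of relying on HOSTNAME. The POD_NAME variable name is hypothetical, not something Argo necessarily sets; this illustrates the workaround, not Argo's actual behaviour:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// buildPodNameEnv returns env vars that expose the exact pod name and
// namespace via the downward API, so user code does not have to rely on
// HOSTNAME, which can be a truncated form of the pod name.
func buildPodNameEnv() []corev1.EnvVar {
	return []corev1.EnvVar{
		{
			Name: "POD_NAME", // hypothetical variable name
			ValueFrom: &corev1.EnvVarSource{
				FieldRef: &corev1.ObjectFieldSelector{FieldPath: "metadata.name"},
			},
		},
		{
			Name: "POD_NAMESPACE",
			ValueFrom: &corev1.EnvVarSource{
				FieldRef: &corev1.ObjectFieldSelector{FieldPath: "metadata.namespace"},
			},
		},
	}
}

func main() {
	for _, e := range buildPodNameEnv() {
		fmt.Println(e.Name, "<-", e.ValueFrom.FieldRef.FieldPath)
	}
}
```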
Summary
We need to ensure that when we generate facts, say for example the pod name, we only do so once.
Use Cases
This would significantly reduce the number of bugs we have.
For example, when investigating this issue: #10124
Something I noticed was that the output of
./argo get -json
and the UI were slightly different. I've had a similar problem with a mutex issue, linked here.
Regenerating facts every time we need them has been quite problematic in my opinion, and we should refactor the codebase to eliminate this behaviour. I believe doing so will eliminate an entire class of bugs from our codebase.