
Ensure single source of truth for generated values. #10267

Open

isubasinghe opened this issue Dec 21, 2022 · 18 comments

Comments

@isubasinghe
Member

isubasinghe commented Dec 21, 2022

Summary

We need to ensure that when we generate facts, for example the pod name, we only do so once.

Use Cases

This would significantly reduce the number of bugs we have.
For example, when investigating this issue: #10124

Something I noticed was that the outputs of ./argo get -json and the UI were slightly different.

I've had a similar problem with a Mutex issue linked here.

Regenerating facts every time we need them has been quite problematic in my opinion, and we should refactor the codebase to eliminate this behaviour. I believe doing so will remove an entire class of bugs from our codebase.

@sarabala1979
Member

@isubasinghe can we have a holistic approach to solving the POD_NAME issue for all scenarios? Can you propose solutions?

@terrytangyuan
Member

Related issues: #7646, #9906, #10107, #10124, #11250

@isubasinghe
Member Author

isubasinghe commented Jun 22, 2023

@sarabala1979 we really need some data store; ConfigMaps can be used, but they obviously have the 1MB limit imposed by etcd.

Even dealing with the ConfigMap limit is still a better option than what we are doing now, because there are quite a lot of bugs relating to the node names alone.

Then there is also the mutex issue, which is ultimately the same problem.

Either we make MySQL or Postgres mandatory and rely on that to store everything, or we use something like https://github.com/dgraph-io/badger running as an embedded data store on the controller side.

Given the way badger works and the way our workflows work, I do not foresee any scaling problems running it as an embedded datastore.

I personally like the badger option because it doesn't require any architectural changes or new requirements, and I am fairly experienced with these key-value datastores, having used RocksDB in the past when building a custom database.

That is my conclusion: use badger or (Postgres|MySQL) and store everything we need in it.

I do not think patching these bugs as they appear is a good solution; we need to extinguish this problem at the source.
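
To make the badger suggestion concrete, here is a minimal sketch of how a generated value such as a pod name could be written once and read back from an embedded badger store. The key layout and helper names are invented for illustration; this is not a proposal for the actual controller code.

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v3"
)

// storePodName persists a generated pod name exactly once, keyed by
// workflow name and node ID (hypothetical key layout for illustration).
func storePodName(db *badger.DB, workflow, nodeID, podName string) error {
	key := []byte(fmt.Sprintf("podname/%s/%s", workflow, nodeID))
	return db.Update(func(txn *badger.Txn) error {
		// Only write if the key does not exist yet, so the value is
		// generated a single time and never re-derived.
		if _, err := txn.Get(key); err == nil {
			return nil
		} else if err != badger.ErrKeyNotFound {
			return err
		}
		return txn.Set(key, []byte(podName))
	})
}

// loadPodName returns the previously stored pod name.
func loadPodName(db *badger.DB, workflow, nodeID string) (string, error) {
	key := []byte(fmt.Sprintf("podname/%s/%s", workflow, nodeID))
	var podName string
	err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err
		}
		return item.Value(func(val []byte) error {
			podName = string(val)
			return nil
		})
	})
	return podName, err
}

func main() {
	// An embedded store on the controller's local disk.
	db, err := badger.Open(badger.DefaultOptions("/tmp/argo-facts"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	_ = storePodName(db, "my-wf", "my-wf-1234567890", "my-wf-main-1234567890")
	name, _ := loadPodName(db, "my-wf", "my-wf-1234567890")
	fmt.Println(name)
}
```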

@isubasinghe
Member Author

Also pinging @terrytangyuan and @juliev0 so that they can have a read.

@isubasinghe
Member Author

Actually, correct me if I am wrong about Postgres/MySQL not being mandatory; if it is mandatory, it makes more sense to just use that.

@sarabala1979
Member

> @sarabala1979 we really need some data store; ConfigMaps can be used, but they obviously have the 1MB limit imposed by etcd.
>
> Even dealing with the ConfigMap limit is still a better option than what we are doing now, because there are quite a lot of bugs relating to the node names alone.
>
> Then there is also the mutex issue, which is ultimately the same problem.
>
> Either we make MySQL or Postgres mandatory and rely on that to store everything, or we use something like https://github.com/dgraph-io/badger running as an embedded data store on the controller side.
>
> Given the way badger works and the way our workflows work, I do not foresee any scaling problems running it as an embedded datastore.
>
> I personally like the badger option because it doesn't require any architectural changes or new requirements, and I am fairly experienced with these key-value datastores, having used RocksDB in the past when building a custom database.
>
> That is my conclusion: use badger or (Postgres|MySQL) and store everything we need in it.
>
> I do not think patching these bugs as they appear is a good solution; we need to extinguish this problem at the source.

Can you summarize the Mutex issues? Is the node ID/name mismatch causing them, or is it something else?
I believe that if we fix the node ID/name issue, that will fix all the mutex and status mismatch issues.

@terrytangyuan
Member

> Actually, correct me if I am wrong about Postgres/MySQL not being mandatory; if it is mandatory, it makes more sense to just use that.

It's only required when persistence/workflow archiving is enabled.

> Then there is also the mutex issue, which is ultimately the same problem.

Could you clarify what you meant by this? Any context?

> I personally like the badger option because it doesn't require any architectural changes or new requirements, and I am fairly experienced with these key-value datastores, having used RocksDB in the past when building a custom database.

We should stay away from any external dependencies like a DB unless really necessary, and leverage cloud-native solutions whenever possible.

@juliev0
Contributor

juliev0 commented Jun 23, 2023

So, this is related to re-generating information over and over, right? Is there a reason we prefer a ConfigMap over using the Workflow Status? Is it because the information is too verbose and more likely to hit the size limit?

@isubasinghe
Member Author

@sarabala1979 this gives an example of why mutex release fails: https://github.com/argoproj/argo-workflows/blob/49865b1783b481ba7600441999559821d1a03a18/docs/proposals/mutex-improvements.md

I do not think the fix is that easy: the information available on acquire is different from the information available on release, which is why two functions exist for getting the key. The two functions happen to generate different keys, so you end up with the bug.

We probably can use a ConfigMap for this particular issue because the keys aren't going to be large and they will be released after the workflow is completed. Additionally, a workflow being paused because it is waiting for space in the ConfigMap is probably a better scenario than the mutex not really working.

Also see the linked doc, @terrytangyuan; it shows an example of the mutex failing and should be easy to understand.
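
For readers skimming the thread, here is a contrived sketch of the failure mode described above: acquire and release derive the lock key from different inputs, so the release path can compute a key that was never held. The function names and key formats are invented for illustration and are not the actual semaphore code.

```go
package main

import "fmt"

// keyOnAcquire is built from information available when the lock is taken
// (say, the full node name).
func keyOnAcquire(namespace, workflow, nodeName string) string {
	return fmt.Sprintf("%s/%s/%s", namespace, workflow, nodeName)
}

// keyOnRelease is built from the information available at release time
// (say, a regenerated node name), and can silently diverge.
func keyOnRelease(namespace, workflow, regeneratedNodeName string) string {
	return fmt.Sprintf("%s/%s/%s", namespace, workflow, regeneratedNodeName)
}

func main() {
	held := map[string]bool{}

	acquireKey := keyOnAcquire("argo", "my-wf", "my-wf.step-one")
	held[acquireKey] = true

	// Release regenerates the node name and gets a slightly different value,
	// so the lock entry is never found and never released.
	releaseKey := keyOnRelease("argo", "my-wf", "my-wf.step-one(0)")
	if !held[releaseKey] {
		fmt.Println("release failed: key", releaseKey, "was never acquired")
	}
}
```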

@isubasinghe
Member Author

> We should stay away from any external dependencies like a DB unless really necessary, and leverage cloud-native solutions whenever possible.

@terrytangyuan
I generally agree with this, but I think it is a tradeoff: storing items will give higher confidence than generating them.
I prefer the higher confidence even though I know introducing any sort of persistence on k8s is asking for pain (especially in the context of multi-zone clusters).

I do know a workaround for etcd that will allow us to store items of any size on it, although it will require additional work on my end. We can essentially break a single entry into multiple keys and use etcd directly; this is safe because etcd supports transactional writes/reads. Obviously this increases the maintenance cost on our end.
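
To illustrate the workaround mentioned above, here is a rough sketch using the stock etcd clientv3 package: one large entry is split across several keys and committed in a single transaction. The key scheme and chunk size are assumptions, and the transaction as a whole would still need to fit within etcd's request-size limit.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const chunkSize = 512 * 1024 // assumed chunk size, well under etcd's per-key limits

// putChunked writes value under prefix as prefix/0, prefix/1, ... plus a
// prefix/count key, all in one transaction so readers never see a partial write.
func putChunked(ctx context.Context, cli *clientv3.Client, prefix string, value []byte) error {
	var ops []clientv3.Op
	n := 0
	for off := 0; off < len(value); off += chunkSize {
		end := off + chunkSize
		if end > len(value) {
			end = len(value)
		}
		ops = append(ops, clientv3.OpPut(fmt.Sprintf("%s/%d", prefix, n), string(value[off:end])))
		n++
	}
	ops = append(ops, clientv3.OpPut(prefix+"/count", fmt.Sprintf("%d", n)))
	_, err := cli.Txn(ctx).Then(ops...).Commit()
	return err
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := putChunked(ctx, cli, "argo/facts/my-wf", make([]byte, 3*chunkSize)); err != nil {
		panic(err)
	}
}
```

Reading the value back would mean fetching prefix/count and then the chunk keys, ideally at a single revision, which is where the extra maintenance cost comes in.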

@isubasinghe
Member Author

@juliev0 Yeah, I believe this is correct: we generate information over and over to avoid dealing with the ConfigMap limitation (1MB).

@isubasinghe
Member Author

isubasinghe commented Jun 24, 2023

As a concluding remark: I think we can improve the current re-generation of facts and reduce the number of these types of bugs.

However, I do not think we can reduce these bugs to absolute zero; for example, the TypeScript and Go codebases have their own implementations of node names.

We can eliminate this class of bugs from the project entirely if we store these values somewhere; to me that is the superior solution, despite it having many downsides as well (especially in the context of multi-zone deployments of k8s).

It's better to tackle the root problem because I think it will bring greater stability to the project; fixing these sorts of bugs is frankly quite demoralizing (for me at least). And I am not sure the advantages brought by being stateless are worth the costs of being stateless; as evidence I will point to this issue: #8684

@alexec
Contributor

alexec commented Jul 11, 2023

I agree with this. For pod names, we should store them in status.node[*].podName rather than looking at the annotation.

The pod name can be fixed when the NodeStatus is created. There is only one piece of code that does this, so it will be quite robust.

This has been the cause of a number of (admittedly minor but annoying) issues.
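
A minimal sketch of the idea described above, assuming a podName field were added to the node status: compute the name once where the node is created, record it, and have every consumer (UI, CLI, pod lookup) read the stored value instead of re-deriving it. The types and helper below are simplified stand-ins, not the actual controller code.

```go
package main

import "fmt"

// NodeStatus is a simplified stand-in for the workflow node status; the
// PodName field is the proposed addition (the single source of truth).
type NodeStatus struct {
	ID      string
	Name    string
	PodName string
}

// generatePodName stands in for whatever naming scheme is in effect (v1 or v2).
func generatePodName(workflowName, nodeID string) string {
	return fmt.Sprintf("%s-%s", workflowName, nodeID)
}

// newNode is the single place where the pod name is computed and recorded.
func newNode(workflowName, nodeID, nodeName string) NodeStatus {
	return NodeStatus{
		ID:      nodeID,
		Name:    nodeName,
		PodName: generatePodName(workflowName, nodeID),
	}
}

func main() {
	nodes := map[string]NodeStatus{}
	n := newNode("my-wf", "my-wf-1234567890", "my-wf.main")
	nodes[n.ID] = n

	// Consumers read the stored value rather than regenerating it.
	fmt.Println("pod to fetch logs from:", nodes["my-wf-1234567890"].PodName)
}
```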

@alexec
Contributor

alexec commented Jul 11, 2023

@JPZ13 do you want to create a separate issue for pod names?

@JPZ13
Member

JPZ13 commented Jul 11, 2023

@alexec We're going to have @isubasinghe handle it holistically once he gets back from vacation.

@aaron-arellano

aaron-arellano commented Aug 21, 2023

Adding my comment here from the other issue:

We use Argo heavily at ZG, and when upgrading to v3.4 of Argo Workflows we noticed breaking changes this causes beyond the nodeId in the workflow. The v2 naming convention also breaks the upstream k8s HOSTNAME env variable for the pod. For instance, when we get the HOSTNAME in the workflow pod and run the Kubernetes API call get_namespace_pod with that HOSTNAME, the pod name returned from the Kubernetes API server does not match the pod name in the actual pod's metadata.name. I'm not sure if there is some weird concatenation going on that is not persisting to etcd, but the Downward API does not match what is in etcd. I reverted the env var POD_NAMES to v1 and everything works in v3.4. I feel that, with all these bugs, the v2 pod name should be reverted, because the scope of breaking changes extends beyond Argo itself and into Kubernetes.

On another note, I suggest using labels to store pod <-> workflow metadata instead of trying to generate a name that breaks so many scenarios in and out of Argo. Labels would be a cleaner approach and would still be searchable by anyone using Argo, via either the CLI or the UI.
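
As a rough illustration of the label-based approach: rather than reconstructing a generated name, pods can be listed by label with client-go, assuming the workflow pods carry the workflows.argoproj.io/workflow label the controller applies; the namespace and workflow name below are placeholders.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Look pods up by the workflow label instead of guessing the generated name.
	pods, err := clientset.CoreV1().Pods("argo").List(context.TODO(), metav1.ListOptions{
		LabelSelector: "workflows.argoproj.io/workflow=my-wf",
	})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.Name)
	}
}
```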

@JPZ13
Member

JPZ13 commented Aug 22, 2023

@aaron-arellano could you post the pod-name-to-hostname mappings that you're seeing? I've also sent you an email and copied @isubasinghe. If you would prefer to keep those values private, you can share them with us via email. This will help clue us in to concatenation issues.

@aaron-arellano

aaron-arellano commented Aug 23, 2023 via email
