Ensure single source of truth for generated values. #10267
Comments
@isubasinghe can we have a holistic approach to solve the POD_NAME issue for all scenarios? Can you propose some solutions?
@sarabala1979 we really need some data store. ConfigMaps can be used, but they obviously have the 1MB limit imposed by etcd. Even dealing with the ConfigMap limit is still a better option than what we are doing now, because there are quite a lot of bugs relating to the node names alone. Then there is also the mutex issue, which is ultimately the same problem. Either we make MySQL or Postgres mandatory and rely on that to store everything, or we use an embedded store like Badger. Given the way Badger works and the way our workflows work, I do not foresee any scaling problems running it as an embedded datastore. I personally like the Badger option because it doesn't require any architectural changes or new requirements, and I am fairly experienced with these key-value datastores, having used RocksDB in the past when building a custom database. That is my conclusion: use Badger or Postgres/MySQL and store everything we need in it. I do not think patching these bugs as they appear is a good solution; we need to extinguish this problem at the source.
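For reference, a minimal sketch of what the embedded-Badger option could look like. The key layout and values here are hypothetical and only illustrate writing a generated value once and reading it back instead of re-deriving it each time:

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v3"
)

func main() {
	// Open an embedded Badger store; no external service is required.
	db, err := badger.Open(badger.DefaultOptions("/tmp/argo-badger"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hypothetical key layout: record a generated pod name exactly once.
	key := []byte("workflow/my-wf/node/my-wf-1234567890/podname")

	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set(key, []byte("my-wf-template-1234567890"))
	}); err != nil {
		log.Fatal(err)
	}

	// Later reads return the stored value instead of re-generating it.
	if err := db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(key)
		if err != nil {
			return err
		}
		return item.Value(func(val []byte) error {
			fmt.Printf("pod name: %s\n", val)
			return nil
		})
	}); err != nil {
		log.Fatal(err)
	}
}
```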
Also pinging @terrytangyuan and @juliev0 so that they can have a read.
Actually, correct me if I am wrong about Postgres/MySQL not being mandatory; if it is mandatory, it makes more sense to just use that.
Can you summarize the mutex issues? Is the node id/name mismatch causing the problem, or is it something else?
It's only required when persistence/workflow archiving is enabled.
Could you clarify what you meant by this? Any context?
We should stay away from any external dependencies like a DB unless really necessary, and leverage cloud-native solutions whenever possible.
So, this is related to re-generating information over and over, right? Is there a reason we prefer a ConfigMap over using the Workflow |
@sarabala1979 this gives an example of why mutex release fails: https://github.com/argoproj/argo-workflows/blob/49865b1783b481ba7600441999559821d1a03a18/docs/proposals/mutex-improvements.md I do not think the fix is that easy: the information available on acquire is different from the information available on release, which is why two functions exist for getting the key. The two functions happen to generate different keys, so you end up with the bug. We can probably use a ConfigMap for this particular issue, because the keys aren't going to be large and will be released after the workflow completes. Additionally, a workflow being paused because it is waiting for space in the ConfigMap is probably a better scenario than the mutex not really working. Also see the linked doc, @terrytangyuan; it shows an example of the mutex failing and should be easy to understand.
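To make the mismatch concrete, here is a purely hypothetical sketch; the function names and key formats are illustrative, not the actual Argo code. It only shows how deriving the key from different inputs on acquire and release can produce keys that never match:

```go
package main

import "fmt"

// Hypothetical: on acquire we have the workflow and the node, so the key
// includes the node ID.
func holderKeyOnAcquire(namespace, workflowName, nodeID string) string {
	return fmt.Sprintf("%s/%s/%s", namespace, workflowName, nodeID)
}

// Hypothetical: on release only the workflow is readily available, so the
// key is derived differently and no longer matches the acquire-time key.
func holderKeyOnRelease(namespace, workflowName string) string {
	return fmt.Sprintf("%s/%s", namespace, workflowName)
}

func main() {
	acquire := holderKeyOnAcquire("argo", "my-wf", "my-wf-1234567890")
	release := holderKeyOnRelease("argo", "my-wf")
	// The two keys differ, so the release never finds the holder recorded
	// at acquire time and the mutex is never freed.
	fmt.Println(acquire == release) // false
}
```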
@terrytangyuan I do know a workaround for etcd that would allow us to store items of any size in it, though it would require additional work on my end. We could essentially break a single entry into multiple keys and use etcd directly; this is safe because etcd supports transactional writes/reads. It obviously increases the maintenance cost on our end.
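A rough sketch of that workaround, assuming direct access to etcd through the official clientv3 package. The key scheme and chunk size are made up for illustration:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const chunkSize = 512 * 1024 // illustrative chunk size

// putChunked splits a large value into several keys and writes them in a
// single transaction, so readers never observe a partially written entry.
// Note: a transaction is itself bounded by etcd's max request size and op
// count, so truly huge values would still need multiple transactions plus
// a final marker key.
func putChunked(ctx context.Context, cli *clientv3.Client, key string, value []byte) error {
	var ops []clientv3.Op
	for i := 0; len(value) > 0; i++ {
		n := chunkSize
		if n > len(value) {
			n = len(value)
		}
		ops = append(ops, clientv3.OpPut(fmt.Sprintf("%s/chunk-%06d", key, i), string(value[:n])))
		value = value[n:]
	}
	_, err := cli.Txn(ctx).Then(ops...).Commit()
	return err
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Hypothetical key: store a ~1MB status blob split across two keys.
	if err := putChunked(context.Background(), cli, "workflows/my-wf/status", make([]byte, 1024*1024)); err != nil {
		log.Fatal(err)
	}
}
```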
@juliev0 Yeah, I believe this is correct: we generate information over and over to avoid dealing with the ConfigMap limitation (1MB).
As a concluding remark: I think we can improve the current re-generation of facts and reduce the number of these types of bugs. However, I do not think we can reduce them to absolute zero; for example, the TypeScript and Go code each have their own implementation of node names. We can eliminate this class of bug from the project entirely if we store these values somewhere, and to me that is the superior solution, despite it having downsides of its own (especially in the context of multi-zone deployments of Kubernetes). It's better to tackle the root problem, because I think it will bring greater stability to the project; fixing these sorts of bugs is frankly quite demoralizing (for me at least). And I am not sure the advantages brought by being stateless are worth the costs of being stateless; as evidence I will point to this issue: #8684
I agree with this. For pod names, we should store them in The pod name can be fixed when the This has been the cause of a number of (admittedly minor) annoying issues.
@JPZ13 do you want to create a separate issue for pod names? |
@alexec We're going to have @isubasinghe handle it holistically once he gets back from vacation |
Adding my comment here from the other issue:
On another note, I suggest using labels to store pod <-> workflow metadata instead of trying to generate a name that breaks so many scenarios in and out of Argo. Labels would be a cleaner approach and would still be searchable by anyone using Argo, both via the CLI and the UI.
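As an illustration of the label-based approach, a workflow's pods could be looked up with a label selector rather than by reconstructing a name. The label key below is assumed to be the one Argo applies to workflow pods, and the namespace and workflow name are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// In-cluster config; assumes this runs inside a pod with RBAC to list pods.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Find pods by workflow label instead of reconstructing the pod name.
	pods, err := clientset.CoreV1().Pods("argo").List(context.Background(), metav1.ListOptions{
		LabelSelector: "workflows.argoproj.io/workflow=my-wf",
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range pods.Items {
		fmt.Println(p.Name)
	}
}
```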
@aaron-arellano could you post the pod name to host name mappings that you're seeing? I've also sent you an email and copied @isubasinghe. If you would prefer to keep those values private, you can share them with us via email. This will help clue us in to concatenation issues.
So here is an example of the error I saw with the v2 pod name. You can see in the attached log file that we call the Kubernetes Python library function read_namespaced_pod. In the workflow itself we call the code below:
```python
# HOSTNAME is set by Kubernetes to the (possibly truncated) pod name;
# self.v1 is presumably a kubernetes.client.CoreV1Api instance
current_pod_name = os.environ.get("HOSTNAME", None)
current_pod_namespace = os.environ.get("POD_NAMESPACE", None)
if current_pod_name and current_pod_namespace:
    return self.v1.read_namespaced_pod(
        namespace=current_pod_namespace, name=current_pod_name
    )
```
HOSTNAME, as you know, comes from the default Kubernetes container environment, which sets a container's hostname to the name of the pod hosting it, ref <https://kubernetes.io/docs/concepts/containers/container-environment/#container-information>. With pod name v1 everything works fine, but with the v2 name this fails with the logs attached. The actual pod name was aip-toleration-integration-test-zlntf-nvidia-tesla-v100-4115184276, yet when getting HOSTNAME from within the v2-name pod we see that the last 3 chars of the real pod name are not there. This consistently happens when running read_namespaced_pod in a workflow pod generated under the v2 name feature. Again, when I revert POD_NAMES in the workflow-controller and argo-server to v1, this scenario works fine.
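As a possible mitigation while the naming itself is being fixed, here is a minimal Go sketch, from the controller's point of view, of exposing the exact pod name and namespace to the user container via the Kubernetes downward API instead of relying on HOSTNAME. The POD_NAME variable name is hypothetical, not something Argo necessarily sets; this illustrates the workaround, not Argo's actual behaviour:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// buildPodNameEnv returns env vars that expose the exact pod name and
// namespace via the downward API, so user code does not have to rely on
// HOSTNAME, which can be a truncated form of the pod name.
func buildPodNameEnv() []corev1.EnvVar {
	return []corev1.EnvVar{
		{
			Name: "POD_NAME", // hypothetical variable name
			ValueFrom: &corev1.EnvVarSource{
				FieldRef: &corev1.ObjectFieldSelector{FieldPath: "metadata.name"},
			},
		},
		{
			Name: "POD_NAMESPACE",
			ValueFrom: &corev1.EnvVarSource{
				FieldRef: &corev1.ObjectFieldSelector{FieldPath: "metadata.namespace"},
			},
		},
	}
}

func main() {
	for _, e := range buildPodNameEnv() {
		fmt.Println(e.Name, "<-", e.ValueFrom.FieldRef.FieldPath)
	}
}
```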
Summary
We need to ensure that when we generate facts, say for example the pod name, we only do so once.
Use Cases
This would significantly reduce the number of bugs we have.
For example, when investigating this issue: #10124
Something I noticed was that the output of
./argo get -json
and the UI were slightly different. I've had a similar problem with a mutex issue, linked here.
Regenerating facts every time we need them has been quite problematic in my opinion, and we should refactor the codebase to eliminate this behaviour. I believe doing so will eliminate an entire class of bugs from our codebase.