Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default initialised NodeStatus causes random false failures. #11102

Closed
3 tasks done
isubasinghe opened this issue May 19, 2023 · 4 comments · Fixed by #11451
Closed
3 tasks done

Default initialised NodeStatus causes random false failures. #11102

isubasinghe opened this issue May 19, 2023 · 4 comments · Fixed by #11451
Assignees
Labels
area/controller Controller issues, panics P3 Low priority type/bug

Comments

@isubasinghe
Copy link
Member

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issues exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

When a nodeID is missing from the NodeStatus hashmap, this should cause an error and that condition should be handled.
Unfortunately thanks to Go's design of maps, a default initialised value of a NodeStatus is returned instead of panicking (which is arguably the sane thing to do). Instead of assuming a value for the NodeStatus, either mark that node as an error or wait till the field has been populated.

Version

:latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflows that uses private images.

N/A

Logs from the workflow controller

N/A

Logs from in your workflow's wait container

N/A
@isubasinghe isubasinghe changed the title Default initialised NodeStatus causes spurious failures. Default initialised NodeStatus causes random false failures. May 19, 2023
@sarabala1979 sarabala1979 added the P3 Low priority label Jun 1, 2023
@stale

This comment was marked as resolved.

@stale stale bot added the problem/stale This has not had a response in some time label Jun 18, 2023
@isubasinghe

This comment was marked as resolved.

@stale stale bot removed the problem/stale This has not had a response in some time label Jun 20, 2023
@caelan-io
Copy link
Member

@isubasinghe - qq: does this fall under the larger effort in #10267 or is this separate?

@isubasinghe
Copy link
Member Author

This is a separate issue, one of the container set related issues.
I need to open more issues for the other container set PRs

terrytangyuan pushed a commit that referenced this issue Aug 1, 2023
Signed-off-by: Isitha Subasinghe <isitha@pipekit.io>
Signed-off-by: isubasinghe <isitha@pipekit.io>
dpadhiar pushed a commit to dpadhiar/argo-workflows that referenced this issue May 9, 2024
Signed-off-by: Dillen Padhiar <dillen_padhiar@intuit.com>
@agilgur5 agilgur5 added the area/controller Controller issues, panics label Jun 18, 2024
@agilgur5 agilgur5 added this to the v3.4.x patches milestone Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/controller Controller issues, panics P3 Low priority type/bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants