cannot retry large archived workflow that needs offloading #12740
Comments
This means your object is larger than 1MB. https://github.com/kubernetes/kubernetes/blob/db1990f48b92d603f469c1c89e2ad36da1b74846/test/integration/master/synthetic_master_test.go#L315 We have encountered similar problems before: "Request entity too large: limit is 3145728".
@shuangkun
Yes, we can make some related improvements to see if it can work in a large workflow.
@shuangkun thanks, I have opened a draft PR; it may be too hacky, but it works well for retrying a large archived workflow. #12741
OK, I will take a look. Thanks!
Signed-off-by: heidongxianhua <18207133434@163.com> Signed-off-by: xiaowu.zhu <xiaowu.zhu@daocloud.io>
Signed-off-by: heidongxianhua <18207133434@163.com>
Pre-requisites
Tested with `:latest`
What happened/what did you expect to happen?
When I retry a large archived workflow (which has failed), the retry does not work because of a server error.
[Screenshot: server error returned when retrying the workflow]
I have set `offloadNodeStatus=true` in the configmap, but according to the related code, `offloadNodeStatus` only applies to workflows that are not archived. For an archived workflow it has no effect: when archiving, the workflow's full node information is fetched from the node offload repo (if needed) and saved to the `argo_archived_workflows` table. When such a workflow is retried, a new workflow is created with the complete node information inlined rather than stored via the node offload repo, so the request exceeds the size limit and fails.
Version
v3.5.0
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Logs from the workflow controller
argo server:
Logs from your workflow's wait container