Upgrade Argo to v2.11+ #4553
Comments
@Ark-kun This is currently blocking my work. Although I can work around it, I would like to get it addressed quickly. Do you have a timeline for the fix, or can I help with the upgrade? |
Does the upgrade require upgrading the Argo client as well? If not, I think you can try upgrading the Argo installation in your own cluster and see if it fixes the problem. |
Upgrading the client is a lot harder; there are some Go module dependency issues to fix. There's an ongoing PR working on this, and you can help there: #4498. |
It seems like the KFP images only include additional licenses on top of the original Argo images. If there are only additional licenses, I can switch to the official Argo images to see if it works. |
Yes, it's just additional licenses. You can just switch to official argo images. |
@Bobgy I tried to update the Argo image to v2.11.1, but now I get this error from the workflow-controller repeatedly, and new pipeline runs seem to get into an infinite loop with unknown status. Any ideas?
time="2020-10-02T16:27:22Z" level=info msg="config map" name=workflow-controller-configmap
time="2020-10-02T16:27:22Z" level=info msg="Configuration:\nartifactRepository:\n archiveLogs: true\n s3:\n accessKeySecret:\n key: accesskey\n name: mlpipeline-minio-artifact\n bucket: parala-kfp-artifacts\n endpoint: minio-service.kubeflow:9000\n insecure: true\n keyPrefix: artifacts\n secretKeySecret:\n key: secretkey\n name: mlpipeline-minio-artifact\nexecutorImage: gcr.io/ml-pipeline/argoexec:v2.7.5-license-compliance\nmetricsConfig: {}\nnamespace: kubeflow\nnodeEvents: {}\npodSpecLogStrategy: {}\nsso:\n clientId:\n key: \"\"\n clientSecret:\n key: \"\"\n issuer: \"\"\n redirectUrl: \"\"\ntelemetryConfig: {}\n"
time="2020-10-02T16:27:22Z" level=info msg="Persistence configuration disabled"
time="2020-10-02T16:27:22Z" level=info msg="Starting Workflow Controller" version=v2.11.1
time="2020-10-02T16:27:22Z" level=info msg="Workers: workflow: 32, pod: 32"
time="2020-10-02T16:27:22Z" level=info msg="Performing periodic GC every 5m0s"
time="2020-10-02T16:27:22Z" level=info msg="Persistence disabled - so archived workflow GC disabled - you must restart the controller if you enable this"
time="2020-10-02T16:27:22Z" level=info msg="Starting workflow TTL controller (resync 20m0s)"
time="2020-10-02T16:27:22Z" level=info msg="Starting prometheus metrics server at localhost:9090/metrics"
time="2020-10-02T16:27:22Z" level=info msg="Starting CronWorkflow controller"
E1002 16:27:22.063415 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.8/tools/cache/reflector.go:105: Failed to list *unstructured.Unstructured: workflowtemplates.argoproj.io is forbidden: User "system:serviceaccount:kubeflow:argo" cannot list resource "workflowtemplates" in API group "argoproj.io" at the cluster scope
time="2020-10-02T16:27:22Z" level=info msg="Started workflow TTL worker"
E1002 16:27:23.068520 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.8/tools/cache/reflector.go:105: Failed to list *unstructured.Unstructured: workflowtemplates.argoproj.io is forbidden: User "system:serviceaccount:kubeflow:argo" cannot list resource "workflowtemplates" in API group "argoproj.io" at the cluster scope
E1002 16:27:24.073525 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.8/tools/cache/reflector.go:105: Failed to list *unstructured.Unstructured: workflowtemplates.argoproj.io is forbidden: User "system:serviceaccount:kubeflow:argo" cannot list resource "workflowtemplates" in API group "argoproj.io" at the cluster scope
E1002 16:27:25.078845 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.8/tools/cache/reflector.go:105: Failed to list *unstructured.Unstructured: workflowtemplates.argoproj.io is forbidden: User "system:serviceaccount:kubeflow:argo" cannot list resource "workflowtemplates" in API group "argoproj.io" at the cluster scope |
Solved: adding |
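For readers hitting the same repeated reflector errors, a small script can summarize which permission the controller's service account is being denied. This is only a sketch: the function name `missing_permissions` and the regular expression are illustrative assumptions, written against the message format that appears in the log above, and they do not reconstruct the truncated fix in the preceding comment.

```python
import re

# Matches the Kubernetes "forbidden" message as it appears in the log above.
# The pattern is an assumption derived from that log, not an official format spec.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) '
    r'resource "(?P<resource>[^"]+)" in API group "(?P<group>[^"]*)"'
)


def missing_permissions(log_text):
    """Return distinct (user, verb, resource, group) tuples denied in log_text."""
    seen = []
    for match in FORBIDDEN.finditer(log_text):
        item = (match["user"], match["verb"], match["resource"], match["group"])
        if item not in seen:  # the controller retries, so the same error repeats
            seen.append(item)
    return seen


# One error line copied from the log above, repeated to mimic the retries.
SAMPLE = (
    'E1002 16:27:22.063415 1 reflector.go:153] '
    'pkg/mod/k8s.io/client-go@v0.17.8/tools/cache/reflector.go:105: '
    'Failed to list *unstructured.Unstructured: workflowtemplates.argoproj.io is forbidden: '
    'User "system:serviceaccount:kubeflow:argo" cannot list resource "workflowtemplates" '
    'in API group "argoproj.io" at the cluster scope\n'
) * 3

if __name__ == "__main__":
    for user, verb, resource, group in missing_permissions(SAMPLE):
        print(f"{user} lacks '{verb}' on {resource}.{group}")
```

Collapsing the retries into one entry per denied permission makes it easier to see which RBAC rule the service account is missing.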
@Bobgy Some changes need to be made in order to use the latest version of Argo. |
@xinbinhuang You can check whether upgrading Argo itself solves your problem. If it passes all of our e2e tests, we can get it merged before the CLI client. |
If we would like to update the Argo version to 2.11.x, I can look into it. I guess it will be similar to the update to 2.7, @Bobgy? |
/assign |
@NikeNano thank you for offering help! That'll be great! What's even better is using the chance to document how to upgrade argo, so others can learn from you next time. |
Sounds like a good idea, will include it! |
FYI, when upgrading to 2.11.6, you should be aware that Google requires all images to contain necessary license information in the docker image. I think we can split into two PRs, one upgrading the image and one upgrading the go package. EDIT: I built https://github.com/kubeflow/testing/tree/master/py/kubeflow/testing/go-license-tools to automatically collect go dependency licenses from GitHub. |
Cool, I will look into it once I manage to fix the dependencies correctly. |
Sorry, this issue drifted away from my focus previously. I managed to switch over to the official Argo version 2.11.6 on the server side for my deployment, and everything has been running smoothly. It seems that the server side is more straightforward. @NikeNano Have you started on this? If not, I can create a PR tonight to summarize what I did, and you can look into it and add the extra dependencies and licenses. |
I have done some initial work, but go ahead and make a PR with your solution @xinbinhuang, and I can help out :) |
FYI related work on argo to update the dependencies : argoproj/argo-workflows#4426 |
This was just merged into Argo: argoproj/argo-workflows#4810 (comment). I will start to look at it as well to see if we can push this forward. This will be part of Argo v3. |
@NikeNano Is there any ETA on this? What are the remaining items for upgrading Argo to v2.11+? Do we have a task list for the upgrade? |
If we want to go for v3, we have to wait for the release before we can update, as far as I can see. That should hopefully be at the end of January: argoproj/argo-workflows#4425 (comment). Waiting might not be necessary, but the last time I looked at it there were some dependency issues that I couldn't solve without upgrading Argo; maybe you could figure it out, @capri-xiyue? When I did the update to 2.7 there were a lot of collisions between dependencies. I suggest we wait for the release before we try to do the update. |
I think it will be fine to wait until the end of Jan for Argo v3 if it makes updating the dependencies easier. |
Thanks, waiting for Argo v3 makes sense to me if that's the end of Jan.
We do not update dependencies often, so each time we update, they are already pretty old and many things can break once some of them are updated. It's worth discussing the upgrade strategy in a project health issue. |
@capri-xiyue I am pretty new to Go in general and have always found Go dependency management a bit unclear, especially how it sometimes tries to resolve issues automatically; see https://github.com/golang/go/wiki/Modules#can-i-control-when-gomod-gets-updated-and-when-the-go-tools-use-the-network-to-satisfy-dependencies. I think it would be great if we documented this, which I remember you also asked for as part of the Argo upgrade, @Bobgy. But let's open a separate issue and continue the discussion there. |
Progress for argo v3: https://github.com/argoproj/argo/milestone/20 |
Asked about upstream updates in argoproj/argo-workflows#4953. EDIT: got a reply from an Argo maintainer. The suggestion is to upgrade to v2.12 now. v3 will be backward compatible, but it will still take a while; the first RC hasn't been released yet (but will be soon). |
I will make another attempt at updating to 2.12. |
Let us know if you want help! |
* tests
* added go mod file
* updated go.mod
* argo latest stable
* upgrade argo
* clean up
* go mod tidy to clean up
* fixed test after backend
* go mod tidy clean up
* more clean up
* added helper function and updated after feedback
* updated k8s.io/kubernetes to version 0.17.9
* updated go dependencies
Resolved by #5232. Thank you to everyone who helped with this issue! |
Issues fixed:
ParallelFor(results.ouput) when results is a list of objects #4551
Improvements: