Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Project Health] Dependency upgrade process #4682

Closed
Bobgy opened this issue Oct 28, 2020 · 65 comments
Closed

[Project Health] Dependency upgrade process #4682

Bobgy opened this issue Oct 28, 2020 · 65 comments

Comments

@Bobgy
Copy link
Contributor

Bobgy commented Oct 28, 2020

KFP has many dependencies like

  • library versions (e.g. tfx, mlmd)
  • images (e.g. dockerhub/python, alpine, ...)

Benefits of updating these on a timely manner:

  • Upstream image updates likely remove many vulnerabilities that we will need to investigate manually each time later
  • We need to keep in sync with tfx/mlmd etc to know problems early.

We could stop pinning image versions to try getting above benefits, but this approach makes KFP repo non-deterministic. Tests may start to fail suddenly with no clue when some images released a new tag.

The best long term solution I can imagine is:

  1. build a script that programmatically update a dependency's version in KFP repo, we can use this to ease manual update efforts.
  2. build a bot that periodically run above script and send a PR to KFP repo, admins can approve these PRs if they pass presubmit tests.

We'll likely need to set this up for each type of dependency.
A script that programmatically update a dependency is of higher priority, because so far update efforts have been highly manual and time-costing.

In this approach,

  • there's minimal ongoing efforts needed to keep images up-to-date
  • tests/builds won't suddenly fail without any clue
  • if there are incompatibility issues with a dependency's new version, we can discover it early
@Bobgy Bobgy changed the title Upgrade google/cloud-sdk image to 235.0.0 Upgrade google/cloud-sdk image Oct 28, 2020
@Bobgy Bobgy changed the title Upgrade google/cloud-sdk image Upgrade google/cloud-sdk image process Oct 28, 2020
@Bobgy Bobgy changed the title Upgrade google/cloud-sdk image process [Project Health] Dependency upgrade process Oct 28, 2020
@Bobgy
Copy link
Contributor Author

Bobgy commented Oct 28, 2020

I think docker image update is a good first candidate, because it is of low possibility to break us on upgrade, also we need the timely vulnerability fixes coming from it.

@capri-xiyue
Copy link
Contributor

capri-xiyue commented Jan 19, 2021

There are several main areas that might need updating:

  • Python SDK deps
  • Backend Go deps
  • Kubernetes deps (e.g. Argo, Istio)
  • Deps of micro services (Metadata Writer, Visualization Server, etc.)
  • frontend deps
  • MLMD client/server version

For python (SDK and Visualizations) we already have update scripts.

@davidspek
Copy link
Contributor

davidspek commented Jan 20, 2021

@Bobgy Regarding the docker tags, it seems dependabot can already do this.
https://dependabot.com/blog/dependabot-now-supports-docker/

Edit:
It seems it is also already setup to do so, see #4962, but Maybe some settings can be optimized as dependabot limits the amount of PRs it creates which I think should be higher than the default (5 I believe).

I just saw that the dependency was in a *.py file. In that case it is a question of setting up dependabot to scan docker files as well.

@davidspek
Copy link
Contributor

davidspek commented Jan 20, 2021

@Bobgy I've created a script that will scan the repository for files named *ockerfile* but skipping /components/deprecated*, package*.json but skipping /*node_modules* and *requirements.txt. It then goes on to create the yaml file for dependabot so it scans the correct directories. It is setup for dockerfiles, npm, gomod and python at the moment, and I believe this should cover almost all the code in the repo. It is trivial to further customize what folders are selected if further customization is needed.

As it stands now, there are about 130 PRs that will be created with this configuration, so it might be advisable to have some form of plan to implement it in stages or be ready to quickly go through lots of the PRs. Another option is to create a target branch for all these PRs so they can be merged into that first rather than master.
#5015

@davidspek
Copy link
Contributor

For reference, the PRs that will be created can be seen here: https://github.com/DavidSpek/pipelines/pulls

@davidspek
Copy link
Contributor

To add to the above PRs, by default the dependabot security alerts can only be seen by repo admins. While I understand it might not be desirable to have this information be public (although all it takes is forming the repo to find out), I do think it is important that WG members are able to see the security alerts so they can quickly see what dependabot PRs have a high priority for being merged. I'm not sure how this would work wrt https://github.com/kubeflow/internal-acls, but here are the instructions to do it through the UI:

https://docs.github.com/en/github/administering-a-repository/managing-security-and-analysis-settings-for-your-repository#granting-access-to-security-alerts

@davidspek
Copy link
Contributor

@Bobgy, I could be wrong but I believe my PR covers the entire scope of what was described in the original post. If you agree I will change the PR so that it links to this issue and can close it when it gets merged.

@Bobgy Bobgy mentioned this issue Feb 2, 2021
67 tasks
@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 2, 2021

The dashboard is live!
#5068

@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 2, 2021

A remaining configuration thing to discuss is, do we want to pin our dependencies?
https://docs.renovatebot.com/dependency-pinning/

There are two approaches:

  1. package.json keeps using semver versions, renovate only updates package-lock.json
  2. We pin all direct dependencies in package.json to the exact versions like in chore(deps): pin dependencies - autoclosed #5069, renovate will update both package.json and package-lock.json.

EDIT: if we want to choose 1, we should enable :preserveSemverRanges config: https://docs.renovatebot.com/presets-default/#preservesemverranges.
If we want to choose 2, that's the default behavior.

I tend to stay with 1 first, because that's our existing approach.

@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 2, 2021

Hmm, something doesn't look right. After making this configuration only webdriverio/wdio-selenium-standalone-service, webpack/webpack were updated in npm patch PR.

Revising again based on https://docs.renovatebot.com/configuration-options/#rangestrategy

@rarkins
Copy link
Contributor

rarkins commented Feb 2, 2021

@Bodgy please see #5075

@rarkins
Copy link
Contributor

rarkins commented Feb 2, 2021

FWIW I recommend against retaining ranges in package.json unless you are publishing a library. By retaining ranges you don't save any noise to the repo (need just as many updates) but it's harder to tell what version you are currently on or what is being upgraded, because you have to dig through hundreds or thousands of lines of a lock file instead of dozens in a package.json.

@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 2, 2021

@rarkins I can see your point, for clarification, we will unnecessarily have specific reasons to stick to certain version of a library for some time.

I'd hope such a decision can be represented in a form not coupled to renovate, so that e.g.

  1. we can still upgrade dependencies in the usual way -- using npm update
  2. keep source of truth in package.json

If we pin all versions in pakcage.json and configure package specific no update rules in renovate.json, that sounds like too aggressive for adopting a new tool.

@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 2, 2021

In fact, even with rangeStrategy=update-lock, version update is still very visible thanks to Renovate: #5071.
image

@Bobgy
Copy link
Contributor Author

Bobgy commented Feb 19, 2021

TODO: set up a process so people periodically come and fix upgrade problems.
Right now, the npm patch update PR #5071 fails because of some typescript errors after upgrade. Without a process, upgrade PRs will never be prioritized and merged.

@davidspek
Copy link
Contributor

@Bobgy Might I suggest pinning the dependency dashboard issue so that it doesn't get lost between the countless new issues that are created? Related to that, there is another issue that is now pinned that has been closed so it can probably be unpinned. #5020

@davidspek
Copy link
Contributor

@Bobgy Any update how Renovate has been working out? Or has everybody been to busy with the upcoming release to go through the dependency updates?

@Bobgy
Copy link
Contributor Author

Bobgy commented Mar 18, 2021

Yes, everyone has been busy.
I tried the batch patch upgrade, but some typescript changes need manual fix. So I was stuck there.

Will try to resume after release.

Anyway, I feel that the dashboard is very helpful. It allowed us to integrate at our own pace.

@davidspek
Copy link
Contributor

Is there a way to allow repository owners (or rather people listed in the OWNERS file) to edit issues? For kubeflow/kubeflow the Dependency Dashboard would only serve as an overview, but the ability to need to approve a PR in the dashboard before it is created is lost otherwise.

@Bobgy
Copy link
Contributor Author

Bobgy commented Mar 18, 2021

I think the plan is to move notebooks code to kubeflow/notebooks repo, where they can get write permission

@davidspek
Copy link
Contributor

davidspek commented Mar 18, 2021

Great! The way I had envisioned the management of the notebook images relies heavily on Renovate and setting that up without the Dependency Dashboard approval feature complicates things a lot as you can image.

@stale
Copy link

stale bot commented Jun 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jun 18, 2021
@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 19, 2021

/lifecycle frozen

@google-oss-robot google-oss-robot added lifecycle/frozen and removed lifecycle/stale The issue / pull request is stale, any activities remove this label. labels Jun 19, 2021
@rimolive
Copy link
Member

Closing this issue. No activity for more than a year.

/close

Copy link

@rimolive: Closing this issue.

In response to this:

Closing this issue. No activity for more than a year.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Closed
Development

No branches or pull requests

7 participants