Proposal to make enterprise use the latest core code at all times #62
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,169 @@ | ||
# 12. Couple the Enterprise and OSS development process | ||
|
||
Date: 2022-08-12 | ||
|
||
## Status | ||
|
||
Proposed | ||
|
||
## Context | ||
|
||
Weave Gitops OSS and Weave Gitops Enterprise are in many ways set up | ||
as though they are two independent projects - Enterprise uses stable | ||
releases of OSS, much like it uses stable releases of Flux or Cobra. | ||
|
||
However, that doesn't reflect the reality. The reality is that they | ||
are two tightly coupled projects, in several ways. For one thing, | ||
there is a lot of library code in OSS that Enterprise uses - and maybe | ||
there's scope for this to grow. For another thing, there's an | ||
increasing commercial desire to be able to move capabilities between | ||
the two codebases cheaply. | ||
|
||
This disconnect between the architecture and engineering reality | ||
wastes time and makes both stakeholders and engineers frustrated. It's | ||
difficult to quantify the exact amount of time wasted, so this ADR instead | ||
describes two scenarios that are both common and healthy in a | ||
product before version 1.0. | ||
|
||
The first scenario is an Enterprise developer who spots a JS | ||
component in OSS that just needs a one-line fix to be perfect for the | ||
developer's current ticket. They are faced with two choices: | ||
|
||
* Either they have to make a pull request, then wait for the next time | ||
OSS makes a release, and then finish their ticket. | ||
* Or, they have to copy-paste the whole component into a "vendored" | ||
one. | ||
|
||
The other scenario is a developer working on an OSS feature that uses | ||
code shared with Enterprise and who wants to make an API change. They have | ||
three options: | ||
|
||
* They can spend time making it backwards compatible so Enterprise doesn't | ||
need to change, leaving two implementations of the feature inside OSS. | ||
* They can try to remember to submit a pull request to Enterprise | ||
the next time OSS makes a release. | ||
* They can just leave it for the Enterprise developers to figure out | ||
and fix. | ||
|
||
In both cases, the developer is faced with the choice to accrue tech | ||
debt at a frightening rate, or slow down the Enterprise process. | ||
|
||
## Decision | ||
|
||
We will accept the fact that the two code bases are so tightly | ||
coupled, and embrace it. | ||
|
||
The Enterprise `main` branch will start tracking the OSS `main` | ||
branch, in both JavaScript and Go. This will be accomplished with a | ||
GitHub Action that creates or updates a PR in Enterprise with the | ||
latest OSS `main`, which kicks off Enterprise's CI. If CI fails, the | ||
developer who changed OSS will be notified and can fix the | ||
breakage directly, while the context is still in their brain. | ||
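
The create-or-update behaviour described above could be sketched roughly as follows. This is an illustrative in-memory model only, not the actual GitHub Action: `BumpPR`, `EnterpriseRepo`, `sync_oss_main` and the `run_ci` callback are hypothetical names, and no real GitHub API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class BumpPR:
    """Hypothetical Enterprise pull request that tracks OSS main."""
    oss_commit: str
    author: str
    ci_passed: Optional[bool] = None  # None until CI has reported


@dataclass
class EnterpriseRepo:
    """In-memory stand-in for the Enterprise repository (assumption)."""
    open_bump_pr: Optional[BumpPR] = None
    notifications: List[Tuple[str, str]] = field(default_factory=list)


def sync_oss_main(
    repo: EnterpriseRepo,
    oss_commit: str,
    author: str,
    run_ci: Callable[[str], bool],
) -> str:
    """Create or update the single bump PR, run CI, notify the author on failure.

    The PR is never merged here: merging is left to human review.
    """
    if repo.open_bump_pr is None:
        repo.open_bump_pr = BumpPR(oss_commit, author)
        action = "created"
    else:
        # Reuse the open PR so Enterprise only ever has one bump PR to review.
        repo.open_bump_pr.oss_commit = oss_commit
        repo.open_bump_pr.author = author
        action = "updated"

    pr = repo.open_bump_pr
    pr.ci_passed = run_ci(oss_commit)
    if not pr.ci_passed:
        # Tell the developer who changed OSS while the context is fresh.
        repo.notifications.append((author, oss_commit))
    return action
```

Keeping a single open PR that is updated in place, rather than opening one PR per OSS commit, is a design assumption of this sketch; the text only says the action "creates or updates a PR".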
|
||
The following diagram illustrates the proposed flow when Oscar, an OSS developer, works on a | ||
new feature that changes APIs that Enterprise is using. By the time Enya, an Enterprise | ||
developer, next wants to upgrade OSS, Oscar has already pushed an API migration. | ||
```mermaid | ||
sequenceDiagram | ||
participant Oscar | ||
participant OSS as Gitops OSS | ||
participant Enterprise as Gitops Enterprise | ||
participant Enya | ||
Oscar->>OSS: Make PR with new feature | ||
OSS->>Oscar: Success ✅ | ||
Oscar->>OSS: Merge | ||
OSS->>Enterprise: Open PR to bump OSS | ||
Enterprise->>Oscar: Build failed ⛔ | ||
Oscar->>Enterprise: Use new feature | ||
Enterprise->>Oscar: Success ✅ | ||
Enya->>Enterprise: Approve ✅ | ||
Oscar->>Enterprise: Merge | ||
``` | ||
|
||
This means that OSS's release process needs to similarly kick off an | ||
Enterprise release process - when OSS forks a branch to release, that | ||
triggers another action that does the same for Enterprise. When OSS | ||
decides to approve the release, the Enterprise PR is updated with the | ||
stable tags for OSS. | ||
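
For the Go side, "updating the Enterprise PR with the stable tags" could amount to rewriting the OSS version in Enterprise's `go.mod`. The following is a minimal sketch under that assumption; `pin_module_version` is a hypothetical helper, and real automation might instead run `go get module@version` and would also have to update the JavaScript dependency.

```python
import re


def pin_module_version(go_mod: str, module: str, version: str) -> str:
    """Return go.mod text with `module` pinned to `version`.

    Only the version token is replaced, so trailing comments such as
    `// indirect` are preserved. Raises ValueError if the module is absent.
    """
    pattern = re.compile(
        r"^([ \t]*(?:require[ \t]+)?" + re.escape(module) + r")[ \t]+\S+",
        re.MULTILINE,
    )
    # A function replacement avoids any backslash handling in `version`.
    updated, count = pattern.subn(lambda m: f"{m.group(1)} {version}", go_mod)
    if count == 0:
        raise ValueError(f"{module} not found in go.mod")
    return updated
```

Calling it with the OSS module path and the newly published tag would turn a line that points at an unreleased commit into one that points at the stable release, leaving every other dependency untouched.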
|
||
The following diagram illustrates the proposed release process. Because | ||
Enterprise depends on OSS, Oscar must launch the process, but as | ||
soon as it has begun Enya is able to start testing the release in | ||
Enterprise. When OSS has finished publishing its release, Enya is able | ||
to just release the PR that OSS generated automatically. | ||
```mermaid | ||
sequenceDiagram | ||
participant Oscar | ||
participant OSS as Gitops OSS | ||
participant Enterprise as Gitops Enterprise | ||
participant Enya | ||
Oscar->>OSS: Kick off release process | ||
OSS->>OSS: Opens PR with updated versions | ||
OSS->>Enterprise: Make release test PR to update versions | ||
%% Review comment: Not sure if we'd need this one as we'd hopefully already be up to date?
%% Review comment: 🤔 Yes. Probably? My thinking was, OSS creates a branch But you seem to be thinking of it the other way round (release from main, stack changes in PRs), which I don't have a problem with either.
%% Review comment: Yeah, that is just habit / vibe. Feels like people might forget to merge to release branch and so we'll need to cherry pick / manage a bit more. vs. devs having to base off
||
Oscar->>OSS: Approve ✅ | ||
Enya->>OSS: Approve ✅ | ||
OSS->>OSS: Merge PR and publish release | ||
OSS->>Enterprise: Make a PR to update OSS, make release | ||
Enya->>Enterprise: Make further changes | ||
Enya->>Enterprise: Approve ✅ | ||
Enterprise->>Enterprise: Release & merge | ||
``` | ||
|
||
In both cases, the automation doesn't go as far as to change | ||
Enterprise - it will merely make pull requests for human review. | ||
|
||
## Consequences | ||
|
||
This is intended to be a pragmatic decision that can be implemented | ||
very quickly, that helps unblock our developers today. It recognizes | ||
that the boundary between OSS and Enterprise isn't technical but | ||
rather driven by commercial strategy, and as a result isn't sitting | ||
right for the engineers working in it every day. It's very easy to | ||
come up with multiple proposals that are better, but those | ||
proposals would take more work, time, and decisions. This proposal | ||
can be implemented now, and would let most developers continue working | ||
the way they do today, just faster. | ||
|
||
The main upside is that changes in OSS will be available to use in | ||
Enterprise in about 15 minutes, compared to weeks today. This is the | ||
kind of improvement that's so big that it's not just about "closing | ||
tickets faster" - it would get rid of entire classes of problems | ||
and sources of frustration to do with migrations and API stability. If | ||
a developer wants to change an API that's used in both, they can fix | ||
both and have them merged before lunch. | ||
|
||
As a result, this would bring some of the benefits we could get from | ||
setting up a monorepo for both OSS and Enterprise, but this could be | ||
done very quickly, with very few changes to existing workflows. | ||
|
||
One downside is that this holds the developer working on OSS | ||
responsible for making sure their changes work with Enterprise. This | ||
is a little bit of a roadblock for developers primarily working on | ||
OSS, but it's not unreasonable to ask developers to make sure the | ||
product still works with their changes. You wouldn't expect a | ||
developer primarily working on backend to merge API changes that break | ||
the frontend and move on without further conversations. | ||
|
||
The biggest challenge is that it would couple releases between OSS and | ||
Enterprise. Enterprise's main branch today can always be released | ||
using a stable OSS release - with this change, Enterprise would have | ||
to choose between releasing with an unstable OSS version, rolling back or | ||
backporting changes so they work with the last stable OSS release, or asking | ||
OSS to make a stable release. However, it's worth pointing out that | ||
this won't prevent Enterprise from releasing a customer-specific | ||
release - just that the cost of doing that would be slightly | ||
higher in that it needs to verify both OSS and Enterprise | ||
functionality. | ||
Review comment: Yes, a potential interim solution might be to update an existing PR in this case: Scenario:
The PR 1234 would basically become a

Review comment: Yes, if there's a "recent-ish" WG release, it's all good. I was thinking specifically of if WG's cadence is out of step with WGE, so if WG can't release fast enough for $quality_reasons and WGE can't wait for $customer_reasons, then WGE is a bit stuck. But perhaps that's mainly a problem if there's missing communication and inconsistent quality thresholds.

Review comment: Yes, there is a need for more co-ord in the WGE teams too, if we have a predictable 2 week cadence too each team can plan their manual release testing etc a bit better. So if we align w/ WG's cycle that would work well.
||
|
||
However, these release challenges are a new iteration of an old pain | ||
point - both Enterprise and OSS lack a process for longer-lived | ||
stable branches that are separate from the main feature development | ||
branches, so any release includes everything that has landed in main. | ||
It is assumed that both projects will need to solve that. Discussions | ||
about how to do that are already happening, but are out of scope for | ||
this ADR. Until then, both OSS and Enterprise should agree to try to | ||
do a stable release every 2 weeks, whether there are new features or | ||
not, as long as there are no blocking bugs. This synchronization only | ||
happens between the release management function in the respective | ||
project, instead of all developers being subject to this | ||
synchronization overhead. |
as suggested in slack https://weaveworks.slack.com/archives/C03QNK53W68/p1661252951461039?thread_ts=1660574525.864539&cid=C03QNK53W68
Why don't we give it a go using a branch that follows main (`main-oss`), using the same process but adding a step of `main` to `main-oss`?
Some metrics we could gather to determine whether this is a good move are