This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

SEP 14 - Changes to branching and release strategy #20

Merged
merged 3 commits into from
Nov 13, 2019
154 changes: 154 additions & 0 deletions dev-overhaul.md
@@ -0,0 +1,154 @@
- Feature Name: Improved Salt Testing and Release Strategy
- Start Date: 2019-10-01
- SEP Status: Draft
- SEP PR: (leave this empty)
- Salt Issue: (leave this empty)

# Summary
[summary]: #summary

The purpose of this SEP is to define an overhaul of the Salt test, development, and release process. Also, read [the FAQ](https://docs.google.com/document/d/1DfOGVmsQaqr3rZrFqo-gf9wm1mYM-nwa5a1zROatCXQ/edit#heading=h.8yucbt63lrhh) after reading this SEP.

# Motivation
[motivation]: #motivation

Over time, Salt releases have grown increasingly complex, and release timelines have become unpredictable and unstable. As the Salt core team spent a tremendous effort stabilizing the test suite for the 2019.2.1 release, a large number of problems became apparent and were identified as release issues:

- Missed releases.
- No clear indication to Salt users about which features/fixes are part of which releases.
- Overhead of maintaining 7 branches (untenable/not a good long-term strategy).
- Reduced release quality.

We have missed release deadlines due to the complexity of the Salt release process. Prior to the 2019.2.1 release we were maintaining **seven** branches:

- `develop`
- `neon`
- `2019.2.1`
- `2019.2`
- `2018.3`
- `2017.7`
- `2017.7.9`

![Existing strategy](diagrams/diagrams_old-branch-strategy.png)

The combination of bugfixes, merge forwards, and backports has been difficult to keep track of. The community has met a number of Salt releases with confusion - why is this bugfix missing? Why was that feature added in a point release?

It’s also been difficult for the community to contribute, raising questions such as:

- If I want my changes in the next Salt release, what branch should I make my PR against?
- How should I be testing?

In some cases contributors submitted their PRs only against the branch they cared about, and it was up to the Core team to ensure that these fixes or features made it to the other releases. This constantly shifting target produced a number of bugs. With a desire to merge PRs quickly to get fixes and features in, PRs were merged without tests, or with apparently unrelated test failures. When it came time for release, a tremendous amount of effort was required to fix failing tests and stabilize each branch. It was also difficult trying to ensure that all of the desired fixes, merge forwards, and backports were made before release.

During the 2019.2.1 release process it became apparent that our current approach is unsustainable and unacceptable. Our unique release process causes difficulty for contributors, is painful for users, and is impossible for maintainers.

**There must be a better way.**

The Salt core team has spent a tremendous amount of time, thought, and effort working to identify the issues that face our community and coming up with a plan for reducing the overhead and complexity to make it easier for Salt contributors to produce and ship high-quality software on time and to meet expectations. This proposal is the result of that process.

These changes are intended to:

- Improve core Salt stability.
- Improve testing quality.
- Reduce maintenance and testing overhead.
- Be able to release Salt on demand.
- Be able to announce Salt release dates with a high degree of certainty.
- Maintain a **high-frequency, reliable release cadence of stable software**.

# Design
[design]: #detailed-design

## Single Branch Release Strategy

To eliminate the biggest source of confusion and distraction, Salt will adopt the industry standard of a **single branch release** strategy. Salt will have a **master** branch that reflects production-ready code. This will avoid the confusion and overhead of which branch needs to be merged where. This will also allow Salt contributors to spend more time focusing on bug fixes and new features that are important to Salt users.

### New master branch

A new branch, called **master**, will be created from **2019.2.1**. The Salt core team has spent close to 5 months stabilizing the 2019.2.1 branch. Because 2019.2.1 is stable and its tests are already green, rather than repeating the entire process for **develop** or some other branch, PRs from the `develop` and `neon` branches will be merged in a controlled manner to ensure **master** (formerly 2019.2.1) continues to stay stable and green.

![A new, more stable approach](diagrams/diagrams_new-branch-strategy.png)

**Focusing on a single release branch will help us to release better software more often**. In order to maintain this focus, and to meet our desired release cadence, **there will be no more point releases on the 2017.7.x and 2018.3.x branches; 2017.7.8 and 2018.3.4 will be capstone releases**.

### Tests Must Be Green

The Salt core team has already been requesting regression tests for bug fixes, and appropriate quality test suites to accompany new features. The entire test suite runs for each PR and **MUST** pass before merge, as [outlined in SEP 10](https://github.com/saltstack/salt-enhancement-proposals/blob/master/accepted/0010-pr-merge-requirements.md). Not only must new tests be passing, but if a failure in existing tests is exposed, the Salt core team will fix the test suite before any new PRs are merged. This will produce a higher degree of certainty that the code that we release is stable, and that changes to Salt code do not introduce regressions.

### PR Migration plan

Any changes previously added to the 2017.7 and 2018.3 branches will be merged forward into the new master branch. Also there are currently about 1,000 PRs that have been merged into the 2019.2, neon, and develop branches which will get merged into **master**, but it’s going to take a lot of work. The Salt core team has prioritized these PRs in terms of criticality to the Salt community, and ease of migration. The Salt core team is carefully working to port these PRs into the new **master**, and has reached out to many of the top contributors with a large number of PRs to explain and review our plans.

While the Salt core team has committed to migrate the PRs, because of the huge size of the task before us, we would love your help! If you have a PR that you’d like to get merged more quickly, we’d love your help migrating PRs from other branches to **master**.

## Release Cadence

The Salt team has a goal to be able to release when necessary. Our plan is to have a typical release cadence of 3 months; in the interim, we expect a release cadence of 4 months. To be able to release on demand, the Salt open team needs to:

- Always maintain green tests for the release branch.
- Ensure the stability of the release branch at all times.
- Adopt a 100% automated CI/CD release model.

The Salt core team has a goal to be on a 4-month release cadence for the next few releases, with the ultimate goal of being able to release whenever needed. While we plan to stabilize on a 3-month typical release cadence, the fact of software development is that it’s done by people, and people are imperfect and will make mistakes. To minimize the impact of these mistakes, it’s important to be able to quickly (but carefully) test and release bugfix versions of Salt, likely resulting in more frequent, smaller releases.

The Salt core team will be opening and merging PRs from develop and neon to **master** (based off 2019.2.1) as soon as this SEP is published to demo how the core team will handle the PR migration process.

## Hotfix / Patch Release

Despite our best efforts, we will likely still encounter bugs after release during this transition. To reduce the impact of these bugs **master** will stay in a feature freeze for 2 weeks after release, in order to fix bugs that were not encountered until after release.

The Salt core team will focus on these bugs, but quality bug fixes submitted by the community that are properly tested and documented may also be accepted during this period.

After **master** is able to have new features merged, if new, severe bugs without a workaround are reported, a short-lived hotfix branch may be created. This would be an issue that:

- Affects core Salt functionality.
- Has no workaround (i.e. requires a code change).
- Has a reasonably quick turnaround (the fix doesn’t require a serious refactoring).

This branch will go through our release process - testing in the CI/CD pipeline, manual tests, RC, and release. This branch will then be merged into **master**.
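
The freeze window and the three hotfix criteria above can be read as a small decision procedure. The following is a minimal sketch of that reading, not part of the Salt tooling: `BugReport`, `in_feature_freeze`, `qualifies_for_hotfix`, `route_fix`, and the returned labels are all hypothetical names, and treating the 2-week freeze as a half-open interval is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Length of the post-release feature freeze described in this section.
FEATURE_FREEZE = timedelta(weeks=2)


@dataclass
class BugReport:
    """Hypothetical record of the three hotfix criteria."""
    affects_core: bool      # affects core Salt functionality
    has_workaround: bool    # a workaround exists without a code change
    quick_turnaround: bool  # the fix needs no serious refactoring


def in_feature_freeze(release_date: date, today: date) -> bool:
    """True while master accepts only bug fixes (assumed half-open window)."""
    return release_date <= today < release_date + FEATURE_FREEZE


def qualifies_for_hotfix(bug: BugReport) -> bool:
    """All three criteria must hold for a short-lived hotfix branch."""
    return bug.affects_core and not bug.has_workaround and bug.quick_turnaround


def route_fix(bug: BugReport, release_date: date, today: date) -> str:
    """Return an illustrative label for where a fix would land."""
    if in_feature_freeze(release_date, today):
        return "master (feature freeze)"
    if qualifies_for_hotfix(bug):
        return "hotfix branch"
    return "master (next scheduled release)"
```

Under these assumptions, a severe bug reported a month after release with no workaround and a quick fix would be routed to a hotfix branch, while the same report with a known workaround would wait for the next scheduled release.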


## Automation

To support the CI/CD approach, the Salt core team will be improving the automation of the deployment pipeline, with the goal of automatically producing nightly (or weekly) builds that flow through the entire build pipeline, including automated tests.

Other automation steps will be introduced to reduce friction for Salt community contributors. Documentation will be rewritten, published, and shared as necessary to remove as many blockers to contribution as possible.

## Documentation

One major challenge for the community is understanding changes coming in the next release. Focusing efforts on a single branch will help with this, but additionally Salt will create a changelog file to be updated by contributors, that will be used to document changes in a human-readable way, as specified in [SEP 1](https://github.com/saltstack/salt-enhancement-proposals/blob/master/accepted/0001-changelog-format.md). Updating this documentation will be required as part of the merge process - either from the contributor or by a Salt core team member.

Salt [documentation](https://docs.saltstack.com) will be updated to point to the most current release (2019.2.1 at the time of this writing). For users who have not yet upgraded, or those who are testing the unreleased version of Salt, Salt will take an approach similar to the official Python documentation, and have a drop-down or some other way to easily select documentation for other Salt versions.

Review comment (Contributor):

Salt will take an approach


We will also upgrade our communication processes to keep the Salt community more aware of the release timelines. We will share progress with the [#salt channel on IRC on Freenode](http://webchat.freenode.net/?channels=salt&uio=Mj10cnVlJjk9dHJ1ZSYxMD10cnVl83), the #release channel in the [SaltStack Community Slack](https://saltstackcommunity.herokuapp.com/), and the [Salt Users mailing list](https://groups.google.com/forum/#!forum/salt-users).

## Versioning (Naming)

With the `neon` release, to indicate the new change in release process, Salt will change to a new, non-date based version schema beginning at 3000. The version will be MAJOR.PATCH. For a planned release containing features and/or bug fixes the MAJOR version will be incremented.
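
To make the MAJOR.PATCH scheme concrete, here is a minimal Python sketch of how such versions could be parsed, ordered, and incremented. `SaltVersion` and its methods are hypothetical helpers, not Salt's actual version code, and rendering a zero patch level as the bare major number (`3000` rather than `3000.0`) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SaltVersion:
    """Hypothetical MAJOR.PATCH version, e.g. 3000 or 3000.1."""
    major: int
    patch: int = 0

    @classmethod
    def parse(cls, text: str) -> "SaltVersion":
        """Parse '3000' or '3000.1'; a missing patch part means patch 0."""
        major, _, patch = text.partition(".")
        return cls(int(major), int(patch) if patch else 0)

    def next_major(self) -> "SaltVersion":
        """Planned release with features and/or bug fixes: bump MAJOR."""
        return SaltVersion(self.major + 1, 0)

    def next_patch(self) -> "SaltVersion":
        """Patch release on the same MAJOR: bump PATCH."""
        return SaltVersion(self.major, self.patch + 1)

    def __str__(self) -> str:
        # Assumption: a zero patch level is written as the bare major number.
        return f"{self.major}.{self.patch}" if self.patch else str(self.major)
```

Because the dataclass is declared with `order=True`, versions compare as `(major, patch)` pairs, so `SaltVersion.parse("3000.2") < SaltVersion.parse("3001")` holds. A plain numeric comparison such as `(2019, 2, 1) < (3000, 0)` also shows that the new numbers sort after the older date-based ones.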

Review comment:

why 3000, and not just use an epoch to indicate that you are going back to semver?

https://www.python.org/dev/peps/pep-0440/#version-epochs

Review comment (Contributor):

We discussed using an epoch. However, we're not going 100% semantic versioning. Instead of the major number indicating breaking changes, it purely indicates features. Breaking changes will be handled by giving a 3-release (1 year) notice before they happen. Using the 3000, 3001, etc. version scheme allows us to have a predictable release number for breaking changes.

I.E. We note in the change log of 3001 there will be a breaking change in 3004.

Review comment (Contributor):

@gtmanfred I realized my last comment did not fully answer your question. I thought you were talking about using an epoch date for the version, not the version epoch described in the document you linked to. The primary reason for avoiding a version epoch and using 3000 instead is to avoid any complications when packaging across the many distros and OSes. While we could likely find ways to make sure versions with epochs work with OSes like Windows, avoiding that altogether and going with 3000 is just an easier and more direct path forward.

Review comment:

Why aren't you going for actual semantic versioning?

Review comment (@plinkable, Oct 8, 2019):

I'm wondering why the version number should change in this manner rather than simply sticking with date-based versions.

Right now the code names are meaningless to me and I have to trundle off to the website to look them up. Making the version number meaningless doesn't seem to add anything, just make it more complicated to understand which version is which. Perhaps for the first few releases we can keep track in our heads, but after 10 or 20 it's going to be a pain to work out just how old a particular release is.

I realise most packages don't use dates, and it's not a problem for them, but salt does, and it's been helpful to know - without looking anything up - just how old someone's version is when trying to help them with a problem. I guess this is mostly a community support request rather than anything, but I wanted to make sure it was at least noted and hopefully even explained before being set into stone.


With the increased release cadence, and in order to allow proper planning for feature deprecation, Salt will introduce a deprecation cycle of at least 1 year, expected to span 3 major releases. If `neon` ships in January 2020, anything newly deprecated in it will not be removed until January 2021 at the earliest.

Note that this only applies to deprecations within Salt’s control. If a 3rd party changes, Salt may release updates earlier.
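
As a worked illustration of the deprecation window, the sketch below computes the earliest point at which a deprecated feature could be removed. It is not Salt code: `earliest_removal` and `removal_allowed` are hypothetical names, and requiring both the 3-release count and the 1-year minimum to be satisfied is an assumption about how the two limits combine.

```python
from datetime import date
from typing import Tuple

NOTICE_RELEASES = 3  # expected number of major releases of notice
NOTICE_YEARS = 1     # minimum calendar notice, in years


def earliest_removal(announced_major: int, announced_on: date) -> Tuple[int, date]:
    """Return the earliest major version and date at which removal may happen."""
    try:
        removal_date = announced_on.replace(year=announced_on.year + NOTICE_YEARS)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall forward to Mar 1.
        removal_date = date(announced_on.year + NOTICE_YEARS, 3, 1)
    return announced_major + NOTICE_RELEASES, removal_date


def removal_allowed(current_major: int, today: date,
                    announced_major: int, announced_on: date) -> bool:
    """Assumption: both the release count and the 1-year minimum must be met."""
    removal_major, removal_date = earliest_removal(announced_major, announced_on)
    return current_major >= removal_major and today >= removal_date
```

For example, a deprecation announced in 3001 on 15 January 2020 yields `(3004, date(2021, 1, 15))`, which matches the "noted in 3001, removed in 3004" example given in the review discussion of the versioning scheme.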

## Support matrix

To be able to focus on the stability and innovation in the Salt platform, we will be adopting the industry-standard approach of no longer supporting older releases. Salt will provide select support for serious bugs and CVEs for the most recent release. Minor bug fixes will be targeted in the next scheduled Salt release.

Review comment on lines +132 to +134:

I think some clarification is needed here. On the Nov Office Hours I asked a question re. this and the answer was that the product support lifecycle wasn't changing.
So, will older releases be supported (however limited) or not?

Review comment (Contributor):

I believe this was discussed last Community Open Hour. We also discussed it today 😄


Practically, the level of support is what's changing. In the past, old releases saw bug fixes for minor issues, and even features may have been backported to a point release 👀

We will be providing (more limited) support - releases will not be completely abandoned on their release date. The first phase of support is that we will maintain a code freeze after a release.

During the freeze, any major issues that were not uncovered in internal or RC testing will get fixed and bundled up into a MAJOR.1 release. Features will not be added at this time.

This process should address any major issues - crashes, data loss, etc.

After any major issues are found, fixed, and released (MAJOR.1) and we code thaw, that's when "serious bugs and CVE" support kicks in.

Minor bugs with workarounds? Features? They won't go into the older release(s). CVEs absolutely will (as long as the release isn't EOL, of course). When/if major bugs miss the code freeze period, we'll have to make the determination at that point if they're important enough to cut a .2 release and/or backport.

The goal of this process is that Salt releases should be much more stable and reliable, as well as being able to deliver more fixes and features faster, because we can focus on the one major release.

Does that clarify things?



## Alternatives
[alternatives]: #alternatives

The primary alternative is to keep doing the things that Salt has been doing, and try to make incremental changes. The costs (both opportunity and real) are no longer acceptable.

Long and unpredictable release times, unstable branches, missing (or extra) fixes and features, and regressions are all problems with our existing workflow. Trying to fix these problems here and there will not make the huge positive impact for our community that we are trying to make.

## Unresolved questions
[unresolved]: #unresolved-questions

Hopefully all the questions have been answered in this SEP. Upgrading Salt should continue to work like it always has - our changes are focused on the external development process of Salt. If you feel like you have unanswered questions, please come ask them at the Salt Office Hours on October 1st, 2019, or find us in [#salt on IRC on Freenode](http://webchat.freenode.net/?channels=salt&uio=Mj10cnVlJjk9dHJ1ZSYxMD10cnVl83), the [SaltStack Community Slack](https://saltstackcommunity.herokuapp.com/), or the [Salt Users mailing list](https://groups.google.com/forum/#!forum/salt-users).

Review comment (@cachedout, Contributor, Oct 2, 2019):

I think the biggest unanswered question I have is around the topic of deprecations.

As I understand this proposal, majors are released on a 90 day schedule. If a need arises for a minor, say because of a critical bug fix, then one will be issued at some point during the 90 day cycle, though this isn't always the case, and ideally (?) is never the case.

So, assuming an ideal cycle where there are four releases a year, it seems like deprecations could happen at any of those major releases. (Please correct me if this is wrong.)

So, I have to wonder if this is setting ourselves up for a situation wherein most, or even all, upgrades are between majors, and so, even to take in minor bug fixes, users are exposed to deprecations and a resulting change in behavior. I think we all agree that avoiding such a bind is desirable.

Could there be room in a single-branch strategy for a major-minor model to still exist? For example, what if we took the four releases a year and structured them this way?

| Quarter | Type  | Policy                         |
|---------|-------|--------------------------------|
| Q1      | Minor | (breaking changes NOT allowed) |
| Q2      | Minor | (breaking changes NOT allowed) |
| Q3      | Minor | (breaking changes NOT allowed) |
| Q4      | Major | (breaking changes ARE allowed) |

IMHO, this keeps most of the benefits you've described in this proposal but it also allows users more predictability when it comes to upgrades. WDYT?

Review comment (Contributor):

> With the increased release cadence in order to allow proper planning for feature deprecation, Salt will introduce a minimum 1-year, expected 3 major release cycle. If neon ships in January 2020, any new deprecations will not be removed until January 2021 at the earliest.

I mean, I guess the answer is - yes there could be breaking changes in any release (which is not different from the current major release process). But they won't be "surprise here is something that you've never heard of!" kinds of changes, just ones that have seen a minimum of 1-year of warnings calling out the deprecation.

I expect we would require longer lead times for more invasive breaking changes.

Thinking about the alternative that you propose, where we have a once-per-year breaking change release, it feels icky to me.

Initially I thought that could make sense, but the problem with that is that if we have one breaking change release per year we literally have to break everything that we're going to break in that release.

Instead of making minor changes over time - many of which will likely affect a subset of our user base, we would be making different changes at one time that would likely affect our entire user base.

The other side effect there is that there would be some kind of campaigning closer to the deprecation release to do things that would break for people with less warning. If we simply state that every deprecation needs at least a year of warning, that puts everyone on a level ground, and there's no temptation to "just make an exception this one time".

That may be a small side effect, but it is still there.


Ultimately, I prefer the potential for breaking changes with each release - this spreads out the potential issues: if there's something that wasn't caught in testing or RC with a deprecation, the hope is that it will be a minor issue, and we'll be able to get it out in the patch release. Worst case it will be super gnarly to fix and we'll just have to roll back that change in the patch release. But either case should cause much less impact overall.

Compare that to the case where we may have several deprecations go wrong in a single release. All of a sudden the world is on fire for everyone, and the core team is running around trying to fix all of the things. Ick.

I prefer small changes over time, to sweeping changes all at once (but, maybe I'm in the minority 🤷‍♂).


# Drawbacks
[drawbacks]: #drawbacks

The Salt team recognizes that we have users and customers on older versions of Salt [outside their support windows](https://www.saltstack.com/product-support-lifecycle/). The Salt core team is working with support teams to carve out a plan to make the transition to supported Salt versions smoother.

There are also a *lot* of PRs that we need to migrate. However, this will probably not be much more difficult than the merge forward and backport process that we’ve been used to, though with the benefit that when we finish with this set of PRs we will never have to do it again. Fixes and features will be added to one single branch, and be available in the next scheduled release.
Binary file added diagrams/diagrams_new-branch-strategy.png
Binary file added diagrams/diagrams_old-branch-strategy.png