ci: Automated Y-Stream Releases #190

Open: wants to merge 1 commit into base: main
7 changes: 7 additions & 0 deletions .spellcheck-en-custom.txt
@@ -13,9 +13,14 @@ args
arXiv
backend
backends
backport
backports
backporting
benchmarking
Bhandwaldar
brainer
bugfix
bugfixes
Cappi
checkpointing
chunkers
@@ -201,6 +206,7 @@ Radeon
RDNA
README
rebase
rebasing
Ren
repo
repos
@@ -215,6 +221,7 @@ safetensor
safetensors
Salawu
scalable
Schedulable
SDG
sdg
SDK
134 changes: 134 additions & 0 deletions docs/ci/ci-automated-release-workflow.md
@@ -0,0 +1,134 @@
# Creating Automated Releases: Design Document

## Motivation & Overview

Presently, the release process for every library within the `instructlab` GitHub organization is entirely manual.

For example, for a typical y-stream release, a maintainer has to:

1. Manually create a new release branch -- e.g., `release-0.y.0`,
2. Manually create a pull request against `release-0.y.0` to cap the versions on some of the dependencies defined in their library's `requirements.txt` file,

Review comment:

Where is this requirement to cap coming from? Aren't we asked to not cap?

Only apply "caps" to dependencies (using <) when that dependency has established a pattern of producing new releases with breaking changes.

Without this, steps 3-5 are no longer necessary.

Review comment (Contributor Author, @courtneypacheco, Feb 27, 2025):

We have been capping certain dependencies in the core repo since October 2024: https://github.com/instructlab/instructlab/pulls?q=is%3Apr+%22deps%3A+cap%22+is%3Aclosed+

To elaborate, we have been capping our own InstructLab library dependencies to ensure that if we create a new release from the core repo, that new release won't automatically consume potential breaking changes from one of our own libraries.

If we don't cap certain dependencies, then we'll need to address those breaking changes in one or more pull requests and follow up by publishing a Z-stream release so that end users can consume the fixes. So I think for end users in particular, it can be extremely frustrating to pull the latest InstructLab release from 1-7 days ago, only to find out that the InstructLab release doesn't work because it's pulling an incompatible package.

Review comment (Contributor Author, @courtneypacheco, Feb 27, 2025):

Also, to be clear, the "automatic capping" feature is not required to be used by anybody. Maintainers can ignore the feature if desired. And since the automation described here will create a pull request that someone has to manually approve and review, the dependency capping changes will not be merged automatically. 😃

3. Optionally trigger an E2E test against that pull request,
4. Wait for all pull request CI checks to complete,
5. Manually request two maintainers to approve the pull request, and
6. Manually create a release from the GitHub UI using that new branch

Review comment (Member), on lines +11 to +14:

I feel the commas and "and" aren't really necessary in a numbered list


This entire process takes at least 10 minutes of manual work, plus however long it takes for the pull request's checks to complete. (In some repositories, like the core repo, this can take 2+ hours.)

Going forward, we should automate these release processes so that contributors and maintainers can focus more on development work and less on creating actual releases.

## Generic Automation Workflow: Major Releases, Minor Releases, and Z-Stream Releases

### Brief Overview of the Automation

The automation logic described in this dev-doc will be published as an in-house GitHub action called `create-automated-release`, so it will be callable from any workflow file. It is therefore strongly recommended that each repository maintainer create a `.github/workflows/automated-release.yml` workflow file that calls this `create-automated-release` action.

### Goals of the Automation

The automated process should be:

1. Configurable so that any library maintainer can configure the automation to meet their specific repository's needs
2. Schedulable so that releases are generated according to a specific cadence

Scheduled releases are generally important for setting expectations around release cadences, but they are by no means required for every library and may not even be applicable to some in this GitHub organization.

## Y-Stream Automation Workflow

### Overview of Y-Stream (Minor) Release Automation

Y-stream (minor) releases have historically been handled differently from Z-stream releases. Z-stream releases oftentimes involve backports for bugfixes and may require manual code rebasing to get those backports merged into the appropriate existing release branch. Therefore, we can think of Y-stream release logic as the "basis" for Z-stream release logic, which takes the Y-stream logic and builds upon it to account for backporting and other desirable actions.

Review comment (Member):

Suggested change
Y-stream (minor) releases have historically been handled differently from Z-stream releases. Z-stream releases oftentimes involve backports for bugfixes and may require manual code rebasing to get those backports merged into the appropriate existing release branch. Therefore, we can think of Y-stream release logic as the "basis" for Z-stream release logic, which takes the Y-stream logic and builds upon it to account for backporting and other desirable actions.
Y-stream (minor) releases have historically been handled differently from Z-stream releases. Z-stream releases often times involve backports for bugfixes and may require manual code rebasing to get those backports merged into the appropriate existing release branch. Therefore, we can think of Y-stream release logic as the "basis" for Z-stream release logic, which takes the Y-stream logic and builds upon it to account for backporting and other desirable actions.


### Configurable Components

As mentioned above, there are configurable components within this release process automation. The diagram in the next section references two configurable components:

Review comment (Member):

Suggested change
As mentioned above, there are configurable components within this release process automation. The diagram in the next section references two configurable components:
As mentioned above, there are configurable components within this release process automation. The diagram in the next section references two configurable components


#### Trigger Schedule

The trigger schedule defines the day and time (in UTC) when the release process will kick off. This schedule can be disabled if desired, and maintainers can trigger the release process manually when needed instead.

As mentioned in the brief overview of the automation, this in-house `create-automated-release` action will be callable from any workflow file. Thus, each library maintainer who wants to create a scheduled release should first create a `.github/workflows/automated-release.yml` workflow file in their repository and define a schedule for it. For example:

```yaml
name: Create Y-Stream Release

on:
  schedule:
    - cron: '30 1 1,15 * *'  # Triggers at 1:30 AM UTC on the 1st and 15th day of each month (roughly every two weeks)
```

Review comment (on the `on:` trigger block):

Would we also want this to be available to run outside of the cron schedule for any reason, by exposing the inputs for the workflow?

Review comment (Contributor Author):

Yes, as mentioned near the bottom of the document, there is an option to add your own trigger conditions, like `workflow_dispatch`. When using `workflow_dispatch`, users can manually kick off a release process.

The trigger conditions are defined at the GitHub workflow level, not at the `create-automated-release` level.

#### Custom List of Dependencies to Cap

Review comment:

See above. Capping is not something we should automate. It's an exceptional situation, not a matter of course.


With each release, some library maintainers may want to cap the version of certain dependencies within their `requirements.txt` file as well as specify the desired upper cap for each one.

The list of dependencies to cap should be provided in a file called `automated-release-config.yml`. This configuration file may be expanded in the future to accommodate more configurations as needed. Example:

Review comment:

@courtneypacheco Would this automated-release-config.yml file be per y stream release ? Is there a way we could add a checkpoint to verify in the beginning of the workflow to make sure this file exists and if it does, make sure it exists with the necessary details and format ? We could skip the secondary validation, if the file doesn't exist, which seems to be a possibility based on your workflow diagram.

Review comment:

Also would it be a good idea to version this file within the y stream release, if things change on us ? In other words, do we want this file to be immutable once the release process kicks off.

Review comment (Contributor Author):

For now, per y-stream release.

I definitely do plan on adding a check to see if the file exists. :) If the file doesn't exist, then the code will resort to built-in default values where applicable.

Since dependency capping is the only "configurable" component right now, the default value for the configurable list of components would be an empty list.

Review comment:

Thank you.


```yaml
y_stream_release:

  dependency_caps:
    enable: true  # optional parameter. If set to false, then none of the dependencies in `requirements.txt` will be capped.
    packages:
      instructlab-sdg: "+0.1.0"
      instructlab-eval: "+0.2.0"
      instructlab-training: "+1.0.0"
```

Review comment (Member), on lines +70 to +73:

Suggestion here, lmk what you think

Suggested change

```yaml
    packages:
      - name: instructlab-sdg
        greater_than: 0.1.0
      - name: instructlab-eval
        greater_than: 0.2.0
      - name: instructlab-training
        greater_than: 1.0.0
```

You could have similar fields `equals` or `less_than` when applicable

The keys specify the dependencies to cap. (In this case, we only have three: `instructlab-sdg`, `instructlab-eval`, and `instructlab-training`. The other dependencies in the `requirements.txt` file will be ignored and left untouched.)

The value for each key specifies the cap relative to the current lower bound. For example, let's say the current `requirements.txt` file looks like this:

```bash
instructlab-sdg>0.20.0
instructlab-eval>0.1.0
instructlab-training
```

In this case, the automated logic will create a pull request that updates the `requirements.txt` file like so (ignoring the inline explanations I provided as comments):

```bash
instructlab-sdg>0.20.0,<=0.21.0 # increment by 0.1.0 because of the '+0.1.0' in the `automated-release-config.yml` file
instructlab-eval>0.1.0,<=0.3.0 # increment by 0.2.0 because of the '+0.2.0' in the `automated-release-config.yml` file
instructlab-training # do nothing because there was no lower bound set.
```
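
The transformation above could be sketched roughly as follows. This is only an illustration, not the action's actual implementation: the function names (`add_versions`, `cap_requirement`, `cap_requirements`) are hypothetical, and it assumes that every lower bound is a plain `MAJOR.MINOR.PATCH` version and that the increment is added component by component.

```python
import re

# Matches lines such as "instructlab-sdg>0.20.0" or "instructlab-eval>=0.1.0".
# Assumption: lower bounds are simple MAJOR.MINOR.PATCH versions with no extras or markers.
_REQUIREMENT = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)\s*(?P<op>>=|>)\s*(?P<ver>\d+\.\d+\.\d+)\s*$"
)


def add_versions(base: str, increment: str) -> str:
    """Add an increment such as "+0.1.0" to a version, component by component."""
    base_parts = [int(part) for part in base.split(".")]
    inc_parts = [int(part) for part in increment.lstrip("+").split(".")]
    return ".".join(str(b + i) for b, i in zip(base_parts, inc_parts))


def cap_requirement(line: str, caps: dict[str, str]) -> str:
    """Append an upper cap to one requirements line, or return it unchanged."""
    match = _REQUIREMENT.match(line.strip())
    if match is None or match["name"] not in caps:
        # No lower bound to build on, or the package is not listed: leave as-is.
        return line
    upper = add_versions(match["ver"], caps[match["name"]])
    return f"{match['name']}{match['op']}{match['ver']},<={upper}"


def cap_requirements(lines: list[str], caps: dict[str, str]) -> list[str]:
    """Apply cap_requirement to every line of a requirements file."""
    return [cap_requirement(line, caps) for line in lines]
```

Calling `cap_requirements` with the three example lines and the three increments from the configuration reproduces the capped file shown above; `instructlab-training` is returned untouched because it has no lower bound.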

If `dependency_caps` is not defined, or `enable` is set to `false` under the `dependency_caps` key, then none of the dependencies defined in `requirements.txt` will be capped. If a dependency listed in `dependency_caps` doesn't exist in `requirements.txt`, it will be ignored.
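
One possible way to express these fallback rules in code is sketched below. It is a hypothetical helper, not part of the proposed action: `resolve_dependency_caps` is an invented name, the configuration is assumed to have already been parsed into a dictionary (with `None` standing in for a missing file), `enable` is assumed to default to true when omitted, and the format check on each increment is an extra assumption beyond what this document specifies.

```python
import re


def resolve_dependency_caps(config):
    """Return {package: increment} to apply; an empty dict means "cap nothing".

    `config` is the parsed release configuration, or None if the file is missing.
    """
    if not config:
        # Missing or empty configuration file: fall back to the empty default.
        return {}

    section = (config.get("y_stream_release") or {}).get("dependency_caps")
    if not isinstance(section, dict):
        # `dependency_caps` is not defined.
        return {}

    if section.get("enable", True) is False:
        # Capping was explicitly disabled.
        return {}

    packages = section.get("packages") or {}
    if not isinstance(packages, dict):
        raise ValueError("dependency_caps.packages must map package names to increments")

    for name, increment in packages.items():
        # Assumed format check: increments look like "+0.1.0".
        if not re.fullmatch(r"\+\d+\.\d+\.\d+", str(increment)):
            raise ValueError(f"invalid increment for {name}: {increment!r}")

    return dict(packages)
```

Returning an empty mapping in every "nothing to cap" case keeps the calling code simple, since it can apply the result unconditionally.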

### Example Workflow File

Below is an example workflow file used to call this in-house GitHub action:

```yaml
name: Create Y-Stream Release

on:
  # Run every Monday at 7:00 AM UTC
  schedule:
    - cron: '0 7 * * 1'

  # Allow manual dispatch, too
  workflow_dispatch:
    inputs:
      pr_or_branch:
        description: 'pull request number or branch name'
        required: true
        default: 'main'

jobs:
  create-release:
    runs-on: ubuntu-latest
    steps:
      - name: "create-automated-release"
        uses: instructlab/ci-actions/create-automated-release@v1
        with:
          release_config: ".github/automated-release-config.yml"  # points to where the library's release config is located in its repository
```

Review comment (Member):

Suggested change: `# Allow manual dispatch, too` becomes `# Allow manual dispatch too`

### Y-Stream Release Flow Diagram

![Automated workflow for creating new GitHub releases](../images/design-diagram-for-automated-releases.png)

## Z-Stream Release Automation Workflow

### Overview of Z-Stream Release Automation

To be added at a later date.