feat(ci): automate release process #3148
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Pull Request Test Coverage Report for Build 21821785755
Warning: This coverage report may be inaccurate.
This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
636030e to
3291139
Compare
|
/retest |
Krishna-kg732
left a comment
Curious: the SDK release workflow uses OIDC trusted publishing for PyPI (no secrets needed), but this PR uses PYPI_API_TOKEN. Was there a specific reason for choosing the API token approach over trusted publishing? I just want to understand the tradeoff. Both work fine for our release cadence, but OIDC avoids managing secrets.

      exit 1
    fi

    BRANCH="release-${VERSION}"

Existing release branches use the release-X.Y format (release-1.9, release-2.0, release-2.1), per issue #2155. This will create release-2.2.0 instead. It should be:

    - BRANCH="release-${VERSION}"
    + MAJOR_MINOR=$(echo "$VERSION" | cut -d. -f1,2)
    + BRANCH="release-${MAJOR_MINOR}"

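The derivation in the suggestion can be sketched as a small standalone function. This is only an illustration, assuming `VERSION` always has the form `X.Y.Z` or `X.Y.Z-rc.N`; the function name `release_branch_for` is hypothetical and does not appear in the PR.

```shell
#!/bin/sh
# Hypothetical helper: derive the release branch (release-X.Y) from a full version.
release_branch_for() {
  version="$1"
  # Keep only the first two dot-separated fields (major and minor).
  major_minor=$(printf '%s\n' "$version" | cut -d. -f1,2)
  printf 'release-%s\n' "$major_minor"
}

release_branch_for "2.2.0"       # prints release-2.2
release_branch_for "2.2.0-rc.1"  # prints release-2.2
```

Because `cut` only looks at the first two fields, release candidates map to the same branch as the final release.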

    echo "Running make generate"
    make -C "$REPO_ROOT" generate
    echo "Completed make generate"

`$PYTHON_API_VERSION_FILE` is git-added but never modified by the script. The `__init__.py` version won't be updated, and check-release.yaml will fail on the mismatch. Add before `git add`:

    sed -i "s/__version__ = \".*\"/__version__ = \"$NEW_VERSION\"/" "$PYTHON_API_VERSION_FILE"
    echo "Updated Python API version to $NEW_VERSION"

|
@Krishna-kg732 It's a draft PR. If you'd like to work on it, please feel free to take it over, since I won't be able to work on it this month. |
|
@Krishna-kg732 Given that Trainer v2.2 release is coming, it would be great if you could finalize this work! |
|
Hi @Krishna-kg732, feel free to take this up. We do example changelog generation with You can try replicating this. Let me know if you need more help with this. |
|
@Krishna-kg732 If you haven’t started yet, please wait until next week. I will try to work on it over the weekend. For this PR, the only remaining task is testing a release on the forked repo. |
I’ve already implemented the release workflow and will be opening a separate PR shortly. |
|
Hi @Krishna-kg732, actually we need this feature. I have already raised a PR for that; could you please help review it? |
|
Thanks Akash, I’ll take a look at #3231 shortly and review it. |
793f2a7 to
133eff9
Compare
133eff9 to
49e0357
Compare
75b1b24 to
1d39734
Compare
|
I have tested this automation in my fork repo for release version
Steps:
make release VERSION=4.0.0 GITHUB_TOKEN=<token>
3. Once the release PR is merged to master, the [release](https://github.com/milinddethe15/kf-trainer/actions/runs/22284743190/job/64461002955) action will be triggered
where,
Chart:
Image:
I also added a release doc to help users understand the release flow: https://github.com/milinddethe15/kf-trainer/blob/feat/automate-release/docs/release/README.md |
…n checks Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…eration Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…kflow and upgrade git-cliff-action version Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…ption Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…tHub release Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…nding and simplify release name Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…eration and simplify workflow Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…ine tagging process Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
…ration script Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
c98aad3 to
5334b82
Compare
Pull request overview
This PR introduces an automated release flow for Kubeflow Trainer driven by VERSION updates: a local make release target prepares a release PR, CI validates the release PR, and a post-merge workflow performs tagging/branching, PyPI publishing, and GitHub release creation.
Changes:
- Add `hack/release.sh` + `make release` to generate a release commit (VERSION/manifests/chart/changelog) and run `make generate`.
- Add CI workflows to validate release PRs (`check-release.yaml`) and to automate releases after merge (`release.yaml`), plus supporting workflow_dispatch triggers.
- Replace the old changelog generation script with `git-cliff` configuration (`cliff.toml`) and update release documentation.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `hack/release.sh` | New release-prep script that bumps versions, updates manifests/chart, generates changelog, and commits. |
| `Makefile` | Adds release target to invoke hack/release.sh. |
| `docs/release/README.md` | Updates release documentation to the new PR-driven automated workflow. |
| `docs/release/changelog.py` | Removes legacy PyGithub-based changelog generator. |
| `cliff.toml` | Adds git-cliff config/template for changelog generation. |
| `.github/workflows/check-release.yaml` | New PR-time validation for release consistency (VERSION/tag/manifests/chart/python). |
| `.github/workflows/release.yaml` | New post-merge release automation (branch/tag, build+publish PyPI, GitHub release, dispatch image/chart publish). |
| `.github/workflows/template-publish-image/action.yaml` | Adds support for tagging images correctly when invoked via workflow_dispatch on tags. |
| `.github/workflows/build-and-push-images.yaml` | Allows manual dispatch publishing and updates publish gating logic. |
| `.github/workflows/publish-helm-charts.yaml` | Adds manual dispatch and concurrency settings for release-driven dispatch. |
| `.github/workflows/check-pr-title.yaml` | Adds area/release to ignored labels for PR title checks. |


    VERSION=${RAW_VERSION#v}
    if [[ ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
      echo "Version '${RAW_VERSION}' matches semver pattern."
    else
      echo "Version '${RAW_VERSION}' does not match semver pattern."
      exit 1
    fi

The semver validation strips a leading v and then matches a pattern that still allows an optional v, so an invalid VERSION like vv1.2.3 would incorrectly pass; validate RAW_VERSION against the pattern (as release.yaml does) or make the post-strip pattern disallow v.

    - VERSION=${RAW_VERSION#v}
    - if [[ ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
    -   echo "Version '${RAW_VERSION}' matches semver pattern."
    - else
    -   echo "Version '${RAW_VERSION}' does not match semver pattern."
    -   exit 1
    - fi
    + if [[ ${RAW_VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
    +   echo "Version '${RAW_VERSION}' matches semver pattern."
    + else
    +   echo "Version '${RAW_VERSION}' does not match semver pattern."
    +   exit 1
    + fi
    + VERSION=${RAW_VERSION#v}

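The validate-then-strip ordering can be sketched outside the workflow as follows. The value of `SEMVER_PATTERN` below is an assumption made for illustration (the workflow's real pattern is not shown in this thread), and `validate_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed pattern: one optional leading "v", then X.Y.Z with an optional -rc.N suffix.
# The real SEMVER_PATTERN in the workflow may differ.
SEMVER_PATTERN='^v?[0-9]+\.[0-9]+\.[0-9]+(-rc\.[0-9]+)?$'

# Hypothetical helper: print the version without its "v" prefix, or fail.
validate_version() {
  raw_version="$1"
  # Match the raw input first so a doubled prefix such as "vv1.2.3" is rejected.
  if printf '%s\n' "$raw_version" | grep -Eq "$SEMVER_PATTERN"; then
    # Strip the optional leading "v" only after validation succeeds.
    printf '%s\n' "${raw_version#v}"
  else
    echo "Version '${raw_version}' does not match semver pattern." >&2
    return 1
  fi
}

validate_version "v1.2.3"           # prints 1.2.3
validate_version "vv1.2.3" || true  # rejected; the message goes to stderr
```

Stripping first would turn `vv1.2.3` into `v1.2.3`, which the pattern accepts, so the order of the two steps is what closes the gap.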

    if [ -z "$1" ]; then
      echo "Usage: $0 <version>"
      echo "You must follow this format: X.Y.Z or X.Y.Z-rc.N"
      exit 1
    fi

With set -o nounset, referencing $1 when no args are passed will error before this usage check runs; use an argument count check (e.g., $# -lt 1) instead so the script prints the intended usage message.
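A minimal sketch of the argument-count check, assuming the script keeps `nounset` enabled. The check is wrapped in a hypothetical function `require_version_arg` only so it can be exercised here; in the script it would run at top level and use `exit 1`.

```shell
#!/bin/sh
set -u  # same effect as `set -o nounset`

# Hypothetical wrapper around the usage check.
require_version_arg() {
  # "$#" is always defined, so this test is safe under nounset.
  if [ "$#" -lt 1 ]; then
    echo "Usage: $0 <version>"
    echo "You must follow this format: X.Y.Z or X.Y.Z-rc.N"
    return 1
  fi
  printf '%s\n' "$1"
}

require_version_arg "2.2.0"  # prints 2.2.0
require_version_arg || true  # prints the usage message instead of aborting on an unset $1
```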

    # Generate and prepend new changelog section
    TEMP_FILE=$(mktemp)
    docker run --rm -u "$(id -u):$(id -g)" -v "$ABSOLUTE_REPO_ROOT:/app" \
      -e "GITHUB_TOKEN=$GITHUB_TOKEN" -w /app \

GITHUB_TOKEN is optional per the warning, but the docker command expands $GITHUB_TOKEN under set -o nounset, which will abort when the variable is unset; pass it as ${GITHUB_TOKEN:-} or only include the -e flag when the token is present.

    -   -e "GITHUB_TOKEN=$GITHUB_TOKEN" -w /app \
    +   -e "GITHUB_TOKEN=${GITHUB_TOKEN:-}" -w /app \

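The second option mentioned in the comment, adding the `-e` flag only when the token is present, might look like the sketch below. `build_docker_args` is a hypothetical helper that only prints the arguments it would pass, so it runs without Docker; it relies on `docker run -e NAME` (name only) forwarding the variable from the caller's environment, which requires `GITHUB_TOKEN` to be exported.

```shell
#!/bin/sh
set -u

# Hypothetical helper: print the docker arguments, one per line.
build_docker_args() {
  set -- run --rm -w /app
  # ${GITHUB_TOKEN:-} expands to an empty string when the variable is unset,
  # so this test does not abort under nounset.
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    # Name-only form: the value is taken from the environment, not the command line.
    set -- "$@" -e GITHUB_TOKEN
  fi
  printf '%s\n' "$@"
}

unset GITHUB_TOKEN
build_docker_args   # no -e flag in the output

GITHUB_TOKEN=dummy
build_docker_args   # output now ends with -e and GITHUB_TOKEN
```

Compared with `${GITHUB_TOKEN:-}` alone, this variant avoids passing an empty `GITHUB_TOKEN` into the container and keeps the token value off the command line.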

    # Update image tags in manifests
    find "$MANIFESTS_DIR" -type f -name '*.yaml' -exec sed -i "s/newTag: .*/newTag: $TAG/" {} +
    echo "Updated image tags in manifests to $TAG"

This script uses sed -i (also later for changelog insertion), which is GNU-sed-specific; other repo scripts (e.g., hack/python-api/gen-api.sh) branch on uname == Darwin to keep macOS support, so this should do the same or use a portable alternative.
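One way to branch on `uname`, shown as a sketch rather than as the approach used by the other repo scripts: `sed_in_place` is a hypothetical wrapper, and the manifest content is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: in-place sed for both GNU sed (Linux) and BSD sed (macOS).
sed_in_place() {
  if [ "$(uname)" = "Darwin" ]; then
    # BSD sed requires an explicit (here empty) backup suffix after -i.
    sed -i '' "$@"
  else
    sed -i "$@"
  fi
}

# Example: rewrite newTag in a throwaway manifest.
tmp_manifest=$(mktemp)
printf 'images:\n- name: trainer\n  newTag: v2.0.0\n' > "$tmp_manifest"
sed_in_place "s/newTag: .*/newTag: v2.1.0/" "$tmp_manifest"
cat "$tmp_manifest"
```

The same wrapper could be reused for the later changelog insertion, so the platform check lives in one place.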

    1. Re-validates version and manifest tags.
    2. Builds and validates Python package artifacts.
    3. Publishes the package to PyPI (`kubeflow-trainer-api`).
    4. Creates release branch `release-<version-without-v>` if it does not exist.

The workflow creates branches named release-<major>.<minor> (e.g., release-2.1), but this doc says release-<version-without-v> which reads like release-2.1.0; update the wording to match the actual branch naming logic.

    - 4. Creates release branch `release-<version-without-v>` if it does not exist.
    + 4. Creates release branch `release-<major>.<minor>` (for example, `release-2.1`) if it does not exist.


    # Only stable release tags
    tag_pattern = "^v?[0-9]+\\.[0-9]+\\.[0-9]+$"
    ignore_tags = ".*-(alpha|beta|rc).*"

tag_pattern + ignore_tags currently exclude -rc.* tags, so generating changelogs for successive RCs will likely diff against the last stable tag (and repeat entries) instead of the previous RC; include RC tags in tag discovery (or use a separate RC config) so RC-to-RC changelogs are incremental.

    - # Only stable release tags
    - tag_pattern = "^v?[0-9]+\\.[0-9]+\\.[0-9]+$"
    - ignore_tags = ".*-(alpha|beta|rc).*"
    + # Stable and RC release tags (ignore alpha/beta)
    + tag_pattern = "^v?[0-9]+\\.[0-9]+\\.[0-9]+(-[0-9A-Za-z.]+)?$"
    + ignore_tags = ".*-(alpha|beta).*"

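To see which tag names the suggested patterns would admit, they can be tried against a sample list with `grep -E`. This is only an approximation: git-cliff applies these settings with its own regex engine and matching rules, and `filter_release_tags` and the sample tags are hypothetical.

```shell
#!/bin/sh
# Patterns copied from the suggestion, with the TOML string escaping removed.
TAG_PATTERN='^v?[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.]+)?$'
IGNORE_TAGS='.*-(alpha|beta).*'

# Hypothetical helper: read tags on stdin, print those that would be considered.
filter_release_tags() {
  grep -E "$TAG_PATTERN" | grep -Ev "$IGNORE_TAGS"
}

printf '%s\n' v2.1.0 v2.2.0-alpha.1 v2.2.0-rc.1 v2.2.0-rc.2 nightly | filter_release_tags
# prints v2.1.0, v2.2.0-rc.1 and v2.2.0-rc.2, one per line
```

With the RC tags kept in the list, `v2.2.0-rc.2` has `v2.2.0-rc.1` as its predecessor rather than `v2.1.0`.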

    docker run --rm -u "$(id -u):$(id -g)" -v "$ABSOLUTE_REPO_ROOT:/app" \
      -e "GITHUB_TOKEN=$GITHUB_TOKEN" -w /app \
      "ghcr.io/orhun/git-cliff/git-cliff:latest" --unreleased --tag "$TAG" -o - > "$TEMP_FILE"

The docker run invocation uses the third-party image ghcr.io/orhun/git-cliff/git-cliff:latest in the release script with access to the repository workspace and GITHUB_TOKEN, but the image is only pinned to the mutable latest tag. If this external image is ever compromised or replaced, an attacker controlling it can exfiltrate GITHUB_TOKEN and tamper with release artifacts or tags when maintainers run the release tooling. Prefer pinning this dependency to an immutable reference (e.g., a specific version tag plus digest) or hosting a vetted image/binary under the Kubeflow project to reduce supply chain compromise risk.
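A release script could guard against mutable references with a check like the sketch below. `is_pinned_by_digest` is a hypothetical helper, and the digest used in the example is a placeholder made of zeros, not a real git-cliff image digest.

```shell
#!/bin/sh
# Hypothetical check: succeed only if an image reference ends in a sha256 digest.
is_pinned_by_digest() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Placeholder digest (64 zeros) used purely for illustration.
PLACEHOLDER_DIGEST=$(printf '%064d' 0)

is_pinned_by_digest "ghcr.io/orhun/git-cliff/git-cliff:latest" \
  || echo "not pinned"
is_pinned_by_digest "ghcr.io/orhun/git-cliff/git-cliff@sha256:${PLACEHOLDER_DIGEST}" \
  && echo "pinned"
```

A digest reference resolves to the same image content on every pull, whereas a tag such as `latest` can be moved to a different image at any time.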
|
@milinddethe15 @Krishna-kg732 Please can we finalize this PR to automate Trainer releases? |
Hey @andreyvelich, apologies for the late reply; I was busy with university tests over the previous weeks. I'll get these addressed and set up a test release to validate the full flow. If it's faster to pick up @milinddethe15's PR instead, I'm totally fine with that too. Just let me know how we'd like to proceed. |
If you can commit to the @milinddethe15 branch directly, that might be easier to move forward. |
|
As I remember, this PR was ready for review.
Krishna, let me know if you want to help; otherwise I am happy to continue on this. |
Yup, sounds good, let's continue with this PR. |
…ease # Conflicts: # docs/release/README.md
|
@milinddethe15 @Krishna-kg732 Is this PR ready? |
|
@andreyvelich I will test the entire release flow once. I will update you next week. |
|
Hey @milinddethe15, this looks great. Could you please link the test release here so we can move ahead with this PR? |
|
@Krishna-kg732 @milinddethe15 Did you test the release in your local branch? We would like to release 2.1.1 with a hot fix soon: #3489, and having automation would be nice to test. |
|
I need to test it against the latest master branch. I’ll try to do that next week. |
Sure, sounds good! That PR has been open for quite some time, so if @Krishna-kg732 could help you to test it, that would be great! |
|
Hey @andreyvelich, yup, I will help with the test release for this. |




What this PR does / why we need it:
To release a newer version of Trainer, a user runs `make release VERSION=1.0.0 GITHUB_TOKEN=<token>` and opens a PR with the generated commit. After the release PR is merged, the release workflow publishes the package to PyPI (using the `PYPI_API_TOKEN` secret in the repo) and creates a GitHub release using the git-cliff-generated changelog.
This method ensures that a release PR can be created by anyone and that multiple maintainers can approve a release by LGTM on the PR.
More detail in: #3148 (comment)
Which issue(s) this PR fixes (optional, in `Fixes #<issue number>, #<issue number>, ...` format, will close the issue(s) when PR gets merged):
Fixes #2155
Checklist: