Skip to content

Conversation

@gilles-peskine-arm
Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm commented Oct 27, 2025

Create a release preparation script. Resolves #232.

This script performs the following steps:

  • Assemble the changelog.
  • Bump version. Currently only supports bumping the product version, not the SO version (ABI). Currently relies on bump_version.sh which doesn't exist in TF-PSA-Crypto, it could be adapted to be standalone at little effort.
  • Create a release archive that includes the generated files and that has GEN_FILES off. Uses GNU tar. Also create a shasum file.
  • Create some draft release notes. This doesn't do a complete job, I've just tried to automate the bits that are easy to automate.

Features:

  • Creates git commits for the parts that go into the release tag (version bump, changelog finalization).
  • The resulting archive should be reproducible (with the same tooling versions, and starting from the same commit), but I haven't tested this much.
  • Tries to be robust and just give up if something looks wrong.

Limitations:

  • Currently only supports releases as we've done TF-PSA-Crypto 1.0.0 and Mbed TLS 4.0.0, not what we do for 3.6.x where we commit generated files. I'm not sure if it's worth it to also support the 3.6.x structure. It depends partly on how we'll to 1.x/4.x LTS releases.
  • No support (yet) for bumping the ABI version.
  • Relies on GNU tar. We could use Python built-in archive support instead. I think it would actually make the code simpler, but it would increase our dependent on this script: with GNU tar, it's still doable to create a similar archive manually.
  • The Mbed TLS release does not check if TF-PSA-Crypto seems to be in a releasable state. You're supposed to run the release script in TF-PSA-Crypto first, then update the submodule in Mbed TLS, but this is not enforced.
  • Does not do anything about tags.
  • Enforcing any kind of consistency between releases is completely out of scope.
  • Output verbosity is not always optimal and not configurable. I think that's ok for MVP and can be improved later.
  • Running this script on the CI is out of scope (covered by Make a mock release in the release/nightly CI job mbedtls#3255).
  • I put the executable script directly in the framework, so it can't be customized per branch. I think that's ok, because changes to the release process tend to involve considerations of all the releasable branches. We can have branch-dependent behavior in the script.

Contributes to:

PR checklist

  • TF-PSA-Crypto PR provided # | not required because: standalone script
  • development PR provided # | not required because: standalone script
  • 3.6 PR provided # | not required because: standalone script

Set up the armature of a release script: command line interface, information
about the build tree, abstract class for release preparation steps.

Signed-off-by: Gilles Peskine <[email protected]>
That made the build non-reproducible.

Addresses Mbed-TLS/mbedtls#9521

Signed-off-by: Gilles Peskine <[email protected]>
Modern versions of Python emit a DeprecationWarning about
`datetime.datetime.utcfromtimestamp()`. It didn't really matter here, but do
stop using it.

Signed-off-by: Gilles Peskine <[email protected]>
Create a release archive containing the git content and the generated
files (including files from submodules), with `CMakeLists.txt` edited to
turn off `GEN_FILES` by default.

Arrange for the timestamps of generated files to be more recent than the
timestamps of files checked into git, in a reproducible way.

Uses GNU tar to assemble the pieces.

Signed-off-by: Gilles Peskine <[email protected]>
Add a step to the release preparation script to assemble the changelog and
commit that change into git.

Make sure to create a changelog section for the desired version number even
if there are no changes. Finalize that section by editing in a release date.

Do nothing if the changelog already contains a finalized section for the
desired version and release date.

Signed-off-by: Gilles Peskine <[email protected]>
Fix the case when there are no pending changes in the top version section.
This is not generally expected to happen, but it can happen when following
the release process and there have not yet been any changes since the last
release.

Signed-off-by: Gilles Peskine <[email protected]>
Run `bump_version.sh` and commit that change into git.

Do nothing if there is no version number changes.

This commit only supports bumping the product version. Any ABI
bump (SOVERSION changes) must have been done beforehand.

Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
@gilles-peskine-arm gilles-peskine-arm added the needs-review Every commit must be reviewed by at least two team members, label Oct 27, 2025
@gilles-peskine-arm gilles-peskine-arm added needs-reviewer This PR needs someone to pick it up for review size-s Estimated task size: small (~2d) priority-high High priority - will be reviewed soon labels Oct 27, 2025
Copy link
Contributor

@minosgalanakis minosgalanakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass review. I have confirmed that given the 4.0.0 input it generated a viable artifact.

Most of my connets are intended to be considered design review suggestions. Mostly I would polish the interface with the user. What I would propose that we need

  • Single step argument
  • Some way to output the commands used to generate the release. It is common to be commited in the body of the commits when we are invoking a script with parameters.
  • I am proposing to incorporate the bump_version into this script. We also need a way to bump the so versions.
  • More printed output so the user can see progress, can know what command is executed and which step is done when things suceed or fail. This is even more essential for CI deployment.
  • Clean up archives from files that do not need to be there, like .gitignore/DS files.
  • Import and call assemble changelog as a python module.
  • If we are modifying the --version parameter, the script could take --mbedtls-version and --tf-psa-version and package both releases in tf-psa-crypto and then mbedtls order.

Thank you for writting this script, it will greatly help in moving closer towards automating the release.

from mbedtls_framework.build_tree import guess_project_root

TESTS_DIR = os.path.join(guess_project_root(), 'tests')
FRAMEWORK_DIR = os.path.join(guess_project_root(), 'framework')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply strip it using relpath?

FRAMEWORK_DIR = os.path.join(guess_project_root(), 'framework')
FRAMEWORK_DIR = os.path.relpath(FRAMEWORK_DIR, start=os.getcwd())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do all the work to construct and deconstruct a path when we already know that the end result must be 'framework'? The path is where the files are located relative to the root of the project, not related to the current directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a compromise of 'os.path.join('./', 'framework')' ?

Copy link
Contributor Author

@gilles-peskine-arm gilles-peskine-arm Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's still wrong, isn't it? It would be .\framework on Windows, AFAIK. But we want a reproducible path. What's wrong with writing 'framework' when we mean 'framework'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Mind you, there may be other occurrences of os.path.join that need fixing if we want to have the same output on Windows. I haven't gone and fixed those and I don't want to extend the scope here to Windows support.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No please don't :) I doubt we will be spinning windows executors to build a release


DEFAULT_OPTIONS = Options(
artifact_directory=pathlib.Path(os.pardir),
release_date=datetime.date.today().isoformat(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tricky. This is a sensible default from a code perpective,but it will impossible to produce a release and test/sign it off on the same day.

I would either remove the default and make arg.parser explode, or try to pass it from the 'os.env' so the release maker can set it when agreed and not bother with long command lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that the defaults would be suitable for running the script on the CI to make a mock release, and that we'd be more explicit when we run the script manually to make an actual release.

Or maybe we should change how we handle ChangeLog version sections: currently assemble_changelog.py checks the release date of the first section, and it creates a new section if it's a proper date but adds to the existing section if the date contains xx. I didn't want to change the semantics of assemble_changelog.py at the same time as introducing the release script, but there are other sensible policy. For example, we could decide that assemble_changelog.py always adds to the top section, and we create a new section as soon as a release is done. Then setting the release date would be a manual edit of ChangeLog.

def find_gnu_tar() -> str:
"""Try to guess the command for GNU tar.
This function never errors out. It defaults to "tar".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? If we don't have to handle about exceptions, you could exclitely return None if not found. Then handle it as an optional intput in Options and gracefully handle it when calling tar with an error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to make the script more complex (having a nullable tar_command) just to handle the unlikely case where GNU tar can't be found.


def run(self) -> None:
"""Assemble the changelog if needed."""
subprocess.check_call(['framework/scripts/assemble_changelog.py'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels wrong to invoke another interpreter to use functionality from another python script. Our can be made to interact with other with minimal changes:

For example in assemble_changelog.py can be achieved by

  • Move the argparser outside of the main and into the module.
  • call 'assemble_changelog.merge_entries(options)'

Alternatively

  • Use a sane default namedtuple for each python script that uses argparser ( we arleady partially do that)
Options = namedtuple("Options", ["dir", "input", "keep_entries", "output"])
DEFAULTS = Options(
    dir="Changelog.d",
    input="Changelog",
    keep_entries=False,
    output="Changelog",
)
  • Then call 'assemble_changelog.merge_entries(DEFAULTS)'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? assemble_changelog has a job and it does it. The fact that it's written in Python is an implementation detail.


# Files and directories that may contain version information.
FILES_WITH_VERSION = [
'CMakeLists.txt',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to include build_info.h in this file list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's included via include.

mtime = self.commit_timestamp()
self.call_git(['archive', '--format=tar',
'--mtime', str(mtime),
'--prefix', prefix,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal. .git is already ingored by git archive. How about we also ignore files like .gitingore?

GNU tar has commands like --exclude-vcs-ignores and --exclude-vcs

https://www.gnu.org/software/tar/manual/html_section/exclude.html

We could also include other temporary files

'--exclude=.DS_Store',
'--exclude=**/*.tmp',
'--exclude=**/*.swp'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we not include .gitignore in the archive? And why would we look for temporary files at time of the release? If they're already in git, they're part of the release, for good or for worse.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats fair. If git-archive is ignoring those files then we can avoid that. And git modules is not a bit issue nowadays that we do not update the sumobules to the restricted ones.

calling `run()`.
"""

@classmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add an abstract method of done.

We need to discretely indicate when a step is over, which could toggle a internal boolean flag to indicate sucessfull completion. It can also include print statements, giving a report to the user of commands executed, files modified, and where the archives are created etc.

Also we could implement __enter__ and __exit__ so the Step classes are called with with(Step) to automatically call done()

e.g.

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        print("{} has completed".format(self.name))
        self.done()
        return self

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add an abstract method of done.

What would done do? I added assert_preconditions because I wanted each step to default to insisting on a clean git status, otherwise they could accidentally end up committing or archiving something unintended. But I don't see a job for done.

We need to discretely indicate when a step is over, which could toggle a internal boolean flag to indicate sucessfull completion.

Why? The steps are deliberately as independent as possible, so that you can take a release candidate and tweak it into the next release candidate.

Also we could implement enter and exit so the Step classes are called with with(Step) to automatically call done()

Why? Context managers make it easier to use objects of a class in a controlled way. But here there is a single place where steps are instantiated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When something currently fails. which it did, I had no idea what was done and what was left in an incomplete state.I had to manually look into the git history and staged files to figure out why. Each step needs to communicate to the caller what it does, and when its done.

So as a role for done we could it:

  • communicate to the user that the step is complete
  • give artefact locations on the command line
  • give warnings if something is not in the right state
  • perform cleanup if required by some steps.

Why? The steps are deliberately as independent as possible, so that you can take a release candidate and tweak it into the next release candidate.

That is no longer required, as discussed above. Release candidates are recreated and no longer tweaked. The way changelog is assembled, it is cleaner and safer and easier to reproduce, when the changes are brought in and then the release is assembled.

anyway this is not a strong ask, I just found that it logically made sense since we already have a assert_preconditions as a entry point, and it would be balanced to have a matched cleanup stage.

env=env)

def read_git(self, cmd: List[str],
where: Optional[PathOrString] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where can be redudant. Git supports -C to run it on a different directory

e.g
git -C mbedtls log -n 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. where is implemented as git -C with a bit of automation for relative paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have misunderstood the point of the where variable. If you want to querry a git folder that lies in a different directory than your current you can use the -C directive to point it there. So what I was saying is that the location could be part of the cmd argument.

version: Optional[str]

DEFAULT_OPTIONS = Options(
artifact_directory=pathlib.Path(os.pardir),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
artifact_directory=pathlib.Path(os.pardir),
artifact_directory=os.path.join(pathlib.Path(os.pardir), self.artifact_base_name() + "_artifacts"),

Parent director can contain a lot of files, that we may not want to override. We should create an explicit directory to store all the release related artifcats

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this script should create a new directory. Especially by default: users are likely to miss it, or it could clash with something unrelated. The parent directory seems like a sensible default to me: you'll create mbedtls-1.2.3.tar.bz2 next to your mbedtls directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are in the mbedtls directory, and you cannot invoke the script from the parent directory, the chances are most users to look for the files and not find them, not knowing its on the parent directory.

And given that mbedtls rarely sits on its own, it will be an already cluttered directory. And the script is generating more than archives, it generates several files, which terminal users have to clean up by hand. Putting it on a directory makes it easier to clean up and archive from a ci perspective.

how about we use /tmp to create the artefacts, and then:

  • Move them on the root of mbedtls/tf-psa-crypto where the script will be invoked. They will be visible because of git, and we can clean them up using git clean -dfx
  • Move them in a release_artifacts subdir inside mbedtls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken that different people have different habits.

Also .. is not a good default when there is an explicit --directory argument, it should be the parent of the source directory to be somewhat sensible.

At first I didn't want to touch any files inside the source tree, in case we wanted to do something that can't cope with uncommitted files. But I think we can reasonably assume that either the script will robustly ignore uncommitted file, or there will be a simple way to exclude files.

So I propose to change this to a directory in the source tree. Of course that's the default and the command line argument will stay.

# in markdown as line breaks, not as spaces, so an ordinary paragraph
# needs to be in a single Python logical line.
return f'''\
## Description
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put that on a string on the top of the file or a separate template file. We shouldn't be looking into the code to edit this and.

Also since we are generating this we should add the artifacts name, and mdsum using md5 hyperlinks, since the same script is creating and capturing that information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree about the separate file. It's easier to have a single file to edit. This is a string with a lot of substitutions, so if we store it externally, we'll have to manage some templating instead of just relying on the f-string syntax.

I don't understand about md5, did you mean sha256? The artifact names appear under “Checksum”. I mostly followed the existing structure of release notes.

That way, if the initialization of a step fails, it'll happen before any
external changes are done.

Signed-off-by: Gilles Peskine <[email protected]>
Don't change the disk file if the content hasn't changed.

Signed-off-by: Gilles Peskine <[email protected]>
Don't use a positional argument, for consistency with the release date.

Signed-off-by: Gilles Peskine <[email protected]>
Rename `--from` and `--to` to `--from-step` and `--to-step` for consistency.
They have short options so they don't need to be easy to type.

Signed-off-by: Gilles Peskine <[email protected]>
Using the parent of the source directory as an artifact directory makes
sense if the parent of the source directory contains multiple projects or
branches of Mbed TLS or TF-PSA-Crypto. It doesn't work for people who have
clutter there.

Also, the default artifact directory was the parent of the _current_
directory, whatever it is, and not the parent of the source directory.
Passing `--directory` without `--artifact-directory` resulted in a
nonsensical artifact directory location.

Default to placing artifacts in `release-artifacts` in the source tree. This
way, you can have multiple release candidates for the same branch in sibling
directories, and their artifacts won't overwrite each other.

Signed-off-by: Gilles Peskine <[email protected]>
The list of submodules won't change during the process. So get it once and
for all.

In support of this, move some basic git-related methods to the Info class.

Signed-off-by: Gilles Peskine <[email protected]>
index = self.git_index_as_tree_ish()
mtime = self.commit_timestamp()
self.call_git(['archive', '--format=tar',
'--mtime', str(mtime),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git archive --mtime requires a recent git. Ubuntu 22.04 isn't good enough. Ubuntu 24.04 is ok.

That's a reasonable assumption for a developer machine. Unfortunately, our CI doesn't have an Ubuntu 24.04 image yet, only 22.04.

We could avoid this by making a commit with the GEN_FILES change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of git archive --mtime=…, which can't be used in our CI yet, we could use faketime, which is already present in our docker images.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update the Dockerfile anyway, to set the user name and email so that Git can make a commit. And I have a working Ubuntu 24.04 Docker image with little fuss. So I think the way to go is to add the Ubuntu 24.04 image, which is long overdue anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-design-approval needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review priority-high High priority - will be reviewed soon size-s Estimated task size: small (~2d)

Projects

Development

Successfully merging this pull request may close these issues.

Automate release archive creation Improve file timestamps in release tarballs mbedtls 3.6.1 tarball missing submodules

2 participants