Skip to content

Conversation

@achilleas-k
Copy link
Member

@achilleas-k achilleas-k commented Oct 22, 2025

Add new job type, for generating manifests using image-builder-cli. Support enabling the functionality using the IMAGE_BUILDER_EXPERIMENTAL environment variable.

Currently this doesn't support everything we need. Resolver options that require credentials and certificates aren't supported because we need a way to pass them to image-builder-cli. Also, long term, this feature is meant to support building images from bootc containers, which will require pulling the container to inspect it. We'll have to find a way to share that container with the build job to avoid pulling twice.

DRAFT: I should also add an integration test to build an actual image with this option enabled.

@achilleas-k achilleas-k force-pushed the newjob/ib-cli-manifest branch from 8bc3efd to 30a026e Compare October 28, 2025 14:05
@achilleas-k achilleas-k marked this pull request as ready for review October 28, 2025 14:25
@achilleas-k achilleas-k requested review from a team and thozza as code owners October 28, 2025 14:25
@achilleas-k achilleas-k requested review from croissanne and mvo5 and removed request for a team October 28, 2025 14:25
@achilleas-k
Copy link
Member Author

achilleas-k commented Oct 28, 2025

Integration test added. CI runners are having issues though (see #4909).

Copy link
Contributor

@mvo5 mvo5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Some quick drive-by musing, very excited about this feature!

"github.com/stretchr/testify/assert"
)

func TestParseManifestPipelines(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

// read XDG_CACHE_HOME, which fails when running as _osbuild-composer
// TODO: make sure we use the same rpmmd cache that's used by the depsolve
// job for consistency
ExtraEnv: []string{"XDG_CACHE_HOME=/var/cache/osbuild-composer/rpmmd"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this comment, lets make sure that we have a --cachdir in ibcli for you :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this osbuild/image-builder-cli#358 help you here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth a small comment now that we have osbuild/image-builder-cli#358 open that this can be replaced with --rpmmd-cache ? Unless we want to use it this way of course :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No we should definitely use the option when we get it. I think we still haven't decided exactly how we're going to get ib-cli on the workers yet, so we'll need to see when that new option becomes available. But I'll definitely add the note.

if [[ "${IMAGE_BUILDER_EXPERIMENTAL}" != "" ]]; then
if echo "${IMAGE_BUILDER_EXPERIMENTAL}" | grep -q "image-builder-manifest-generation=1"; then
greenprint "Verifying usage of experimental job type"
# verify using log message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smart!

@achilleas-k achilleas-k force-pushed the newjob/ib-cli-manifest branch 2 times, most recently from 6cba2df to a067379 Compare October 29, 2025 11:06
mvo5 added a commit to mvo5/image-builder-cli that referenced this pull request Oct 29, 2025
This commit adds a new option `--rpmmd-cache` that is similar
to the option in bootc-image-builder. It is also useful for
osbuild/osbuild-composer#4904
mvo5 added a commit to mvo5/image-builder-cli that referenced this pull request Oct 29, 2025
This commit adds a new option `--rpmmd-cache` that is similar
to the option in bootc-image-builder. It is also useful for
osbuild/osbuild-composer#4904
mvo5 added a commit to mvo5/image-builder-cli that referenced this pull request Oct 29, 2025
This commit adds a new option `--rpmmd-cache` that is similar
to the option in bootc-image-builder. It is also useful for
osbuild/osbuild-composer#4904
@achilleas-k achilleas-k force-pushed the newjob/ib-cli-manifest branch from a067379 to bba18ed Compare October 30, 2025 18:27
@achilleas-k
Copy link
Member Author

Rebased on #4909 to get the new terraform configs and get this thing tested.

@achilleas-k
Copy link
Member Author

I just realised while waiting for the tests to run that I also need to actually install ib-cli on the test runner when using it 😆

mvo5
mvo5 previously approved these changes Oct 31, 2025
Copy link
Contributor

@mvo5 mvo5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Some small suggestions inline but all could be followups (or ignored, some are really just subjective) I think.

func (s *Server) enqueueComposeIBCLI(irs []imageRequest, channel string) (uuid.UUID, error) {
var osbuildJobID uuid.UUID
if len(irs) != 1 {
return osbuildJobID, HTTPError(ErrorInvalidNumberOfImageBuilds)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(super nitpick) might be nice to provide a bit more information here, something like:

diff --git a/internal/cloudapi/v2/server.go b/internal/cloudapi/v2/server.go
index 7fd361516..080bfdf48 100644
--- a/internal/cloudapi/v2/server.go
+++ b/internal/cloudapi/v2/server.go
@@ -325,7 +325,7 @@ func (s *Server) enqueueComposeIBCLI(irs []imageRequest, channel string) (uuid.U
        logrus.Warnf("using experimental job type: %s", worker.JobTypeImageBuilderManifest)
        var osbuildJobID uuid.UUID
        if len(irs) != 1 {
-               return osbuildJobID, HTTPError(ErrorInvalidNumberOfImageBuilds)
+               return osbuildJobID, HTTPErrorWithInternal(ErrorInvalidNumberOfImageBuilds, fmt.Errorf("expected 1 image request, got %d", len(irs)))
        }
        ir := irs[0]
 

but it seems enqueCompse() has the same pattern so feel free to ignore.

// read XDG_CACHE_HOME, which fails when running as _osbuild-composer
// TODO: make sure we use the same rpmmd cache that's used by the depsolve
// job for consistency
ExtraEnv: []string{"XDG_CACHE_HOME=/var/cache/osbuild-composer/rpmmd"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth a small comment now that we have osbuild/image-builder-cli#358 open that this can be replaced with --rpmmd-cache ? Unless we want to use it this way of course :)

// TODO: extend to include all options
// - ostree
// - registration
// - bootc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this landed (and we added a way to pass a bootc-ref) we would be able to build bootc images in the service, right? That is exciting :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's the plan. The thing I want to work on after this is bootc manifest generation and the "passing of the container" from the depsolve worker to the build worker. Which doesn't seem trivial.


// handle experimental image-builder manifest generation option using the
// experimentalflags pkg from osbuild/images.
if experimentalflags.Bool("image-builder-manifest-generation") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(super nitpick) I personally would have combined this with the previous commit that added ImageBuilderManifestGeneration to the config but up to you how granular you want this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- aws/rhel-10.0-ga-aarch64
INTERNAL_NETWORK: ["true"]

# single test config for experimental feature
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(super nitpick) I wonder if this and the previous one (or even also the previous-previous one) commits could be combined but very subjective and fine as is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm going to add repository handling here and I need to fix up the test to install ib-cli, so I'll do some history rewriting to reduce the commit count, sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@achilleas-k achilleas-k marked this pull request as draft October 31, 2025 11:59
@achilleas-k achilleas-k force-pushed the newjob/ib-cli-manifest branch 4 times, most recently from 9e97ce6 to 0639ba7 Compare October 31, 2025 17:19
@achilleas-k achilleas-k marked this pull request as ready for review October 31, 2025 17:19
mvo5 added a commit to mvo5/image-builder-cli that referenced this pull request Nov 3, 2025
This commit adds a new option `--rpmmd-cache` that is similar
to the option in bootc-image-builder. It is also useful for
osbuild/osbuild-composer#4904
@achilleas-k
Copy link
Member Author

For this PR, I'm going to change the expected value in the test temporarily.

@achilleas-k achilleas-k requested review from lzap, mvo5 and thozza December 2, 2025 13:55
achilleas-k and others added 25 commits December 2, 2025 15:41
Worker function that shells out to image-builder to generate a manifest.
Mock the call to exec.Command() to capture the command and arguments in
order to test the call.
Add new job type, associated enqueueing method, const string and result
type for generating manifests using image-builder-cli.

The result type is an alias to ManifestJobByIDRestult.  This makes it
simpler to use the result of the new job type as an argument for the
osbuild job.
Add the JobInfo handler for the new job type and call it from the finish
job and error functions.

Also, match the job info type when reading manifests using the
ManifestJobInfo() function.
When searching for manifest job results from a list of dependencies,
match the image-builder-manifest job type as well.
Add ImageBuilderManifestJobImpl and implement its Run() method.

The job implementation struct is currently empty, but it will require
access to the same secrets as all the content resolver jobs.
When using the images library directly to generate a manifest, we have
access to the manifest source (pre-serialized manifest).  This object
provides a list of pipeline names, separated into build pipelines
(manifestSource.BuildPipelines()) and payload pipelines
(manifestSource.PayloadPipelines()).  The pipeline names are used in two
jobs:
1. In the osbuild job, the log in the result is unordered because
   pipeline results are returned in a map.  The list of pipeline names
   is used in this case to sort the job log in the order each pipeline
   was executed.
2. The koji finalize job uses the two pipeline name lists to separate
   the metadata from org.osbuild.rpm stages between build and payload,
   so that it can store package lists alongside the compose result
   separated into build packages and payload packages.

When using image-builder-cli, we don't have access to these pipeline
name lists directly.  To get them, let's parse the manifest into a
partial structure, identify all the pipeline names, and infer the build
pipeline names from the build property of the rest of the pipelines.

This approach violates one of our (soft) rules about job arguments and
job results: That they should always be considered opaque json blobs.
However, in the case of the manifest, I think it's acceptable,
especially since it only relies on the top-level structure of the
pipelines property, which we guarantee to be stable in osbuild.
Add an option to the cloudapi server config that enables generating
manifests with the ImageBuilderManifest job type.  Also will make it
possible to enable this using an environment variable.

Use the experimentalflags package from osbuild/images to read the
IMAGE_BUILDER_EXPERIMENTAL environment variable and check that
'image-builder-manifest-generation' is defined.  If it is, enable the
ImageBuilderManifestGeneration option in the composer server config.
Support enabling the option in tests and check that the correct job
types appear in the queue.
To reduce the number of args and make the function calls a bit more
readable, bundle the boolean options in the newV2Server() function into
a v2ServerOpts struct.

Co-authored-by: Michael Vogt <[email protected]>
When calling image-builder manifest, we need to set repositories
configured in osbuild-composer for distros.  Without this, the
repository configs embedded in image-builder are used, but in the
service, we don't want to rely on those.  Instead, we need to use the
repositories configured in the service itself.

Write any configured repository configurations to a file in a temporary
directory under the datadir/repositories/ subdirectory.  This works with
image-builder's --data-dir option, which looks for a repository file
matching the target distribution name under the repositories/
subdirectory.

The on-disk repository format that rpmmd loads is slightly different
from rpmmd.RepoConfig.  The main difference is the single BaseURL in the
on-disk format vs an array of BaseURLs in rpmmd.RepoConfig.  Here, we
temporarily duplicate the on-disk format definition and convert to it
when saving the repository configuration file.  We'll have to
deduplicate this at a later stage.
Support setting repository MTLS configs in the new
image-builder-manifest job type.  This is equivalent to the same feature
in the traditional depsolve job.  MTLS configs for repositories are
defined in the worker by mapping a repository URL to the appropriate
cert values (CA, key, and cert).  The values are set on the repository
configurations matching the repo URL and passed to the depsolver.
In the deploy script, after installing all the test RPMs, add an
override to the unit file that sets the IMAGE_BUILDER_EXPERIMENTAL
environment variable for the osbuild-composer service with any values
from the global runner environment.

If image-builder-manifest-generation is enabled, install image-builder
from the repos.

Add a single API test job (api.sh) to the gitlab CI config that uses the
new experimental job type.

In the api.sh script, when image-builder-manifest-generation is enabled
in the test, verify it was actually used by searching for the warning
that's logged when the job is enqueued.
Expand the TestComposeManifests() test function to also test for
retrieving manifests when they are generated by image-builder-manifest
jobs.

A mock image-builder-manifest job implementation is added which returns
the manifest expected by the test.
Expand the TestComposeRequestMetadata() test function to also test for
retrieving metadata when the manifest is generated by
image-builder-manifest jobs.
The repositories are meant to be in a subdirectory under the data-dir,
but prior to v41 of image-builder [1], it would check the root of the
data-dir instead.  Link the repositories file from the new location
($datadir/repositories/$distro.json) to the old location
($datadir/$distro.json) and add a note to switch between the two paths
when we have a reliable and convenient way of getting the image-builder
version number.

[1] https://github.com/osbuild/image-builder-cli/releases/tag/v41
When the PipelineNames are needed to process the osbuild job metadata,
don't use the property on the osbuild job args.  Instead, read them from
the osbuild job result.

The osbuild job args don't have the PipelineNames set when we're using
image-builder to generate the manifest, because the information is only
available after the manifest has been generated, before the osbuild job
is queued.  However, the PipelineNames should always be set on the
osbuild job result, regardless of the way that the manifest was created.
When dumping the manifests (manifest job results) from the database,
select both job types (manifest-id-only and image-builder-manifest) so
we also get the manifests from the new job type.
A bug in our librepo source prevents downloading packages with ^ in the
name (or any character that requires URL-escaping).  Until we get an
osbuild version with the fixed source module, let's stick to curl
sources.  The curl source already escapes URLs.

See osbuild/osbuild#2253
Handle writing the Blueprint and Repositories in functions.  This cleans
up the core function and let's us cleanly defer closing the files at the
end of the function scope.
Add subscription image options to the image-builder manifest job
arguments.  When set, writes them to a file and adds the appropriate
--registrations command line argument to he image-builder manifest call.
When using image-builder to generate the manifest, the value of the RHSM
fact is "image-builder-cli".  We want to be able to keep the value
consistent for all images built in the service, but for now the value
isn't configurable, so let's modify the test until we can change it.

See https://issues.redhat.com/browse/HMS-9819
When using image-builder to generate the manifest, there's no mechanism
to set the OpenSCAP-related RHSM facts.  We should make this automatic
internally (in images) based on the blueprint options.  Until then,
let's skip the check in our tests.

See https://issues.redhat.com/browse/HMS-9822
@achilleas-k achilleas-k force-pushed the newjob/ib-cli-manifest branch from 5b76f1e to 4f7a911 Compare December 2, 2025 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants