OSAC-1657: fix e2e-vmaas-full-setup-helm values file path #80834

Open
amej wants to merge 1 commit into openshift:main from amej:bugfix/osac-vmaas-helm-values-path

amej commented Jun 22, 2026

OSAC-1657: fix e2e-vmaas-full-setup-helm values file path

Summary

Updates the VALUES_FILE default path in the osac-project-installer step registry to match the osac-installer repository's directory structure change from June 20, 2026.

The e2e-vmaas-full-setup-helm periodic job has been failing since June 21 with:

Error: open values/vmaas-ci.yaml: no such file or directory

Root Cause

The osac-installer repository restructured values files on June 20, 2026 (commit 5e1e886, PR #296):

  • Old structure: values/<env>.yaml
  • New structure: values/<env>/values.yaml

The CI step registry was not updated, causing the Helm deployment to fail when it couldn't find the values file at the old path.

Timeline:

  1. June 7: e2e-vmaas-full-setup-helm job introduced with VALUES_FILE=values/vmaas-ci.yaml (correct at the time)
  2. June 20: osac-installer migrated values files to per-environment directories
  3. June 21+: Job started failing because the path no longer exists

Changes

File: ci-operator/step-registry/osac-project/installer/osac-project-installer-ref.yaml:32

- default: "values/vmaas-ci.yaml"
+ default: "values/vmaas-ci/values.yaml"
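
A guard along these lines could make a future layout change fail with a clearer message. This is only a minimal sketch, not code from the step registry or from osac-installer: the function name `resolve_values_file` and the fallback to the old flat layout are assumptions made for illustration.

```shell
# Hypothetical helper (not part of osac-installer): print the values file
# for an environment, preferring the per-environment directory layout and
# falling back to the old flat layout.
resolve_values_file() {
  rvf_root="$1"
  rvf_env="$2"
  if [ -f "${rvf_root}/${rvf_env}/values.yaml" ]; then
    # New structure: values/<env>/values.yaml
    printf '%s\n' "${rvf_root}/${rvf_env}/values.yaml"
  elif [ -f "${rvf_root}/${rvf_env}.yaml" ]; then
    # Old structure: values/<env>.yaml
    printf '%s\n' "${rvf_root}/${rvf_env}.yaml"
  else
    echo "no values file for '${rvf_env}' under '${rvf_root}'" >&2
    return 1
  fi
}
```

A caller could then set `VALUES_FILE="$(resolve_values_file values vmaas-ci)"` and stop early if the function returns non-zero.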

Testing

  • ✅ YAML syntax validated
  • ✅ make registry-metadata passed
  • ✅ make jobs passed (sanitize-prow-jobs succeeded)
  • ✅ New path verified to exist in osac-installer repository (2731 bytes, valid YAML)
  • ✅ No other references to old path structure found
  • ✅ Only vmaas-ci Helm job affected (kustomize job unaffected)
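
The "no other references" check can be reproduced with a recursive search for flat-layout paths. The sketch below is one way to do it, under assumptions: the function name `find_flat_values_refs` is hypothetical, and the pattern only recognises paths of the form `values/<name>.yaml`.

```shell
# Hypothetical helper: list lines that still reference the old flat layout
# (values/<env>.yaml). Paths in the new layout (values/<env>/values.yaml)
# do not match because a "/" follows the environment name.
find_flat_values_refs() {
  grep -rnE 'values/[A-Za-z0-9_-]+\.yaml' "$1"
}
```

Running `find_flat_values_refs ci-operator/step-registry` from the repository root prints matching file names and line numbers, and returns a non-zero status when nothing matches.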

Post-merge verification:
The next nightly run (0:00 UTC) of periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm will validate the fix by successfully passing the osac-project-installer step.

Impact

Before: Nightly Helm-based VMaaS full-setup periodic job failing at installer step
After: Job will complete successfully through installation and testing

Scope: Single periodic job (e2e-vmaas-full-setup-helm)
Risk: Minimal - only updates a default value; jobs that override VALUES_FILE are unaffected

Checklist

  • Configuration changes validated via make targets
  • No unit tests needed (configuration-only change, documented in implementation notes)
  • Post-merge monitoring plan documented
  • Root cause analysis written
  • Implementation notes written
  • Verification report written
  • Review completed

Summary by CodeRabbit

This PR fixes a broken CI configuration in the OpenShift release repository that has been causing the e2e-vmaas-full-setup-helm periodic job to fail since June 21, 2026. The job failed with "Error: open values/vmaas-ci.yaml: no such file or directory" because a directory structure change in the osac-installer repository was not reflected in the release repository's CI configuration.

What changed:
The osac-project-installer step registry definition has been updated to reflect the new directory layout in the osac-installer repository. The VALUES_FILE environment variable's default value in ci-operator/step-registry/osac-project/installer/osac-project-installer-ref.yaml (line 32) is changed from values/vmaas-ci.yaml to values/vmaas-ci/values.yaml. This aligns the CI configuration with the osac-installer repository's updated file structure where values files are organized in per-environment directories rather than at the root values directory.

Impact:
The fix is minimal and focused—only the default value for the VALUES_FILE configuration variable is updated. The change directly affects periodic jobs that use this step, with the vmaas-ci Helm-based setup job being the primary beneficiary. Jobs that explicitly override the VALUES_FILE parameter are unaffected.

The e2e-vmaas-full-setup-helm periodic job has been failing since June 21
with "Error: open values/vmaas-ci.yaml: no such file or directory".

Root cause: The osac-installer repository restructured values files on
June 20, 2026 (PR #296) from a flat layout (values/<env>.yaml) to
per-environment directories (values/<env>/values.yaml). The CI step
registry was not updated.

Fix: Update VALUES_FILE default path in osac-project-installer-ref.yaml
from values/vmaas-ci.yaml to values/vmaas-ci/values.yaml to match the
osac-installer repository structure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

rh-pre-commit.version: 2.4.0
rh-pre-commit.check-secrets: ENABLED
openshift-ci-robot added the jira/valid-reference label Jun 22, 2026

openshift-ci-robot commented Jun 22, 2026

@amej: This pull request references OSAC-1657 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

coderabbitai Bot commented Jun 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b8f31ace-142f-4f85-bfb1-d44ae7548199

📥 Commits

Reviewing files that changed from the base of the PR and between b6ad29a and 8784096.

📒 Files selected for processing (1)
  • ci-operator/step-registry/osac-project/installer/osac-project-installer-ref.yaml

Walkthrough

The default value of the VALUES_FILE environment variable in the osac-project-installer step definition is updated from values/vmaas-ci.yaml to values/vmaas-ci/values.yaml, reflecting a restructured Helm values file path.

Changes

Helm values path update

| Layer / File(s) | Summary |
| --- | --- |
| VALUES_FILE default path (ci-operator/step-registry/osac-project/installer/osac-project-installer-ref.yaml) | Default value for VALUES_FILE env var changed from values/vmaas-ci.yaml to values/vmaas-ci/values.yaml. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: fixing a broken file path in the OSAC Helm values configuration, making it directly related to the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR is a configuration-only change to a CI step registry YAML file (not a test file). It contains no Ginkgo tests, so the custom check for stable test names is not applicable. |
| Test Structure And Quality | ✅ Passed | PR only modifies CI configuration YAML and shell scripts, not Ginkgo test code. Custom check for test quality is not applicable to this PR. |
| Microshift Test Compatibility | ✅ Passed | This PR only modifies CI configuration (Helm values file path), not test code. No Ginkgo e2e tests are added, so the MicroShift Test Compatibility check does not apply. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR modifies only CI configuration files (YAML), not Ginkgo e2e tests. The check for SNO compatibility is not applicable as no new tests are added. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR only modifies CI step-registry configuration (environment variable path); does not add/modify deployment manifests, operator code, or controllers subject to topology-aware scheduling checks. |
| Ote Binary Stdout Contract | ✅ Passed | This PR only modifies a YAML configuration file path, not test code or binaries. The OTE Binary Stdout Contract check is not applicable to configuration-only changes. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. The change is only a CI configuration file path update (+1/-1 line in a YAML file), not a test addition, so IPv6/disconnected network compatibility che... |
| No-Weak-Crypto | ✅ Passed | PR modifies only a CI configuration file path; contains no cryptographic code, weak algorithms, custom crypto, or secret comparisons. |
| Container-Privileges | ✅ Passed | File is a CI step configuration with environment variables only; no container privilege escalation, host access, or privilege-related security configurations present. |
| No-Sensitive-Data-In-Logs | ✅ Passed | PR only changes a file path default value; no new logging statements added and no sensitive data exposed in the change. |

@omer-vishlitzky

/lgtm
/approved

openshift-ci Bot added the lgtm label Jun 22, 2026

openshift-ci Bot commented Jun 22, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amej, omer-vishlitzky

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label Jun 22, 2026

@openshift-merge-bot

[REHEARSALNOTIFIER]
@amej: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm | N/A | periodic | Registry content changed |

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

amej commented Jun 22, 2026

/pj-rehearse

@openshift-merge-bot

@amej: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

amej commented Jun 22, 2026

/hold

openshift-ci Bot added the do-not-merge/hold label Jun 22, 2026

amej commented Jun 22, 2026

@amej: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm | 8784096 | link | unknown | /pj-rehearse periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm |

Full PR test history. Your PR dashboard.
Details

Summary: Rehearsal Test Results

Status: ❌ Failed (infrastructure issue, not our fix)

Failure cause: The job failed during the ofcir-acquire step (bare metal server provisioning) with SSH connection errors to IBM Cloud. This happens before the osac-project-installer step runs, so our VALUES_FILE fix was never tested.

What this means:

  • ✅ Our fix is still correct
  • ❌ The rehearsal didn't actually validate it due to infrastructure flake
  • ⏳ The real validation will happen when the PR merges and the next nightly periodic runs

This is expected for rehearsal jobs that require bare metal infrastructure - they can fail due to provider issues unrelated to code changes.

Recommendation: The PR is ready to merge. The fix will be validated by the actual periodic job after merge.

amej commented Jun 22, 2026

/pj-rehearse periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm

@openshift-merge-bot

@amej: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-ci Bot commented Jun 22, 2026

@amej: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm | 8784096 | link | unknown | /pj-rehearse periodic-ci-osac-project-osac-test-infra-main-e2e-vmaas-full-setup-helm |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

amej commented Jun 22, 2026

Ref: https://redhat-internal.slack.com/archives/CBN38N3MW/p1782136661162139?thread_ts=1782121123.166539&cid=CBN38N3MW
Good news and bad news ℹ️

The previous origin-cli container build error is resolved — assisted-common-setup-infra succeeded this time (18m21s). The job progressed much further but hit a new, different failure in the osac-project-installer step.

Step execution:

  1. ofcir-acquire ✅ (42s)
  2. assisted-ofcir-setup ✅ (31s)
  3. ipi-install-rbac ✅ (11s)
  4. assisted-common-setup-prepare ✅ (1m27s)
  5. assisted-common-setup-infra ✅ (18m21s) ← fixed from last run!
  6. assisted-common-test ✅ (33m7s)
  7. assisted-common-post-install ✅ (12s)
  8. osac-project-installer ❌ FAILED (8m28s) ← new failure

Root Cause: Helm ownership metadata conflict

The osac-project-installer step failed when Helm tried to install the osac release. A Kubernetes Secret config-as-code-ig already exists in the osac-e2e-ci namespace but is missing Helm ownership labels/annotations:

Error: Unable to continue with install: Secret "config-as-code-ig"
in namespace "osac-e2e-ci" exists and cannot be imported into the
current release: invalid ownership metadata;

This means the secret config-as-code-ig was created by a prior step (likely during assisted-common-setup-prepare or assisted-common-test) without Helm ownership metadata. When the Helm chart in osac-project-installer later tries to install, it finds this pre-existing secret and refuses to adopt it.

There was also a transient CertManager CRD issue earlier (the CertManager custom resource wasn't found in operator.openshift.io/v1alpha1), but the step retried and the cert-manager operator eventually came up.

Is this related to your PR?

This failure is quite possibly directly related to your PR's change. Your PR fixes the values file path for the Helm chart (OSAC-1657). If the previous (incorrect) values file path caused the Helm chart to skip creating the config-as-code-ig secret itself — while another step created it without Helm metadata — then fixing the path could surface this ownership conflict.

This looks like it may be a pre-existing issue in the test pipeline that your fix is now exposing by correctly configuring the Helm chart. This is something the OSAC team would need to address in their Helm chart or test setup — for example by adding --force or adopting the existing secret with proper labels before the Helm install.
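
The adoption route mentioned above could look roughly like this. Helm adopts an existing resource only when it carries the `app.kubernetes.io/managed-by=Helm` label and the `meta.helm.sh/release-name` and `meta.helm.sh/release-namespace` annotations. The helper name `adopt_for_helm` and the `OC` override are assumptions for illustration, and the sketch assumes the release is installed into the same namespace as the secret.

```shell
# Hypothetical helper: mark an existing resource as owned by a Helm release
# so that a later `helm install` can adopt it instead of rejecting it.
# Set OC=echo to preview the arguments without touching the cluster.
# Assumption: the release namespace equals the resource namespace.
adopt_for_helm() {
  afh_kind="$1"
  afh_name="$2"
  afh_release="$3"
  afh_namespace="$4"
  "${OC:-oc}" label "$afh_kind" "$afh_name" -n "$afh_namespace" --overwrite \
    app.kubernetes.io/managed-by=Helm &&
  "${OC:-oc}" annotate "$afh_kind" "$afh_name" -n "$afh_namespace" --overwrite \
    "meta.helm.sh/release-name=${afh_release}" \
    "meta.helm.sh/release-namespace=${afh_namespace}"
}
```

For the failure above, a dry preview would be `OC=echo adopt_for_helm secret config-as-code-ig osac osac-e2e-ci`, taking the release name and namespace from the error message. Whether adoption is appropriate depends on whether the chart is meant to own this secret.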

Next steps:

  • This is not a CI infrastructure issue — it's in the test's own Helm setup logic
  • I'd recommend raising this with the OSAC team or filing a follow-up against OSAC-1657
  • For your PR to merge, you could /pj-rehearse ack if you and the team agree this is a known/expected issue with the test setup that needs a separate fix

amej commented Jun 22, 2026

Investigation Complete: Helm Ownership Conflict Analysis

Thanks for the detailed feedback @amej! I've completed a multi-agent investigation of the rehearsal failure.

TL;DR

The values file path fix in this PR is correct and necessary.
The Helm ownership conflict is a separate, pre-existing bug in the osac-installer repository that was previously hidden because the job was failing earlier.


Root Cause

The config-as-code-ig secret ownership conflict occurs because:

  1. osac-installer's setup.sh script (lines 356-362) creates the secret manually:

    oc create secret generic config-as-code-ig \
      --from-literal=AAP_EE_IMAGE="${AAP_EE_IMAGE}" \
      --from-literal=AAP_PROJECT_GIT_URI="${AAP_PROJECT_GIT_URI}" \
      --from-literal=AAP_PROJECT_GIT_BRANCH="${AAP_PROJECT_GIT_BRANCH}" \
      --namespace="${INSTALLER_NAMESPACE}" \
      --dry-run=client -o yaml | oc apply -f -
  2. The secret is created WITHOUT Helm ownership labels (app.kubernetes.io/managed-by, meta.helm.sh/release-name, meta.helm.sh/release-namespace)

  3. When DEPLOY_MODE=helm, Helm encounters this pre-existing secret and fails validation

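  4. The Helm chart was found only to reference this secret in values, not to create it. Helm raises the ownership error only for resources that are part of the rendered release, though, so a template in the chart or one of its dependencies presumably defines a Secret with the same name; this still needs to be confirmed.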


Why This Wasn't Visible Before

Timeline:

  • Before this PR: Job failed early with Error: open values/vmaas-ci.yaml: no such file or directory
  • After this PR: The installer step gets past the values file lookup, so Helm now attempts the installation ✅
  • New failure: Helm discovers the ownership conflict ❌

The values path fix allowed the job to progress far enough to expose this downstream issue.


Fix Options

Immediate fix (recommended for CI):
Add to osac-installer's /scripts/setup.sh before line 357:

oc delete secret config-as-code-ig -n "${INSTALLER_NAMESPACE}" --ignore-not-found=true

Long-term fix (proper architecture):
Move the secret into a Helm template (charts/osac/templates/config-as-code-secret.yaml) so Helm owns the resource properly.


Next Steps

Option A (fastest):

  • File follow-up OSAC ticket for the setup.sh fix
  • /pj-rehearse ack this PR (the path fix shouldn't be blocked by a separate downstream issue)
  • Merge this PR
  • OSAC team fixes setup.sh in a follow-up

Option B (cleanest):

  • Fix osac-installer's setup.sh first
  • Retry rehearsal after that merges
  • Then merge this PR

Which approach would you recommend?


📋 Detailed investigation report: helm-ownership-investigation.md
