[release-4.20] OCPBUGS-92038, OCPBUGS-92039, OCPBUGS-92040, OCPBUGS-92041, OCPBUGS-92042, OCPBUGS-92043, OCPBUGS-92044, OCPBUGS-92045: Backport noOLM Gateway API test coverage and upgrade tests#31322
Conversation
Add upgrade test validating Gateway API migration from OLM-based Istio to CIO-managed Sail Library during 4.21 to 4.22 upgrades. Setup creates Gateway/HTTPRoute with OLM provisioning and tests connectivity. Test validates migration: Gateway remains programmed, Istiod running, Istio CRDs stay OLM-managed, GatewayClass has CIO finalizer, Istio CR deleted, subscription persists. Teardown cleans up all resources. Cherry-picked from: cf1f826 openshift#30897
…ip logic

The Gateway API upgrade test was calling g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.

Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup, avoiding the goroutine panic. Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).

https://redhat.atlassian.net/browse/OCPBUGS-83267
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-picked from: 8ef51c3 openshift#31000
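The skip-before-Setup pattern described in this commit can be sketched roughly as follows. This is a simplified, self-contained illustration and not the code from the PR: the `Test` and `Skippable` interfaces, the `gatewayUpgradeTest` struct, and the `runSetup` driver are hypothetical local stand-ins, and the real `upgrades.Skippable` interface may have a different method signature.

```go
package main

import "fmt"

// Test is a hypothetical, reduced view of an upgrade test.
type Test interface {
	Name() string
	Setup() error
}

// Skippable is a local stand-in for the idea behind upgrades.Skippable:
// the framework asks the test whether to skip before it starts Setup.
type Skippable interface {
	Skip() bool
}

// gatewayUpgradeTest is an illustrative test; ipv6 models an
// unsupported cluster configuration.
type gatewayUpgradeTest struct {
	ipv6 bool
}

func (t *gatewayUpgradeTest) Name() string { return "gateway-api-upgrade" }

// Skip is a plain boolean check, so it is safe to call outside a Ginkgo node.
func (t *gatewayUpgradeTest) Skip() bool { return t.ipv6 }

// Setup never needs to panic to signal a skip, because Skip ran first.
func (t *gatewayUpgradeTest) Setup() error { return nil }

// runSetup is a hypothetical framework driver. It checks Skip on the
// calling goroutine and only then starts Setup in a separate goroutine.
func runSetup(t Test) (ran bool, err error) {
	if s, ok := t.(Skippable); ok && s.Skip() {
		return false, nil
	}
	done := make(chan error, 1)
	go func() { done <- t.Setup() }()
	return true, <-done
}

func main() {
	tests := []Test{
		&gatewayUpgradeTest{ipv6: true},
		&gatewayUpgradeTest{ipv6: false},
	}
	for _, t := range tests {
		ran, err := runSetup(t)
		fmt.Printf("%s: ran=%v err=%v\n", t.Name(), ran, err)
	}
}
```

The point of the design is that the skip decision is an ordinary return value evaluated before any goroutine is started, so no panic has to cross a goroutine boundary.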
The Gateway API controller tests tracked Gateways in a shared in-memory gateways slice, deleting them during AfterEach cleanup. However, openshift-tests distributes tests across separate parallel worker processes. The annotation-based checkAllTestsDone coordination works correctly because annotations are stored on the cluster-scoped GatewayClass, but the gateways slice is not shared across processes. The process that runs the final AfterEach cleanup has an empty gateways slice, so it deletes the GatewayClass and istiod but never deletes the Gateways created by other processes. This leaves gateway deployments orphaned on the cluster.

As a secondary issue, even when gateways were deleted, the GatewayClass and istiod were removed without waiting for the gateway proxy deployments to be fully cleaned up by GC. Since the deployments have an owner reference to the Gateway (not a finalizer), the cascade deletion is asynchronous, creating a race where gateway pods lose their control plane and crash-loop.

Fix both issues by cleaning up gateways at the individual test level using defer deleteGateway, which deletes the Gateway and waits for its proxy deployment to be removed by GC. Add deleteGateway and waitForGatewayDeploymentDeletion helpers shared by both the controller tests and the upgrade test Teardown. Cleanup errors now hard fail to surface leftover resources immediately rather than causing confusing downstream test failures.

https://redhat.atlassian.net/browse/OCPBUGS-83281
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Grant Spence <gspence@redhat.com>
Co-Authored-By: Ishmam Amin <iamin@redhat.com>
Cherry-picked from: 3f8a12d openshift#31023
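To make the per-test cleanup concrete, here is a minimal sketch of the "delete, then wait for garbage collection" idea. It assumes a hypothetical in-memory `fakeCluster` in place of a real API server; the helper names `deleteGateway` and `waitForGatewayDeploymentDeletion` come from the commit message, but their signatures, intervals, and timeouts here are invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeCluster is a hypothetical in-memory stand-in for the API server.
type fakeCluster struct {
	mu          sync.Mutex
	gateways    map[string]bool
	deployments map[string]bool // keyed by the owning Gateway's name
}

func newFakeCluster() *fakeCluster {
	return &fakeCluster{gateways: map[string]bool{}, deployments: map[string]bool{}}
}

// createGateway creates a Gateway and the proxy deployment it owns.
func (c *fakeCluster) createGateway(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.gateways[name] = true
	c.deployments[name] = true
}

// deleteGatewayObject removes the Gateway at once; the owned deployment
// disappears only after gcDelay, mimicking asynchronous cascade deletion.
func (c *fakeCluster) deleteGatewayObject(name string, gcDelay time.Duration) {
	c.mu.Lock()
	delete(c.gateways, name)
	c.mu.Unlock()
	go func() {
		time.Sleep(gcDelay)
		c.mu.Lock()
		delete(c.deployments, name)
		c.mu.Unlock()
	}()
}

func (c *fakeCluster) deploymentExists(name string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.deployments[name]
}

// waitForGatewayDeploymentDeletion polls until the deployment is gone
// or the timeout expires.
func waitForGatewayDeploymentDeletion(c *fakeCluster, name string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if !c.deploymentExists(name) {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("deployment for gateway %q still present after %s", name, timeout)
		}
		time.Sleep(interval)
	}
}

// deleteGateway deletes the Gateway and blocks until GC has removed its deployment.
func deleteGateway(c *fakeCluster, name string) error {
	c.deleteGatewayObject(name, 20*time.Millisecond)
	return waitForGatewayDeploymentDeletion(c, name, 5*time.Millisecond, time.Second)
}

// runTest owns its Gateway and cleans it up with defer, so cleanup does not
// depend on state held by another worker process.
func runTest(c *fakeCluster, name string) (err error) {
	c.createGateway(name)
	defer func() {
		// A cleanup failure is surfaced as a test failure.
		if derr := deleteGateway(c, name); derr != nil && err == nil {
			err = derr
		}
	}()
	return nil // the test body would go here
}

func main() {
	c := newFakeCluster()
	err := runTest(c, "test-gateway")
	fmt.Printf("err=%v deploymentLeft=%v\n", err, c.deploymentExists("test-gateway"))
}
```

Because each test deletes what it created and waits for the deployment to vanish, the later GatewayClass and istiod removal no longer races with proxies that are still running.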
Cherry-picked from: e29073f openshift#31023
Cherry-picked from: ca41c36 openshift#31023
Add retry logic to markTestDone to handle optimistic locking conflicts when updating GatewayClass annotations. The CIO actively manages the GatewayClass (updating conditions, status, finalizers) which can cause 409 Conflict errors when tests try to update annotations. Using RetryOnConflict ensures the test automatically retries with the latest resourceVersion when concurrent updates occur.

Fixes flake: Operation cannot be fulfilled on gatewayclasses.gateway.networking.k8s.io "openshift-default": the object has been modified; please apply your changes to the latest version and try again

https://redhat.atlassian.net/browse/OCPBUGS-81751
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-picked from: 8e4e43a openshift#30964
|
|
@gcs278: This pull request references Jira Issues OCPBUGS-88295, OCPBUGS-82146, OCPBUGS-78330, and OCPBUGS-85550, each of which is invalid. Each bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: gcs278. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files: Approvers can indicate their approval by writing |
|
@gcs278: No Jira issue is referenced in the title of this pull request. |
|
@gcs278: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/assign @rhamini3 |
|
@gcs278: This pull request references Jira Issues OCPBUGS-92038, OCPBUGS-92039, OCPBUGS-92040, OCPBUGS-92041, and OCPBUGS-92042, each of which is invalid. Each bug has been updated to refer to the pull request using the external bug tracker. |
|
@gcs278: This pull request references Jira Issues OCPBUGS-92038, OCPBUGS-92039, OCPBUGS-92040, OCPBUGS-92041, OCPBUGS-92042, OCPBUGS-92044, and OCPBUGS-92045, each of which is invalid.
This pull request references Jira Issue OCPBUGS-92043, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
Each bug has been updated to refer to the pull request using the external bug tracker. |
|
/jira refresh |
|
@gcs278: This pull request references Jira Issues OCPBUGS-92038, OCPBUGS-92039, OCPBUGS-92040, OCPBUGS-92041, and OCPBUGS-92042, each of which is invalid.
This pull request references Jira Issue OCPBUGS-92043, which is valid. 7 validation(s) were run on this bug. No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issues OCPBUGS-92044 and OCPBUGS-92045, which are valid. Both bugs have been moved to the POST state. 7 validation(s) were run on each bug.
Requesting review from QA contact: |
|
/testwith openshift/origin/release-4.20/e2e-gcp-ovn openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
/testwith openshift/origin/release-4.20/e2e-gcp-ovn-upgrade openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
This is a subset of payload jobs that helps prove merging these tests won't break anything: /payload-job periodic-ci-openshift-release-main-ci-4.20-e2e-aws-ovn-upgrade-out-of-change |
|
@gcs278: trigger 8 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7d1861c0-703a-11f1-904d-bead4ca4813c-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-4.20-upgrade-from-stable-4.19-e2e-gcp-ovn-upgrade openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fdb1ad50-703a-11f1-8ca6-25963c2f707b-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/02cda910-703b-11f1-8202-b9ab34f52f2b-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.19-e2e-metal-ipi-upgrade-ovn-ipv6 openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/09d8a390-703b-11f1-9aa0-93a802cd5146-0 |
|
re-running testwith jobs because of merge conflict in FG promotion PR |
|
/testwith openshift/origin/release-4.20/e2e-gcp-ovn openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
/testwith openshift/origin/release-4.20/e2e-gcp-ovn-upgrade openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-4.20-upgrade-from-stable-4.19-e2e-gcp-ovn-upgrade openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fcafd840-703b-11f1-962a-ea9f67a1d4d1-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/06111ed0-703c-11f1-940d-9b8efc8783d7-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-nightly-4.20-upgrade-from-stable-4.19-e2e-metal-ipi-upgrade-ovn-ipv6 openshift/cluster-ingress-operator#1459 openshift/api#2869 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/10ca2d30-703c-11f1-9e9a-43d6051ce295-0 |
Summary
Backport of Gateway API noOLM (Sail Library) test coverage and upgrade tests to release-4.20, as part of the Sail Library backport (NE-2286). This provides full test coverage for the GatewayAPIWithoutOLM feature gate, including OLM-to-Sail-Library migration upgrade testing, test flake fixes, and parallel worker cleanup fixes.
This is an identical backport to the 4.21 PR: #31232
Depends on #31262
Cherry-picked PRs