-
Notifications
You must be signed in to change notification settings - Fork 51
NO-JIRA: capiinstaller: keep caBundle for CRDs and conversion webhooks #423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NO-JIRA: capiinstaller: keep caBundle for CRDs and conversion webhooks #423
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
WalkthroughAdds CRD-specific reconciliation: a new helper to create/update CustomResourceDefinitions (preserving injected CA bundles, computing diffs, logging JSON patches, and reporting events) and controller changes to extract, parse, and apply CRD manifests before other provider components. Changes
Sequence Diagram(s)sequenceDiagram
participant Controller as Provider Installer Controller
participant Extractor as Manifest Extractor
participant Parser as YAML Parser
participant CRDHelper as CRD Helper\n(applyCustomResourceDefinitionV1Improved)
participant K8sAPI as Kubernetes API\n(CustomResourceDefinitions)
participant Recorder as Event Recorder
Controller->>Extractor: getProviderComponents(scheme, components)
Extractor-->>Controller: crdFilenames[], crdAssets{}, other components...
loop for each CRD manifest
Controller->>Parser: parse CRD YAML → apiextensionsv1.CustomResourceDefinition
Parser-->>Controller: CRD object
Controller->>CRDHelper: applyCustomResourceDefinitionV1Improved(ctx, client, recorder, crd)
CRDHelper->>K8sAPI: GET CustomResourceDefinition
alt not found
CRDHelper->>K8sAPI: CREATE CustomResourceDefinition
CRDHelper->>Recorder: event "Created"
else exists
CRDHelper->>CRDHelper: ensure CA bundle (if inject-cabundle annotated)
CRDHelper->>CRDHelper: resourcemerge.EnsureCustomResourceDefinitionV1 (compute changes)
alt changes detected
CRDHelper->>K8sAPI: PATCH/UPDATE CustomResourceDefinition
CRDHelper->>Recorder: event "Updated"
CRDHelper-->>CRDHelper: log JSON patch (v=2)
else no changes
CRDHelper->>Recorder: no event
end
end
CRDHelper-->>Controller: result / error
end
Controller->>Controller: apply remaining components (deployments, static)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
fd4e2a8 to
cb1b7b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
pkg/controllers/capiinstaller/apiextensions.go (1)
32-64: Function name typo:appplyCustomResourceDefinitionV1ImprovedThe helper’s logic looks consistent with the usual Apply* pattern (create on NotFound, merge + Update on change, patch logging and events), but the name has a triple “p”. Consider renaming to
applyCustomResourceDefinitionV1Improved(and updating the call sites) to keep things readable and grep‑friendly.pkg/controllers/capiinstaller/capi_installer_controller.go (1)
257-295: getProviderComponents return signature is getting unwieldyWith CRDs added,
getProviderComponentsnow returns six separate collections plus an error:func getProviderComponents(...) ( []string, map[string]string, // components []string, map[string]string, // deployments []string, map[string]string, // CRDs error, )This is still functional, but it’s becoming hard to read and easy to mix up ordering at call sites. Consider a small struct to group these maps/slices by role (components / deployments / CRDs) so the call site can be named-field based instead of positional.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/capiinstaller/apiextensions.go(1 hunks)pkg/controllers/capiinstaller/capi_installer_controller.go(3 hunks)
| // copyCRDConversionCABundle populates webhooks[].clientConfig.caBundle fields from existing resource if it was set before | ||
| // and is not set in present. This provides upgrade compatibility with service-ca-bundle operator. | ||
| func copyCRDConversionCABundle(from, to *apiextensionsv1.CustomResourceDefinition) { | ||
| if from.Spec.Conversion == nil || from.Spec.Conversion.Webhook == nil || from.Spec.Conversion.Webhook.ClientConfig == nil || from.Spec.Conversion.Webhook.ClientConfig.CABundle == nil { | ||
| return | ||
| } | ||
|
|
||
| if to.Spec.Conversion == nil || to.Spec.Conversion.Webhook == nil || to.Spec.Conversion.Webhook.ClientConfig == nil { | ||
| return | ||
| } | ||
|
|
||
| to.Spec.Conversion.Webhook.ClientConfig.CABundle = from.Spec.Conversion.Webhook.ClientConfig.CABundle | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid overwriting explicitly configured CRD conversion caBundle
The comment says “populate … if it was set before and is not set in present”, but the implementation currently overwrites to.Spec.Conversion.Webhook.ClientConfig.CABundle unconditionally whenever the target has a non‑nil ClientConfig. That will clobber any caBundle explicitly defined in the new manifest.
To only backfill when the new spec doesn’t set a bundle, gate on the target field as well:
func copyCRDConversionCABundle(from, to *apiextensionsv1.CustomResourceDefinition) {
- if from.Spec.Conversion == nil || from.Spec.Conversion.Webhook == nil || from.Spec.Conversion.Webhook.ClientConfig == nil || from.Spec.Conversion.Webhook.ClientConfig.CABundle == nil {
- return
- }
-
- if to.Spec.Conversion == nil || to.Spec.Conversion.Webhook == nil || to.Spec.Conversion.Webhook.ClientConfig == nil {
- return
- }
-
- to.Spec.Conversion.Webhook.ClientConfig.CABundle = from.Spec.Conversion.Webhook.ClientConfig.CABundle
+ if from.Spec.Conversion == nil ||
+ from.Spec.Conversion.Webhook == nil ||
+ from.Spec.Conversion.Webhook.ClientConfig == nil ||
+ from.Spec.Conversion.Webhook.ClientConfig.CABundle == nil {
+ return
+ }
+
+ if to.Spec.Conversion == nil ||
+ to.Spec.Conversion.Webhook == nil ||
+ to.Spec.Conversion.Webhook.ClientConfig == nil {
+ return
+ }
+
+ // Only backfill when the new manifest did not specify a CABundle itself.
+ if to.Spec.Conversion.Webhook.ClientConfig.CABundle == nil {
+ to.Spec.Conversion.Webhook.ClientConfig.CABundle = from.Spec.Conversion.Webhook.ClientConfig.CABundle
+ }
}(If you care about the “empty slice vs nil slice” distinction for []byte, you may also want to treat len(...) == 0 as “not set”.)
🤖 Prompt for AI Agents
In pkg/controllers/capiinstaller/apiextensions.go around lines 66 to 78, the
function currently unconditionally copies the CA bundle from the existing CRD
into the new CRD when ClientConfig exists, which can overwrite an explicitly
provided caBundle in the new manifest; change the logic to only backfill the
bundle when the target bundle is not set by checking
to.Spec.Conversion.Webhook.ClientConfig.CABundle == nil (and optionally treat
len(...) == 0 as not set for []byte) before assigning
from.Spec.Conversion.Webhook.ClientConfig.CABundle, keeping all existing nil
checks for nested structs.
|
/pipeline required |
|
Scheduling tests matching the |
|
/testwith openshift/cluster-api-provider-aws/main/regression-clusterinfra-aws-ipi-techpreview-capi openshift/cluster-api-provider-aws#582 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
pkg/controllers/capiinstaller/apiextensions.go (1)
89-94: Consider preserving explicitly set CABundle in new manifestWhen the new manifest explicitly sets a
CABundle, line 93 unconditionally overwrites it with the existing value. While this follows CVO behavior for service-ca managed bundles, consider adding a guard to only backfill when the new manifest doesn't specify a bundle:exc := existing.Spec.Conversion if exc != nil && exc.Webhook != nil && exc.Webhook.ClientConfig != nil { - req.Webhook.ClientConfig.CABundle = exc.Webhook.ClientConfig.CABundle + // Only backfill when the new manifest did not specify a CABundle itself. + if len(req.Webhook.ClientConfig.CABundle) == 0 { + req.Webhook.ClientConfig.CABundle = exc.Webhook.ClientConfig.CABundle + } }pkg/controllers/capiinstaller/capi_installer_controller.go (1)
185-203: Critical: CRD apply errors are silently ignoredThe return value from
appplyCustomResourceDefinitionV1Improvedis discarded, so CRD creation/update failures won't propagate. Additionally, line 199 uses%wto wraperrwhich isnilat that point (since parsing succeeded), producing a misleading error message.for _, c := range customResourceDefinitionFilenames { customResourceDefinitionManifest, ok := customResourceDefinitionAssets[c] if !ok { - panic("error finding custom resource definition manifest") + return fmt.Errorf("error finding custom resource definition manifest %q", c) } obj, err := yamlToRuntimeObject(r.Scheme, customResourceDefinitionManifest) if err != nil { return fmt.Errorf("error parsing CAPI provider CRD manifest %q: %w", c, err) } crd, ok := obj.(*apiextensionsv1.CustomResourceDefinition) if !ok { - return fmt.Errorf("error casting object to CustomResourceDefinition: %w", err) + return fmt.Errorf("error casting object to CustomResourceDefinition: got %T", obj) } - appplyCustomResourceDefinitionV1Improved(ctx, r.APIExtensionsClient.ApiextensionsV1(), events.NewInMemoryRecorder("cluster-capi-operator-capi-installer-apply-client", clock.RealClock{}), crd) + if _, _, err := applyCustomResourceDefinitionV1Improved(ctx, r.APIExtensionsClient.ApiextensionsV1(), events.NewInMemoryRecorder("cluster-capi-operator-capi-installer-apply-client", clock.RealClock{}), crd); err != nil { + return fmt.Errorf("error applying CAPI provider CRD %q: %w", crd.Name, err) + } }Note: Also update the function call to use the corrected name
applyCustomResourceDefinitionV1Improvedafter fixing the typo in apiextensions.go.
🧹 Nitpick comments (1)
pkg/controllers/capiinstaller/capi_installer_controller.go (1)
257-257: Consider using a struct for return valuesThe function now returns 7 values which is unwieldy. Consider grouping related values into a struct for clarity:
type providerComponentsResult struct { componentsFilenames []string componentsAssets map[string]string deploymentsFilenames []string deploymentsAssets map[string]string crdFilenames []string crdAssets map[string]string }This is optional since the current pattern follows existing codebase conventions.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/capiinstaller/apiextensions.go(1 hunks)pkg/controllers/capiinstaller/capi_installer_controller.go(3 hunks)
| modified := false | ||
| existingCopy := existing.DeepCopy() | ||
|
|
||
| ensureCustomResourceDefinitionV1CaBundle(required, *existing) | ||
|
|
||
| resourcemerge.EnsureCustomResourceDefinitionV1(&modified, existingCopy, *required) | ||
| if !modified { | ||
| return existing, false, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Input mutation: required is modified as a side effect
ensureCustomResourceDefinitionV1CaBundle mutates the required parameter directly (line 49), which could cause unexpected behavior if the caller reuses the object. Consider making a deep copy of required before mutation, similar to how existingCopy is handled.
modified := false
existingCopy := existing.DeepCopy()
+ requiredCopy := required.DeepCopy()
- ensureCustomResourceDefinitionV1CaBundle(required, *existing)
+ ensureCustomResourceDefinitionV1CaBundle(requiredCopy, *existing)
- resourcemerge.EnsureCustomResourceDefinitionV1(&modified, existingCopy, *required)
+ resourcemerge.EnsureCustomResourceDefinitionV1(&modified, existingCopy, *requiredCopy)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| modified := false | |
| existingCopy := existing.DeepCopy() | |
| ensureCustomResourceDefinitionV1CaBundle(required, *existing) | |
| resourcemerge.EnsureCustomResourceDefinitionV1(&modified, existingCopy, *required) | |
| if !modified { | |
| return existing, false, nil | |
| } | |
| modified := false | |
| existingCopy := existing.DeepCopy() | |
| requiredCopy := required.DeepCopy() | |
| ensureCustomResourceDefinitionV1CaBundle(requiredCopy, *existing) | |
| resourcemerge.EnsureCustomResourceDefinitionV1(&modified, existingCopy, *requiredCopy) | |
| if !modified { | |
| return existing, false, nil | |
| } |
🤖 Prompt for AI Agents
In pkg/controllers/capiinstaller/apiextensions.go around lines 46-54,
ensureCustomResourceDefinitionV1CaBundle mutates the required object passed in;
to avoid side effects, make a deep copy of required (e.g., reqCopy :=
required.DeepCopy()) and use that copy when calling
ensureCustomResourceDefinitionV1CaBundle and the subsequent
resourcemerge.EnsureCustomResourceDefinitionV1 call so the original required
remains unchanged.
cb1b7b8 to
13c7f16
Compare
|
/pipeline required |
|
Scheduling tests matching the |
|
/testwith abort |
|
/testwith openshift/cluster-api-provider-aws/main/regression-clusterinfra-aws-ipi-techpreview-capi openshift/cluster-api-provider-aws#582 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/controllers/capiinstaller/capi_installer_controller.go (1)
255-295: Component classification and new CRD returns look good; consider updating the doc commentThe extended
getProviderComponentssignature and the new CRD-specific classification byKind == "CustomResourceDefinition"are sensible and integrate cleanly with the CRD-first apply path.Minor nit: the function comment still only mentions Deployments as being handled separately. Consider updating it to also mention CRDs being split into their own filename/asset collections for the CRD-specific apply.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/capiinstaller/apiextensions.go(1 hunks)pkg/controllers/capiinstaller/capi_installer_controller.go(3 hunks)
🔇 Additional comments (3)
pkg/controllers/capiinstaller/apiextensions.go (2)
32-64: CRD apply helper looks correct and aligned with existing apply patternsCreate vs. update flow, use of
WithCleanLabelsAndAnnotations/EnsureCustomResourceDefinitionV1, JSON patch logging, and event reporting are all consistent and avoid obvious reconciliation pitfalls. I don’t see functional issues here.
70-95: CABundle preservation logic is appropriate for service-ca–managed CRDsThe
ensureCustomResourceDefinitionV1CaBundlehelper correctly gates on the inject-cabundle annotation and webhook conversion strategy, then copies the existingCABundleinto the desired spec. This should prevent the operator from fighting the service-ca controller and resolve the CRD conversion webhook reconciliation loop.pkg/controllers/capiinstaller/capi_installer_controller.go (1)
180-203: Propagate CRD apply errors and fix the type assertion error messageTwo issues in this CRD loop:
- The result of
applyCustomResourceDefinitionV1Improvedis ignored, so a failed CRD create/update won't surface andapplyProviderComponentsmay continue and report overall success.- The type assertion error uses
%wwitherr, which may not be the appropriate error to wrap depending on the control flow.Suggest capturing the apply result and verifying the error handling is consistent with deployment and static component apply paths.
13c7f16 to
64d1b27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
pkg/controllers/capiinstaller/capi_installer_controller.go (1)
179-207: Fix CRD type assertion error message and consider avoiding panic on missing CRD assetThe new CRD-specific apply flow looks correct overall (classification → decode →
*apiextensionsv1.CustomResourceDefinitionassert →applyCustomResourceDefinitionV1Improvedwith error propagation). Two nits here:
In the type assertion failure path (Line 200), you format the error as:
return fmt.Errorf("error casting object to CustomResourceDefinition: %w", err)At this point
erris guaranteed to be the previous decode error (already checked) and thusnil, so the%wwrapping is misleading and doesn’t surface what actually went wrong (i.e., the runtime type). Prefer something like:return fmt.Errorf("error casting object to *apiextensionsv1.CustomResourceDefinition: got %T", obj)This makes debugging bad/unexpected manifests much easier. This also addresses the remaining part of the earlier review about the cast error message.
For the missing-asset case (Lines 189–192), you
panic("error finding custom resource definition manifest"). This mirrors the existing deployment path, but a panic here will crash the operator if there’s a bad provider bundle. It would be more robust to return a regular error instead and let reconcile mark the controller as Degraded, e.g.:if !ok { return fmt.Errorf("error finding custom resource definition manifest for %q", c) }Not strictly required for this PR but worth considering for operator resilience.
🧹 Nitpick comments (1)
pkg/controllers/capiinstaller/capi_installer_controller.go (1)
261-300: Consider a struct return type for getProviderComponents to reduce positional return complexity
getProviderComponentsnow returns six values plus an error:([]string, map[string]string, []string, map[string]string, []string, map[string]string, error)This is getting hard to read and easy to misuse if the ordering ever changes. A small follow-up refactor to introduce a result struct (e.g.
providerComponentswithStatic,Deployments,CRDsfields for filenames/assets) would improve clarity and make call sites more self-documenting, without changing behavior.Given this is internal and the current usage is correct, this can be deferred to a separate cleanup PR.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/capiinstaller/apiextensions.go(1 hunks)pkg/controllers/capiinstaller/capi_installer_controller.go(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/controllers/capiinstaller/apiextensions.go
|
/pipeline required |
|
Scheduling tests matching the |
|
/override ci/prow/e2e-openstack-ovn-techpreview Job is permafailing and failure is unrelated to this change. |
|
@chrischdi: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/label acknowledge-critical-fixes-only This fixes a critical issue of infinite reconciliation on CRDs which have a conversion webhook and a caBundle set. |
|
@damdo: Overrode contexts on behalf of damdo: ci/prow/e2e-openstack-ovn-techpreview DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@chrischdi: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
damdo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/lgtm
|
Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: damdo The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/verified manually @huali9 was so kind and did set up a test cluster with this PR. The kube-apiserver failures in the logs are gone and also capa's logs were clean with this. Also the CRD's did not get updated indefinetly anymore. |
|
@chrischdi: The DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/verified by @chrischdi @huali9 was so kind and did set up a test cluster with this PR. The kube-apiserver failures in the logs are gone and also capa's logs were clean with this. Also the CRD's did not get updated indefinetly anymore. Also inspected the latest green jobs on this PR. |
|
@chrischdi: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
377ad99
into
openshift:main
Solves the issue of infinite reconciliation on CRDs which have a conversion webhook and a caBundle set.
CCAPIO unsets the cabundle which breaks the conversion webhook until the
openshift-service-cacontroller sets it again and CCAPIO again unsets it, ...Once openshift/library-go#2064 merged, we can make use of that function instead.
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.