fix(operator): move defaults to admission webhook#194
Conversation
There was a problem hiding this comment.
Code Review
This pull request transitions the application and persistence of default values from the operator's reconciliation loop to Kubernetes admission webhooks. It introduces a mechanism to pass environment-specific defaults to the webhooks via a new options structure and updates the reconciler to use in-memory defaulting for legacy objects. Additionally, the CLI setup process is enhanced to manage webhook TLS certificates and manifest injection. Review feedback identifies a missing yaml import in the setup package, suggests extending the CA certificate's lifespan for better rotation practices, and recommends using reliable path resolution for reading configuration files.
|
|
||
| func injectOperatorWebhookCABundle(webhookYAML, caBundlePEM []byte) ([]byte, error) { | ||
| caBundle := base64.StdEncoding.EncodeToString(caBundlePEM) | ||
| decoder := yaml.NewDecoder(bytes.NewReader(webhookYAML)) |
| caTemplate := x509.Certificate{ | ||
| SerialNumber: caSerialNumber, | ||
| Subject: pkix.Name{ | ||
| CommonName: operatorWebhookServiceName + "-ca", | ||
| }, | ||
| NotBefore: now.Add(-1 * time.Hour), | ||
| NotAfter: now.AddDate(1, 0, 0), | ||
| KeyUsage: x509.KeyUsageCertSign | x509.KeyUsageCRLSign, | ||
| BasicConstraintsValid: true, | ||
| IsCA: true, | ||
| } |
There was a problem hiding this comment.
The CA certificate and the serving certificate are configured with the same expiration date (1 year from now). It is a best practice for a Certificate Authority to have a longer lifetime than the certificates it signs to ensure continuity during rotation. Consider increasing the NotAfter duration for the caTemplate.
References
- Certificates signed by a CA should generally have a shorter lifespan than the CA itself to avoid simultaneous expiration issues.
| serviceYAML, err := os.ReadFile("config/webhook/service.yaml") | ||
| if err != nil { | ||
| return fmt.Errorf("read operator webhook service manifest: %w", err) | ||
| } | ||
| if err := kube.ApplyManifestContent(kubectl.CommandArgs, string(serviceYAML)); err != nil { | ||
| return err | ||
| } | ||
|
|
||
| webhookYAML, err := os.ReadFile("config/webhook/manifests.yaml") |
There was a problem hiding this comment.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 907caeb45e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if requeue { | ||
| return ctrl.Result{}, nil | ||
| } | ||
| mcpServer = r.defaultedMCPServerForReconcile(mcpServer) |
There was a problem hiding this comment.
Reinstate reconcile-time validation fallback
The reconcile loop now defaults the object and proceeds without any MCPServer spec validation, but webhook registration is still conditional on MCP_ENABLE_WEBHOOKS (cmd/operator/main.go). In installs where that env var is unset/false (for example config/default style deployments or older operator pods), malformed CRs such as OAuth without required issuer fields or invalid rollout values bypass admission and reach reconciliation, causing late runtime/deployment failures instead of deterministic API rejection. Please keep controller-side validation as a fallback (or hard-require webhooks at startup) so non-webhook environments don’t silently accept invalid specs.
Useful? React with 👍 / 👎.
abc9f39 to
2cd2617
Compare
Make ensureOperatorWebhookTLSSecret idempotent: store ca.crt in the webhook Secret, read it back on both the kubectl and client-go deploy paths, and keep the existing CA + serving certificate when the chain still verifies for the webhook service DNS name and stays valid beyond a 30-day renewal window. Rotation now only happens on first install, near expiry, or when the Secret is missing/malformed, so setup re-runs no longer roll the operator and open a fail-closed admission window. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Persist CA key for serving cert renewal, set mutating webhook failurePolicy to Ignore, warn on wildcard grants, wait for rollout before caBundle patch, and add doctor expiry warning within the renewal window. Co-authored-by: Cursor <cursoragent@cursor.com>
|
Pushed review fixes (also on `fix/pr-194-review`):
|
Co-authored-by: Cursor <cursoragent@cursor.com>
Rebuilt on current
mainafter the setup package refactor.internal/cli/setup/platform/Validation:
bash test/e2e/scenarios_test.shFixes #18.