[wip]OCPKUEUE-668: Add operator-upgrade-test Claude Code skill#2043
[wip]OCPKUEUE-668: Add operator-upgrade-test Claude Code skill#2043anahas-redhat wants to merge 1 commit into
Conversation
|
@anahas-redhat: This pull request references OCPKUEUE-668 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
WalkthroughAdds ChangesOperator Upgrade Test Skill
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 15✅ Passed checks (15 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: anahas-redhat The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hey @sohankunkerkar @kannon92 — PR for a Claude Code skill that automates operator upgrade testing. Would love your feedback on the approach. |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
.claude/skills/operator-upgrade-test/SKILL.md (2)
206-210: 💤 Low valueAdd markdown language specifier to fenced code block (MD040).
Line 207 starts a fenced code block without a language specifier. While this renders correctly, it prevents syntax highlighting and linter validation.
📝 Proposed fix
+```bash
=== Kueue Upgrade Test Configuration ===This enables syntax highlighting and bash-specific linting. </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In @.claude/skills/operator-upgrade-test/SKILL.md around lines 206 - 210, The
fenced code block starting with the "=== Kueue Upgrade Test Configuration ==="
content is missing a language specifier after the opening triple backticks. Add
"bash" as the language specifier immediately after the openingto enable syntax highlighting and bash-specific linting validation. This means changing the opening fence fromto ```bash.</details> <!-- cr-comment:v1:624bb03863cea42f25620591 --> _Source: Linters/SAST tools_ --- `1-50`: _⚡ Quick win_ **Verify pull secret and SSH key handling is secure in all phases.** The skill collects pull secrets and SSH keys from files and embeds them in install-config.yaml and API objects. Ensure that: 1. install-config.yaml file is not committed or persisted longer than the cluster creation 2. No credentials appear in logs or error messages from Phases 1-8 3. Pull secret and SSH key values are not printed in results document (Phase 8) Add a note in Phase 9 (cluster cleanup) to remove the install-config.yaml file and any temporary credential files. <details> <summary>🔒 Suggested addition to Phase 9 (cluster cleanup)</summary> ```bash # Clean up sensitive files rm -f <work-dir>/<cluster-name>/install-config.yaml rm -f <work-dir>/<cluster-name>/install-config.yaml.bak rm -f /tmp/openshift-install-*.tar.gz* ``` Also ensure the results document (Phase 8) does not include: - Pull secret contents - SSH key contents - GCP service account key paths - Any other credential material ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.claude/skills/operator-upgrade-test/SKILL.md around lines 1 - 50, The skill must ensure that sensitive credentials (pull secrets, SSH keys, and GCP service account keys) are not exposed in logs, error messages, or the results document. Add explicit cleanup instructions to Phase 9 to remove the install-config.yaml file and any temporary credential files created during cluster creation. In Phase 8 (results document generation), explicitly filter out and document which credential values must not be included. Additionally, review all log statements and error handling in Phases 1-8 to ensure that when install-config.yaml or credential variables are referenced, their actual contents are never printed or logged—only their file paths or variable names should appear in output. ``` </details> <!-- cr-comment:v1:f9613ab14fca752abee56726 --> _Source: Coding guidelines_ </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.Inline comments:
In @.claude/skills/operator-upgrade-test/SKILL.md:
- Around line 477-498: Replace the three hardcoded
/releases/latest/download/
URLs for cert-manager (line 479), JobSet (line 488), and LeaderWorkerSet (line
- with pinned version numbers instead of "latest". Define version variables
at the top of the document for cert-manager, jobset, and lws, then use those
variables in the oc apply commands. Additionally, add checksum verification
after downloading each manifest by comparing against published checksums, or
consider including the pinned manifests directly in the skill repository to
eliminate the need for remote downloads entirely. Document the version pinning
strategy at the beginning of Phase 0 or Phase 3 to explain how these specific
versions are selected and maintained.
- Around line 1-100: The bundle image discovery in Phase 0e clones the kueue-fbc
repository and queries the quay.io API without pinning git revisions or
validating responses, creating a supply chain risk. Pin the git clone operation
to a specific commit hash or version tag instead of using--depth 1without a
ref, and add response validation for the quay.io API queries by comparing
against a known-good manifest or checksum before using the discovered bundle
image references. This ensures that even if the remote repository or API is
compromised, the test will detect the tampering and fail safely rather than
deploying a malicious bundle.- Around line 261-270: The openshift-install binary download section lacks
checksum verification, creating a security vulnerability where a compromised or
intercepted binary could be executed without validation. Add a step to download
the sha256sum.txt file from the same mirror location as the
openshift-install-mac-arm64-.tar.gz file, then verify the
downloaded tar.gz file against the published checksum using sha256sum before
extracting and executing it. The verification should be performed immediately
after the curl download and before the tar extraction step, ensuring the binary
integrity is confirmed before any execution occurs.- Around line 140-155: The IAM role assignment in the service account
auto-creation block grants 9 roles including iam.securityAdmin and
iam.roleAdmin, which exceed the least-privilege requirement for OpenShift
installer. Remove the unnecessary IAM roles from the for loop and retain only
compute.admin, dns.admin, compute.loadBalancerAdmin, and storage.admin.
Additionally, add a flag variable (such as CREATED_SA) to track whether the
service account was auto-created, then reference this flag in Phase 9 to add
cleanup logic that deletes the service account and its keys using gcloud iam
service-accounts delete when cluster deletion occurs.
Nitpick comments:
In @.claude/skills/operator-upgrade-test/SKILL.md:
- Around line 206-210: The fenced code block starting with the "=== Kueue
Upgrade Test Configuration ===" content is missing a language specifier after
the opening triple backticks. Add "bash" as the language specifier immediately
after the openingto enable syntax highlighting and bash-specific linting validation. This means changing the opening fence fromto ```bash.- Around line 1-50: The skill must ensure that sensitive credentials (pull
secrets, SSH keys, and GCP service account keys) are not exposed in logs, error
messages, or the results document. Add explicit cleanup instructions to Phase 9
to remove the install-config.yaml file and any temporary credential files
created during cluster creation. In Phase 8 (results document generation),
explicitly filter out and document which credential values must not be included.
Additionally, review all log statements and error handling in Phases 1-8 to
ensure that when install-config.yaml or credential variables are referenced,
their actual contents are never printed or logged—only their file paths or
variable names should appear in output.</details> <details> <summary>🪄 Autofix (Beta)</summary> Fix all unresolved CodeRabbit comments on this PR: - [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended) - [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes </details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Repository YAML (base), Central YAML (inherited) **Review profile**: CHILL **Plan**: Enterprise **Run ID**: `6afaee1d-3b71-4517-8d6a-f43cda5119ee` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between fbfdf061d0b6fa1356191dd1d1062629ce315847 and 1b0d23420efb4b8a662658b0575c49e1481f712c. </details> <details> <summary>📒 Files selected for processing (1)</summary> * `.claude/skills/operator-upgrade-test/SKILL.md` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
Automates end-to-end Kueue operator upgrade testing (Part 2 of OCPKUEUE-668). Covers both uninstall scenarios (operator-only and full CR deletion), config migration checks, and results document generation. Implemented as a Claude Code skill — a natural language runbook that adapts to version-specific differences (schema changes, new fields, API deprecations) without requiring scripted logic for each case. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
a7e8e3f to
708d292
Compare
|
@anahas-redhat: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
.claude/skills/operator-upgrade-test/SKILL.md) that automates end-to-end Kueue operator upgrade testing (Part 2 of OCPKUEUE-668)What the skill does
Test plan
Jira: https://redhat.atlassian.net/browse/OCPKUEUE-668
🤖 Generated with Claude Code
Summary by CodeRabbit
Tests
Documentation