get secure boot MCO GCP cluster#80817
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughA Prow job identifier in the 4.22 amd64-nightly CI config is renamed and expanded by appending ChangesJob Configuration and Debug Wait Gates
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ptalgulk01 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`:
- Around line 7-8: The export statement at line 7 prepends the writable /tmp
directory to PATH before trusted system paths, creating a security vulnerability
for command shadowing. Change the PATH export statement to append /tmp to the
end of the PATH variable instead (use $PATH:/tmp/) so that trusted directories
are searched first before the writable /tmp directory.
- Around line 10-14: The unconditional debug gate using the while loop that
waits for the `/tmp/continue` file can cause CI runs to deadlock indefinitely.
To fix this, wrap the entire while loop block (the echo statement and the
do-done loop checking for `/tmp/continue`) with a conditional check for an
environment variable to make the debug gate opt-in (e.g., check if DEBUG_MODE or
DEBUG_GATE is set). Additionally, add a bounded timeout mechanism to prevent
infinite waiting—either count iterations of the sleep command or use a timeout
wrapper so that if the debug flag is enabled but the continue file is never
created, the script will proceed after a reasonable duration rather than hanging
until the job times out. This ensures regular CI runs proceed normally while
allowing debugging only when explicitly enabled with a safety timeout.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`:
- Around line 164-165: The export statement on line 164 prepends /tmp to PATH,
creating a command-hijacking vulnerability. Remove the export PATH=/tmp/:$PATH
line entirely, or if /tmp is genuinely needed, append it to the end of PATH
instead of prepending it (export PATH=$PATH:/tmp). This ensures system paths are
searched first before the world-writable /tmp directory when resolving the
extended-platform-tests command in the which call on line 165.
- Around line 167-171: The debug wait loop checking for /tmp/continue is
unconditional and will block indefinitely in standard test runs until timeout.
Gate the entire block containing the echo "WAITING FOR DEBUG..." message and the
while loop with sleep 10 behind a debug environment variable check, and add a
maximum wait duration (such as a counter that limits iterations or a timeout
mechanism) to prevent indefinite blocking. This ensures the wait loop only
activates when debugging is explicitly enabled and will fail fast after a
reasonable timeout instead of stalling the job.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 35e03e43-12f3-4702-9e25-33a9e6528714
⛔ Files ignored due to path filters (1)
ci-operator/jobs/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22-periodics.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (3)
ci-operator/config/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22__amd64-nightly.yamlci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.shci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh
| echo "WAITING FOR DEBUG..." | ||
| while [ ! -f "/tmp/continue" ] | ||
| do | ||
| sleep 10 | ||
| done |
There was a problem hiding this comment.
Unconditional debug gate can deadlock longduration jobs.
Lines [10]-[14] block forever unless /tmp/continue is manually created, so regular CI runs can hang until timeout. Make this gate opt-in (env flag) and add a bounded timeout.
Suggested fix
-echo "WAITING FOR DEBUG..."
-while [ ! -f "/tmp/continue" ]
-do
- sleep 10
-done
+if [[ "${DEBUG_HOLD:-false}" == "true" ]]; then
+ echo "WAITING FOR DEBUG (max ${DEBUG_HOLD_TIMEOUT_SECONDS:-900}s)..."
+ timeout "${DEBUG_HOLD_TIMEOUT_SECONDS:-900}" bash -c 'until [[ -f /tmp/continue ]]; do sleep 10; done'
+fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo "WAITING FOR DEBUG..." | |
| while [ ! -f "/tmp/continue" ] | |
| do | |
| sleep 10 | |
| done | |
| if [[ "${DEBUG_HOLD:-false}" == "true" ]]; then | |
| echo "WAITING FOR DEBUG (max ${DEBUG_HOLD_TIMEOUT_SECONDS:-900}s)..." | |
| timeout "${DEBUG_HOLD_TIMEOUT_SECONDS:-900}" bash -c 'until [[ -f /tmp/continue ]]; do sleep 10; done' | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`
around lines 10 - 14, The unconditional debug gate using the while loop that
waits for the `/tmp/continue` file can cause CI runs to deadlock indefinitely.
To fix this, wrap the entire while loop block (the echo statement and the
do-done loop checking for `/tmp/continue`) with a conditional check for an
environment variable to make the debug gate opt-in (e.g., check if DEBUG_MODE or
DEBUG_GATE is set). Additionally, add a bounded timeout mechanism to prevent
infinite waiting—either count iterations of the sleep command or use a timeout
wrapper so that if the debug flag is enabled but the continue file is never
created, the script will proceed after a reasonable duration rather than hanging
until the job times out. This ensures regular CI runs proceed normally while
allowing debugging only when explicitly enabled with a safety timeout.
| echo "WAITING FOR DEBUG..." | ||
| while [ ! -f "/tmp/continue" ] | ||
| do | ||
| sleep 10 | ||
| done |
There was a problem hiding this comment.
Debug wait loop is a blocking failure path in standard test runs.
Lines [167]-[171] create an unconditional wait on /tmp/continue, which can stall jobs until timeout. Gate this behind a debug env var and enforce a max wait duration.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`
around lines 167 - 171, The debug wait loop checking for /tmp/continue is
unconditional and will block indefinitely in standard test runs until timeout.
Gate the entire block containing the echo "WAITING FOR DEBUG..." message and the
while loop with sleep 10 behind a debug environment variable check, and add a
maximum wait duration (such as a counter that limits iterations or a timeout
mechanism) to prevent indefinite blocking. This ensures the wait loop only
activates when debugging is explicitly enabled and will fail fast after a
reasonable timeout instead of stalling the job.
|
/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco |
|
@ptalgulk01: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
095775d to
4b933b6
Compare
|
/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco |
|
@ptalgulk01: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
4b933b6 to
903d2fc
Compare
|
/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco |
|
@ptalgulk01: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh (2)
7-7:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid prioritizing writable
/tmpinPATH.Line 7 puts
/tmpbefore trusted system paths, which enables command shadowing risk in CI pods.Suggested fix
-export PATH=/tmp/:$PATH +export PATH="${PATH}:/tmp"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh` at line 7, The export PATH statement on line 7 prepends `/tmp/` to the PATH, creating a command shadowing security risk by allowing malicious scripts in /tmp to override legitimate system commands. Remove the `/tmp/:` prefix from the PATH export or, if /tmp binaries are needed, append `/tmp` to the end of PATH using `export PATH=$PATH:/tmp` instead to preserve the priority of trusted system paths.
10-35:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGate the debug wait so normal runs don’t stall by default.
This block is always active and can delay every run up to 3 hours before tests start. Make it opt-in and keep the timeout only for debug sessions.
Suggested fix
-# Debug wait with timeout (default 3 hours) -DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}" -echo "==========================================" -echo "DEBUG WAIT ACTIVE - Cluster ready for debugging" -echo "==========================================" -echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds" -echo "To continue tests, create the signal file:" -echo " oc rsh <test-pod>" -echo " touch /tmp/continue" -echo "==========================================" - -elapsed=0 -while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] -do - sleep 10 - elapsed=$((elapsed + 10)) - if [ $((elapsed % 300)) -eq 0 ]; then - echo "Debug wait: $elapsed / $DEBUG_WAIT_TIMEOUT seconds elapsed..." - fi -done - -if [ -f "/tmp/continue" ]; then - echo "Continue signal received. Proceeding with tests..." -else - echo "Debug wait timeout reached. Proceeding with tests..." -fi +if [[ "${DEBUG_WAIT_ENABLED:-false}" == "true" ]]; then + DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}" + echo "==========================================" + echo "DEBUG WAIT ACTIVE - Cluster ready for debugging" + echo "==========================================" + echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds" + echo "To continue tests, create the signal file:" + echo " oc rsh <test-pod>" + echo " touch /tmp/continue" + echo "==========================================" + + elapsed=0 + while [[ ! -f "/tmp/continue" ]] && (( elapsed < DEBUG_WAIT_TIMEOUT )); do + sleep 10 + elapsed=$((elapsed + 10)) + if (( elapsed % 300 == 0 )); then + echo "Debug wait: $elapsed / $DEBUG_WAIT_TIMEOUT seconds elapsed..." + fi + done + + if [[ -f "/tmp/continue" ]]; then + echo "Continue signal received. Proceeding with tests..." + else + echo "Debug wait timeout reached. Proceeding with tests..." + fi +fi🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh` around lines 10 - 35, The debug wait block starting with the echo statement about DEBUG WAIT ACTIVE is currently always executed, which can stall normal test runs for up to 3 hours. Wrap the entire debug wait block (including all echo statements, the while loop that checks for /tmp/continue, and the final if-else condition) in a conditional check that only executes this code when an opt-in environment variable is explicitly set (such as DEBUG_WAIT=true). This ensures the debug wait mechanism only activates during intentional debug sessions while allowing normal runs to proceed without delays.ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh (2)
167-192:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake debug wait opt-in instead of always-on.
The wait gate currently runs for every invocation and can delay regular test jobs by up to 10800 seconds.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh` around lines 167 - 192, The debug wait block unconditionally executes and waits for the continue signal or timeout, which delays regular test jobs. To make this opt-in, wrap the entire debug wait logic (from the DEBUG_WAIT_TIMEOUT variable assignment through the final if-else block checking for the continue signal) in a conditional that only executes when an opt-in environment variable is explicitly set to enable debugging. Add a check at the start of this block such that the debug wait only activates when a variable like DEBUG_MODE or similar is explicitly enabled, allowing regular test runs to skip this waiting period entirely by default.
164-164:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not prepend
/tmpahead of trusted binary paths.Line 164 increases command-hijack risk by searching writable
/tmpfirst.Suggested fix
-export PATH=/tmp/:$PATH +export PATH="${PATH}:/tmp"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh` at line 164, Remove `/tmp` from being prepended to the PATH in the export statement at line 164. The current `export PATH=/tmp/:$PATH` creates a command hijacking vulnerability by allowing any executable in the world-writable `/tmp` directory to be executed before trusted system binaries. Either remove this line entirely if `/tmp` is not needed in the PATH, or if absolutely necessary, append it to the end of PATH instead using `export PATH=$PATH:/tmp/` to keep trusted system binary paths first in the search order.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`:
- Line 22: The while loop condition on line 22 uses the test bracket [ ] for
numeric comparison with the environment-derived variable $DEBUG_WAIT_TIMEOUT,
which can fail if the variable contains malformed or non-numeric values. Replace
the numeric comparison portion using arithmetic evaluation (( ... )) instead of
[ ] brackets, as arithmetic context is safer and more robust for numeric
comparisons with potentially invalid input. Keep the file existence check as-is
and combine both conditions appropriately in the while loop.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`:
- Line 179: The while loop condition uses the test command syntax [ $elapsed -lt
$DEBUG_WAIT_TIMEOUT ] which triggers shellcheck SC2086 concerns. Replace this
conditional test with bash arithmetic comparison syntax (( elapsed <
DEBUG_WAIT_TIMEOUT )) instead. Update line 179 to use arithmetic comparison
operators within double parentheses, removing the square brackets and changing
the comparison operator from -lt to <. This is safer and avoids variable
expansion issues with env-derived input.
---
Duplicate comments:
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`:
- Line 7: The export PATH statement on line 7 prepends `/tmp/` to the PATH,
creating a command shadowing security risk by allowing malicious scripts in /tmp
to override legitimate system commands. Remove the `/tmp/:` prefix from the PATH
export or, if /tmp binaries are needed, append `/tmp` to the end of PATH using
`export PATH=$PATH:/tmp` instead to preserve the priority of trusted system
paths.
- Around line 10-35: The debug wait block starting with the echo statement about
DEBUG WAIT ACTIVE is currently always executed, which can stall normal test runs
for up to 3 hours. Wrap the entire debug wait block (including all echo
statements, the while loop that checks for /tmp/continue, and the final if-else
condition) in a conditional check that only executes this code when an opt-in
environment variable is explicitly set (such as DEBUG_WAIT=true). This ensures
the debug wait mechanism only activates during intentional debug sessions while
allowing normal runs to proceed without delays.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`:
- Around line 167-192: The debug wait block unconditionally executes and waits
for the continue signal or timeout, which delays regular test jobs. To make this
opt-in, wrap the entire debug wait logic (from the DEBUG_WAIT_TIMEOUT variable
assignment through the final if-else block checking for the continue signal) in
a conditional that only executes when an opt-in environment variable is
explicitly set to enable debugging. Add a check at the start of this block such
that the debug wait only activates when a variable like DEBUG_MODE or similar is
explicitly enabled, allowing regular test runs to skip this waiting period
entirely by default.
- Line 164: Remove `/tmp` from being prepended to the PATH in the export
statement at line 164. The current `export PATH=/tmp/:$PATH` creates a command
hijacking vulnerability by allowing any executable in the world-writable `/tmp`
directory to be executed before trusted system binaries. Either remove this line
entirely if `/tmp` is not needed in the PATH, or if absolutely necessary, append
it to the end of PATH instead using `export PATH=$PATH:/tmp/` to keep trusted
system binary paths first in the search order.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6b79a8d2-526c-4aca-bab5-0f5b2277a875
⛔ Files ignored due to path filters (1)
ci-operator/jobs/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22-periodics.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (3)
ci-operator/config/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22__amd64-nightly.yamlci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.shci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/config/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22__amd64-nightly.yaml
| echo "==========================================" | ||
|
|
||
| elapsed=0 | ||
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] |
There was a problem hiding this comment.
Harden timeout comparison against malformed env input.
Line 22 uses an unquoted env-derived numeric value in [; invalid values can break the condition. Prefer arithmetic (( ... )) (or validate integer first).
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 22-22: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`
at line 22, The while loop condition on line 22 uses the test bracket [ ] for
numeric comparison with the environment-derived variable $DEBUG_WAIT_TIMEOUT,
which can fail if the variable contains malformed or non-numeric values. Replace
the numeric comparison portion using arithmetic evaluation (( ... )) instead of
[ ] brackets, as arithmetic context is safer and more robust for numeric
comparisons with potentially invalid input. Keep the file existence check as-is
and combine both conditions appropriately in the while loop.
Source: Linters/SAST tools
| echo "==========================================" | ||
|
|
||
| elapsed=0 | ||
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] |
There was a problem hiding this comment.
Use arithmetic comparison for DEBUG_WAIT_TIMEOUT.
Line 179 should avoid [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] with env-derived input; (( elapsed < DEBUG_WAIT_TIMEOUT )) is safer and avoids shellcheck SC2086 concerns.
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 179-179: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`
at line 179, The while loop condition uses the test command syntax [ $elapsed
-lt $DEBUG_WAIT_TIMEOUT ] which triggers shellcheck SC2086 concerns. Replace
this conditional test with bash arithmetic comparison syntax (( elapsed <
DEBUG_WAIT_TIMEOUT )) instead. Update line 179 to use arithmetic comparison
operators within double parentheses, removing the square brackets and changing
the comparison operator from -lt to <. This is safer and avoids variable
expansion issues with env-derived input.
Source: Linters/SAST tools
|
@ptalgulk01, If the problem persists, please contact Test Platform. |
The script already has DEBUG_WAIT_TIMEOUT with a default value of 10800 (3 hours). The timeout parameter needs to be declared in the workflow, not the job config. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco |
|
@ptalgulk01: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
- Created 3 identical jobs: mco, mco-a, mco-b with staggered schedules - Reduced COMPUTE_NODE_REPLICAS to 2 (avoid zone provisioning issues) - Added debug wait in openshift-extended-test step (3hr timeout) - Debug wait activates AFTER cluster ready, BEFORE tests start - Signal file /tmp/continue to skip wait early Jobs schedule: - mco: 2nd & 16th at 2:44 AM - mco-a: 3rd & 17th at 3:47 AM - mco-b: 4th & 18th at 4:50 AM Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
[REHEARSALNOTIFIER]
A total of 4481 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. A full list of affected jobs can be found here Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco-a periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco-b |
|
@ptalgulk01: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@ptalgulk01: job(s): either don't exist or were not found to be affected, and cannot be rehearsed |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh (1)
274-274: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueStyle: Unnecessary
$/{}in arithmetic context.Same shellcheck SC2004 style warnings as the primary script.
✨ Optional style fix
-echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds ($(($DEBUG_WAIT_TIMEOUT / 3600)) hours)" +echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds ($((DEBUG_WAIT_TIMEOUT / 3600)) hours)"- echo "Debug wait timeout reached after $(($DEBUG_WAIT_TIMEOUT / 3600)) hours. Proceeding with tests..." + echo "Debug wait timeout reached after $((DEBUG_WAIT_TIMEOUT / 3600)) hours. Proceeding with tests..."Also applies to: 301-301
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh` at line 274, The arithmetic expansion contains unnecessary dollar sign syntax triggering SC2004 shellcheck warning. In the echo statement at line 274 and its equivalent at line 301, remove the dollar sign prefix before DEBUG_WAIT_TIMEOUT within the arithmetic expansion context. Inside $((...)), variable references do not require the $ prefix, so change $(($DEBUG_WAIT_TIMEOUT / 3600)) to use just the variable name directly in the arithmetic expression.ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh (1)
280-280: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueStyle: Unnecessary
$/{}in arithmetic context.Lines 280 and 307 use
$(($DEBUG_WAIT_TIMEOUT / 3600))which works but is stylistically inconsistent. ShellCheck recommends removing the$prefix inside$((...)).✨ Optional style fix
-echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds ($(($DEBUG_WAIT_TIMEOUT / 3600)) hours)" +echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds ($((DEBUG_WAIT_TIMEOUT / 3600)) hours)"- echo "Debug wait timeout reached after $(($DEBUG_WAIT_TIMEOUT / 3600)) hours. Proceeding with tests..." + echo "Debug wait timeout reached after $((DEBUG_WAIT_TIMEOUT / 3600)) hours. Proceeding with tests..."Also applies to: 307-307
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh` at line 280, The arithmetic expansion expressions on line 280 and line 307 use an unnecessary `$` prefix before the variable name inside the `$((...))` syntax. Remove the `$` prefix from `$DEBUG_WAIT_TIMEOUT` inside the arithmetic expansion, changing `$(($DEBUG_WAIT_TIMEOUT / 3600))` to `$((DEBUG_WAIT_TIMEOUT / 3600))` on both occurrences to follow shell style conventions where variable prefixes are not needed within arithmetic expansion contexts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`:
- Line 287: In the while loop condition on the line with the timeout check,
replace the test bracket expression with unquoted variables for numeric
comparison with arithmetic evaluation syntax. Specifically, change the numeric
comparison of the elapsed variable against DEBUG_WAIT_TIMEOUT from using test
brackets `[ $elapsed -lt $DEBUG_WAIT_TIMEOUT ]` to using arithmetic evaluation
`(( elapsed < DEBUG_WAIT_TIMEOUT ))` to ensure the comparison is hardened
against malformed input.
- Around line 268-303: The DEBUG_WAIT_TIMEOUT variable in the debug wait logic
is defaulting to 10800 seconds (3 hours), which breaks backward compatibility
for all jobs using this shared step. Change the default value in the parameter
expansion from 10800 to 0 in the line
`DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}"` so that no debug wait occurs
by default. MCO job-specific configurations can then explicitly set
DEBUG_WAIT_TIMEOUT to 10800 in their environment configuration if they need the
3-hour debug wait behavior.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`:
- Line 293: The numeric comparison using the test operator [ $elapsed -lt
$DEBUG_WAIT_TIMEOUT ] is vulnerable to failures with non-numeric input in the
variables. Replace the test operator expression with arithmetic evaluation using
double parentheses (( $elapsed < $DEBUG_WAIT_TIMEOUT )) in the while loop
condition. This provides safer and more robust numeric comparison in bash
without requiring variable quoting and handles edge cases with malformed input
more gracefully.
- Around line 274-309: The DEBUG_WAIT_TIMEOUT variable defaults to 10800
seconds, causing all jobs using this shared step-registry script to block for up
to 3 hours and breaking backward compatibility across the CI infrastructure.
Change the default value of DEBUG_WAIT_TIMEOUT from 10800 to 0 to maintain
backward compatibility, ensuring the debug wait only activates when explicitly
enabled. This allows existing jobs to proceed normally without modification
while new jobs or specific test configurations can explicitly set
DEBUG_WAIT_TIMEOUT to 10800 in their environment configuration to enable the
debug wait behavior.
---
Nitpick comments:
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`:
- Line 274: The arithmetic expansion contains unnecessary dollar sign syntax
triggering SC2004 shellcheck warning. In the echo statement at line 274 and its
equivalent at line 301, remove the dollar sign prefix before DEBUG_WAIT_TIMEOUT
within the arithmetic expansion context. Inside $((...)), variable references do
not require the $ prefix, so change $(($DEBUG_WAIT_TIMEOUT / 3600)) to use just
the variable name directly in the arithmetic expression.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`:
- Line 280: The arithmetic expansion expressions on line 280 and line 307 use an
unnecessary `$` prefix before the variable name inside the `$((...))` syntax.
Remove the `$` prefix from `$DEBUG_WAIT_TIMEOUT` inside the arithmetic
expansion, changing `$(($DEBUG_WAIT_TIMEOUT / 3600))` to `$((DEBUG_WAIT_TIMEOUT
/ 3600))` on both occurrences to follow shell style conventions where variable
prefixes are not needed within arithmetic expansion contexts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 4c513e07-2203-4ff6-9d96-4fb50b0cea62
⛔ Files ignored due to path filters (1)
ci-operator/jobs/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22-periodics.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (3)
ci-operator/config/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22__amd64-nightly.yamlci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.shci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh
| # Debug wait with timeout (default 3 hours) - placed AFTER cluster is ready | ||
| DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}" | ||
| echo "==========================================" | ||
| echo "DEBUG WAIT ACTIVE - Cluster ready for debugging" | ||
| echo "==========================================" | ||
| echo "Cluster is fully provisioned and ready!" | ||
| echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds ($(($DEBUG_WAIT_TIMEOUT / 3600)) hours)" | ||
| echo "" | ||
| echo "To access the cluster:" | ||
| echo " 1. Find this pod in Prow job logs" | ||
| echo " 2. oc rsh -n test-credentials <pod-name>" | ||
| echo " 3. export KUBECONFIG=\${SHARED_DIR}/kubeconfig" | ||
| echo " 4. oc get nodes" | ||
| echo "" | ||
| echo "To continue tests, create the signal file:" | ||
| echo " touch /tmp/continue" | ||
| echo "==========================================" | ||
|
|
||
| elapsed=0 | ||
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] | ||
| do | ||
| sleep 10 | ||
| elapsed=$((elapsed + 10)) | ||
| if [ $((elapsed % 300)) -eq 0 ]; then | ||
| hours=$((elapsed / 3600)) | ||
| minutes=$(((elapsed % 3600) / 60)) | ||
| echo "Debug wait: ${hours}h ${minutes}m elapsed (${elapsed}s / ${DEBUG_WAIT_TIMEOUT}s)..." | ||
| fi | ||
| done | ||
|
|
||
| if [ -f "/tmp/continue" ]; then | ||
| echo "Continue signal received. Proceeding with tests..." | ||
| else | ||
| echo "Debug wait timeout reached after $(($DEBUG_WAIT_TIMEOUT / 3600)) hours. Proceeding with tests..." | ||
| fi | ||
|
|
There was a problem hiding this comment.
Critical: Unconditional debug wait breaks all longduration jobs using this shared step.
Identical issue to the primary test script: this shared step-registry script now has a 3-hour default wait that affects all jobs using the longduration test chain, not just the new MCO jobs. This breaks backward compatibility.
🔧 Recommended fix: Default to zero
-DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}"
+DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-0}"
+
+if [ "$DEBUG_WAIT_TIMEOUT" -eq 0 ]; then
+ echo "Debug wait disabled (DEBUG_WAIT_TIMEOUT=0)"
+else
echo "=========================================="Then set DEBUG_WAIT_TIMEOUT: "10800" explicitly in your MCO job env config.
🧰 Tools
🪛 Shellcheck (0.11.0)
[style] 274-274:
(SC2004)
[info] 287-287: Double quote to prevent globbing and word splitting.
(SC2086)
[style] 301-301:
(SC2004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`
around lines 268 - 303, The DEBUG_WAIT_TIMEOUT variable in the debug wait logic
is defaulting to 10800 seconds (3 hours), which breaks backward compatibility
for all jobs using this shared step. Change the default value in the parameter
expansion from 10800 to 0 in the line
`DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}"` so that no debug wait occurs
by default. MCO job-specific configurations can then explicitly set
DEBUG_WAIT_TIMEOUT to 10800 in their environment configuration if they need the
3-hour debug wait behavior.
| echo "==========================================" | ||
|
|
||
| elapsed=0 | ||
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] |
There was a problem hiding this comment.
Harden numeric comparison against malformed input.
Same issue as the primary script: use arithmetic evaluation (( ... )) instead of test brackets with unquoted variable.
🔧 Recommended fix
-while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ]
+while [ ! -f "/tmp/continue" ] && (( elapsed < DEBUG_WAIT_TIMEOUT ))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] | |
| while [ ! -f "/tmp/continue" ] && (( elapsed < DEBUG_WAIT_TIMEOUT )) |
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 287-287: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh`
at line 287, In the while loop condition on the line with the timeout check,
replace the test bracket expression with unquoted variables for numeric
comparison with arithmetic evaluation syntax. Specifically, change the numeric
comparison of the elapsed variable against DEBUG_WAIT_TIMEOUT from using test
brackets `[ $elapsed -lt $DEBUG_WAIT_TIMEOUT ]` to using arithmetic evaluation
`(( elapsed < DEBUG_WAIT_TIMEOUT ))` to ensure the comparison is hardened
against malformed input.
| # Debug wait with timeout (default 3 hours) - placed AFTER cluster is ready | ||
| DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}" | ||
| echo "==========================================" | ||
| echo "DEBUG WAIT ACTIVE - Cluster ready for debugging" | ||
| echo "==========================================" | ||
| echo "Cluster is fully provisioned and ready!" | ||
| echo "Timeout: ${DEBUG_WAIT_TIMEOUT} seconds ($(($DEBUG_WAIT_TIMEOUT / 3600)) hours)" | ||
| echo "" | ||
| echo "To access the cluster:" | ||
| echo " 1. Find this pod in Prow job logs" | ||
| echo " 2. oc rsh -n test-credentials <pod-name>" | ||
| echo " 3. export KUBECONFIG=\${SHARED_DIR}/kubeconfig" | ||
| echo " 4. oc get nodes" | ||
| echo "" | ||
| echo "To continue tests, create the signal file:" | ||
| echo " touch /tmp/continue" | ||
| echo "==========================================" | ||
|
|
||
| elapsed=0 | ||
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] | ||
| do | ||
| sleep 10 | ||
| elapsed=$((elapsed + 10)) | ||
| if [ $((elapsed % 300)) -eq 0 ]; then | ||
| hours=$((elapsed / 3600)) | ||
| minutes=$(((elapsed % 3600) / 60)) | ||
| echo "Debug wait: ${hours}h ${minutes}m elapsed (${elapsed}s / ${DEBUG_WAIT_TIMEOUT}s)..." | ||
| fi | ||
| done | ||
|
|
||
| if [ -f "/tmp/continue" ]; then | ||
| echo "Continue signal received. Proceeding with tests..." | ||
| else | ||
| echo "Debug wait timeout reached after $(($DEBUG_WAIT_TIMEOUT / 3600)) hours. Proceeding with tests..." | ||
| fi | ||
|
|
There was a problem hiding this comment.
Critical: Unconditional debug wait breaks all jobs using this shared step.
This step-registry script is shared across many CI jobs, not just the new MCO variants. Setting DEBUG_WAIT_TIMEOUT to default 10800 seconds (3 hours) means every job that uses the openshift-e2e-test-qe chain will now block for up to 3 hours unless someone manually creates /tmp/continue in the pod.
This breaks backward compatibility and will cause unintended delays or timeouts across the CI infrastructure.
🔧 Recommended fix: Default to zero or gate behind env flag
Option 1 (Recommended): Default to 0 for backward compatibility
-DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}"
+DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-0}"
+
+if [ "$DEBUG_WAIT_TIMEOUT" -eq 0 ]; then
+ echo "Debug wait disabled (DEBUG_WAIT_TIMEOUT=0)"
+else
echo "=========================================="
echo "DEBUG WAIT ACTIVE - Cluster ready for debugging"Then in your MCO job configs (lines 3750-3780 of the YAML file), explicitly set:
env:
DEBUG_WAIT_TIMEOUT: "10800"Option 2: Gate behind explicit enable flag
+if [[ "${ENABLE_DEBUG_WAIT}" != "true" ]]; then
+ echo "Debug wait disabled. Set ENABLE_DEBUG_WAIT=true to enable."
+else
DEBUG_WAIT_TIMEOUT="${DEBUG_WAIT_TIMEOUT:-10800}"
echo "=========================================="🧰 Tools
🪛 Shellcheck (0.11.0)
[style] 280-280:
(SC2004)
[info] 293-293: Double quote to prevent globbing and word splitting.
(SC2086)
[style] 307-307:
(SC2004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`
around lines 274 - 309, The DEBUG_WAIT_TIMEOUT variable defaults to 10800
seconds, causing all jobs using this shared step-registry script to block for up
to 3 hours and breaking backward compatibility across the CI infrastructure.
Change the default value of DEBUG_WAIT_TIMEOUT from 10800 to 0 to maintain
backward compatibility, ensuring the debug wait only activates when explicitly
enabled. This allows existing jobs to proceed normally without modification
while new jobs or specific test configurations can explicitly set
DEBUG_WAIT_TIMEOUT to 10800 in their environment configuration to enable the
debug wait behavior.
| echo "==========================================" | ||
|
|
||
| elapsed=0 | ||
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] |
There was a problem hiding this comment.
Harden numeric comparison against malformed input.
The unquoted $elapsed in the test condition can fail if DEBUG_WAIT_TIMEOUT contains non-numeric input. Use arithmetic evaluation for safer numeric comparison.
🔧 Recommended fix
-while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ]
+while [ ! -f "/tmp/continue" ] && (( elapsed < DEBUG_WAIT_TIMEOUT ))
do📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| while [ ! -f "/tmp/continue" ] && [ $elapsed -lt $DEBUG_WAIT_TIMEOUT ] | |
| while [ ! -f "/tmp/continue" ] && (( elapsed < DEBUG_WAIT_TIMEOUT )) | |
| do |
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 293-293: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh`
at line 293, The numeric comparison using the test operator [ $elapsed -lt
$DEBUG_WAIT_TIMEOUT ] is vulnerable to failures with non-numeric input in the
variables. Replace the test operator expression with arithmetic evaluation using
double parentheses (( $elapsed < $DEBUG_WAIT_TIMEOUT )) in the while loop
condition. This provides safer and more robust numeric comparison in bash
without requiring variable quoting and handles edge cases with malformed input
more gracefully.
|
/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-gcp-ipi-confidential-secureboot-mini-perm-f14-mco |
|
@ptalgulk01: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@ptalgulk01: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary by CodeRabbit
This PR updates OpenShift CI infrastructure to add secure-boot (confidential computing) testing coverage for Machine Config Operator (MCO) on GCP IPI clusters as part of the OpenShift 4.22 private-release nightly jobs.
Key Changes
1. Split GCP secure-boot confidential-computing nightly job into 3 MCO variants
ci-operator/config/openshift/openshift-tests-private/openshift-openshift-tests-private-release-4.22__amd64-nightly.yaml:gcp-ipi-confidential-secureboot-mini-perm-f14-mcogcp-ipi-confidential-secureboot-mini-perm-f14-mco-agcp-ipi-confidential-secureboot-mini-perm-f14-mco-bcucushift-installer-rehearse-gcp-ipi-secureboot-confidential-computingworkflow and run theopenshift-e2e-test-qechain with:CONFIDENTIAL_COMPUTE: AMDEncryptedVirtualizationCOMPUTE_CONFIDENTIAL_COMPUTE: AMDEncryptedVirtualizationNestedPagingCOMPUTE_NODE_TYPE: n2d-standard-2CONTROL_PLANE_NODE_TYPE: c3d-standard-4COMPUTE_NODE_REPLICAS: "2"(reduced node replicas for the MCO secureboot test)GCP_INSTALL_USE_MINIMAL_PERMISSIONS: "yes"gcp-ipi-confidential-secureboot-mini-perm-f14-mco:44 2 2,16 * *...-mco-a:47 3 3,17 * *...-mco-b:50 4 4,18 * *2. Added an interactive debug “pause” gate after cluster readiness
ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.shci-operator/step-registry/openshift-extended/test/longduration/openshift-extended-test-longduration-commands.sh/tmp/continue.DEBUG_WAIT_TIMEOUT(default10800seconds / 3 hours)./tmp/continueis present or when the timeout is reached (allowing manual inspection during rehearsals).