@jcpowermac
Contributor

…tform config is set

When users specify zones in platform.vsphere.defaultMachinePlatform.zones and also configure controlPlane.platform.vsphere or compute.platform.vsphere with CPU/memory settings, the zones from defaultMachinePlatform were being ignored and all nodes were landing in random zones based on their physical vCenter location.

Root cause: The validation code in ValidateMachinePool was automatically populating zones with ALL failure domains when zones were empty. This happened BEFORE the Set() merging logic in master.go and worker.go, causing the zones from defaultMachinePlatform to be overwritten.

Fix: Remove automatic zone population from validation code. Validation now only validates zones but does not populate them. Zone population happens during machine generation AFTER merging with defaultMachinePlatform, ensuring zones from defaultMachinePlatform are properly respected.
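The ordering problem described above can be illustrated with a minimal, self-contained sketch. The `MachinePool` type here is a simplified, hypothetical stand-in for the real vSphere type, and `populateAllZones` and `generate` are illustrative helpers, not installer functions. The sketch also assumes that `Set()` only overrides zones when the incoming pool has a non-empty zone list.

```go
package main

import "fmt"

// MachinePool is a simplified, hypothetical stand-in for the vSphere
// machine pool type; only the fields needed for the illustration are kept.
type MachinePool struct {
	NumCPUs int32
	Zones   []string
}

// Set overlays the non-zero fields of required onto p.
// Assumption: zones are only overridden when required has a non-empty list.
func (p *MachinePool) Set(required *MachinePool) {
	if p == nil || required == nil {
		return
	}
	if required.NumCPUs != 0 {
		p.NumCPUs = required.NumCPUs
	}
	if len(required.Zones) > 0 {
		p.Zones = required.Zones
	}
}

// populateAllZones (hypothetical) mimics the auto-population step:
// an empty zone list is filled with every failure domain.
func populateAllZones(pool *MachinePool, failureDomains []string) {
	if len(pool.Zones) == 0 {
		pool.Zones = append([]string(nil), failureDomains...)
	}
}

// generate (hypothetical) merges the defaults and the pool-specific config,
// then falls back to all failure domains only if no zones survived the merge.
func generate(defaults, pool *MachinePool, failureDomains []string) MachinePool {
	merged := MachinePool{}
	merged.Set(defaults)
	merged.Set(pool)
	populateAllZones(&merged, failureDomains)
	return merged
}

func main() {
	failureDomains := []string{"zone-a", "zone-b", "zone-c"}
	defaults := &MachinePool{Zones: []string{"zone-a"}}

	// Old order: zones are populated during validation, before the merge,
	// so the pool's zone list is no longer empty and overrides the defaults.
	buggyPool := &MachinePool{NumCPUs: 8}
	populateAllZones(buggyPool, failureDomains)
	fmt.Println(generate(defaults, buggyPool, failureDomains).Zones) // [zone-a zone-b zone-c]

	// New order: validation leaves the pool untouched, so the zones from
	// the defaults survive the merge.
	fixedPool := &MachinePool{NumCPUs: 8}
	fmt.Println(generate(defaults, fixedPool, failureDomains).Zones) // [zone-a]
}
```

In the first case the pool sets only a CPU count, yet it ends up in all three zones because its zone list was filled in before the merge.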

Changes:

  • pkg/types/vsphere/validation/machinepool.go: Remove auto-population logic
  • pkg/types/vsphere/validation/machinepool_test.go: Update test expectations
  • pkg/types/vsphere/machinepool_test.go: Add tests for Set() method behavior

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
@openshift-ci-robot added the jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.) and jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) labels on Nov 14, 2025
@openshift-ci-robot
Contributor

@jcpowermac: This pull request references Jira Issue OCPBUGS-62209, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Nov 14, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rvanderp3 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jcpowermac
Contributor Author

/jira refresh

@openshift-ci-robot added the jira/valid-bug (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) label on Nov 14, 2025
@openshift-ci-robot
Contributor

@jcpowermac: This pull request references Jira Issue OCPBUGS-62209, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sgaoshang

@openshift-ci-robot removed the jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) label on Nov 14, 2025
@openshift-ci bot requested a review from sgaoshang on November 14, 2025 19:58
@openshift-ci
Contributor

openshift-ci bot commented Nov 15, 2025

@jcpowermac: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn | 8439b6f | link | true | /test e2e-aws-ovn |
| ci/prow/okd-scos-e2e-aws-ovn | 8439b6f | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/e2e-vsphere-ovn | 8439b6f | link | true | /test e2e-vsphere-ovn |
| ci/prow/e2e-vsphere-ovn-zones | 8439b6f | link | false | /test e2e-vsphere-ovn-zones |
| ci/prow/e2e-vsphere-multi-vcenter-ovn | 8439b6f | link | false | /test e2e-vsphere-multi-vcenter-ovn |
| ci/prow/e2e-vsphere-ovn-disk-setup-techpreview | 8439b6f | link | false | /test e2e-vsphere-ovn-disk-setup-techpreview |
| ci/prow/e2e-vsphere-ovn-techpreview | 8439b6f | link | false | /test e2e-vsphere-ovn-techpreview |
| ci/prow/golint | 8439b6f | link | true | /test golint |
| ci/prow/e2e-vsphere-ovn-hybrid-env | 8439b6f | link | false | /test e2e-vsphere-ovn-hybrid-env |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment on lines +10 to +11
// are preserved when pool-specific platform config is set without zones.
// This reproduces the bug reported in OCPBUGS-62209.
Member

❓ I think the test cases in this file are testing the Set func, right?

func (p *MachinePool) Set(required *MachinePool) {

I am just unsure whether they "truly" reproduce OCPBUGS-62209, where zones are defaulted in validation (where they should not be). I guess we can adjust the comments (golint is also failing here 😁)?
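A check aimed at the validation side rather than at `Set` could look roughly like the sketch below. It is only an illustration under stated assumptions: `validateZones` is a hypothetical helper, not the installer's `ValidateMachinePool`, and `MachinePool` is reduced to its zone list. The point is that validation reports unknown zones but leaves an empty zone list empty.

```go
package main

import "fmt"

// MachinePool is a simplified, hypothetical stand-in for the vSphere
// machine pool type, reduced to the zone list.
type MachinePool struct {
	Zones []string
}

// validateZones (hypothetical) returns one error per zone that does not
// match a known failure domain. It only reads the pool and never fills
// in an empty zone list.
func validateZones(pool *MachinePool, failureDomains []string) []error {
	known := make(map[string]struct{}, len(failureDomains))
	for _, fd := range failureDomains {
		known[fd] = struct{}{}
	}

	var errs []error
	for _, zone := range pool.Zones {
		if _, ok := known[zone]; !ok {
			errs = append(errs, fmt.Errorf("zone %q is not a defined failure domain", zone))
		}
	}
	return errs
}

func main() {
	failureDomains := []string{"zone-a", "zone-b"}

	// A pool without zones passes validation and stays without zones.
	empty := &MachinePool{}
	errs := validateZones(empty, failureDomains)
	fmt.Println(len(errs), len(empty.Zones)) // 0 0

	// A pool naming an unknown zone is rejected.
	bad := &MachinePool{Zones: []string{"zone-a", "zone-x"}}
	fmt.Println(len(validateZones(bad, failureDomains))) // 1
}
```

A regression test in this spirit would assert on the pool after validation, which is closer to the behavior reported in the bug than a test of the merge alone.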


Labels

jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.


3 participants