fix(e2e): add DAG dependencies to prevent route table race condition#8268
fix(e2e): add DAG dependencies to prevent route table race condition#8268
Conversation
addFirewallRules and collectGarbageVMSS both trigger Azure subnet/route table mutations. Running them concurrently (introduced in #8149) can cause the AKS-managed pod route table to be detached from the subnet: 1. addFirewallRules associates abe2e-fw-rt with aks-subnet (displacing the AKS route table) 2. collectGarbageVMSS deletes stale VMSS, triggering cloud-controller route reconciliation that races with the subnet association 3. getOrCreateBastion creates AzureBastionSubnet (VNet-level mutation) The race leaves aks-agentpool-*-routetable with 0 subnet associations, so CCM-programmed pod routes are unreachable. Nodes stay stuck on NetworkUnavailable: 'Waiting for cloud routes' forever. Fix: add DAG edges so that: - addFirewallRules / addNetworkIsolatedSettings wait for bastion (serialize VNet subnet mutations) - collectGarbageVMSS waits for network setup (prevent VMSS deletion from racing with route table association) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR addresses an E2E infrastructure regression introduced by the concurrent DAG-based execution in prepareCluster by adding dependency edges that serialize Azure VNet/subnet-mutating operations, preventing route table association races that can strand AKS pod route tables.
Changes:
- Make
addFirewallRulesandaddNetworkIsolatedSettingsdepend on thebastiontask to avoid concurrent subnet/VNet mutations. - Make
collectGarbageVMSSdepend on completion of network setup to prevent VMSS deletion (and ensuing route reconciliation) from racing with subnet route table association.
Replace broad 'e2e/' path trigger with 'e2e/scenario_win_test.go' only. Previously, any change to shared E2E infrastructure (validators.go, cluster.go, dag/, etc.) would trigger a full Windows VHD build (~1hr) even though standalone e2e-windows.yaml already covers E2E changes. This matches the Linux VHD build pipeline pattern which does not include e2e/ in its trigger paths at all. Also removes the now-unnecessary 'e2e/scenario_test.go' exclude entry since e2e/ is no longer broadly included. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
timmy-wright
left a comment
There was a problem hiding this comment.
Not sure this is sensible - see comment.
…E files" This reverts commit b460a47.
| // networkSetup associates a route table with aks-subnet (firewall or | ||
| // network-isolated NSG). It must run after bastion (both mutate the | ||
| // VNet) and before collectGarbageVMSS (VMSS deletion triggers AKS | ||
| // cloud-controller route reconciliation that can race with the subnet | ||
| // association and leave the AKS pod route table detached). |
There was a problem hiding this comment.
The new comment says networkSetup associates a route table with aks-subnet (firewall or network-isolated NSG), but addNetworkIsolatedSettings actually attaches an NSG (no route table). Consider rewording to avoid implying a route table association in the network-isolated path (e.g., describe this as subnet/VNet mutations: firewall route-table association OR network-isolated NSG update).
| // networkSetup associates a route table with aks-subnet (firewall or | |
| // network-isolated NSG). It must run after bastion (both mutate the | |
| // VNet) and before collectGarbageVMSS (VMSS deletion triggers AKS | |
| // cloud-controller route reconciliation that can race with the subnet | |
| // association and leave the AKS pod route table detached). | |
| // networkSetup performs subnet/VNet mutations for the cluster network: | |
| // firewall setup associates the AKS subnet route table, while | |
| // network-isolated setup updates the subnet/NSG settings. It must run | |
| // after bastion (both mutate the VNet) and before collectGarbageVMSS | |
| // (VMSS deletion triggers AKS cloud-controller route reconciliation | |
| // that can race with the firewall subnet association and leave the AKS | |
| // pod route table detached). |
Problem
PR #8149 refactored
prepareClusterto use concurrent DAG execution. This introduced a race condition between three operations that mutate the VNet/subnet:getOrCreateBastion— creates AzureBastionSubnetaddFirewallRules— creates AzureFirewallSubnet + associatesabe2e-fw-rtwithaks-subnetcollectGarbageVMSS— deletes old VMSS → triggers AKS cloud-controller route reconciliationRunning them concurrently causes the AKS pod route table (
aks-agentpool-*-routetable) to be detached from the subnet. The firewall route table (abe2e-fw-rt) wins the association, but has no pod routes. Result:NetworkUnavailable: Waiting for cloud routes— never clearedcni plugin not initialized— CNI config never writtenEvidence from Azure
aks-agentpool-13663576-routetableabe2e-fw-rtaks-subnet)Fix
Add DAG dependency edges to restore the sequential ordering for VNet-mutating operations:
addFirewallRules/addNetworkIsolatedSettings→ depends onbastion(serialize subnet writes)collectGarbageVMSS→ depends on network setup (prevent VMSS deletion from racing with route table association)All other tasks remain fully concurrent (kube client, identity, maintenance, subnet read, extract params).
DAG Before vs After
Fixes the E2E infrastructure failures observed on PR #8230 and the scriptless E2E gate.