fix(e2e): skip NetworkIsolated tests in GPU E2E pipeline #8267

Closed
ganeshkumarashok wants to merge 1 commit into main from aganeshkumar/fix-gpu-e2e-skip-network-isolated

Conversation

@ganeshkumarashok

Summary

  • The GPU E2E pipeline (e2e-gpu.yaml) runs every test tagged gpu=true, but it has no network-isolated cluster infrastructure (private ACR, NSG outbound rules, etc.)
  • Tests tagged with both GPU=true and NetworkIsolated=true (e.g., Test_Ubuntu2404_FullyManagedGPU_NetworkIsolated, added in #8253, "test: add E2E test for fully managed GPU + network isolation") fail because the pipeline can't provision network-isolated clusters
  • Adds NetworkIsolated=true to TAGS_TO_SKIP in the GPU pipeline so these tests are skipped
  • The regular E2E pipeline (e2e.yaml) already skips gpu=true tests, so this doesn't change behavior there
  • Network-isolated GPU tests can still be run locally or via a future dedicated pipeline
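
To make the intended filter behavior concrete, here is a minimal sketch of how a TAGS_TO_RUN / TAGS_TO_SKIP pair could select tests. It is not the repository's actual implementation: the function names, the case-insensitive matching, the "all run tags must match, any skip tag excludes" semantics, and the `os=windows` skip value are all assumptions made for illustration.

```python
def parse_tags(spec):
    """Parse 'key=value,key=value' into a dict with lowercased keys and values."""
    tags = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        tags[key.strip().lower()] = value.strip().lower()
    return tags


def should_run(test_tags, tags_to_run, tags_to_skip):
    """Decide whether a test is selected (hypothetical filter semantics).

    Assumption: a test runs only if it matches every tag in tags_to_run
    and matches none of the tags in tags_to_skip.
    """
    normalized = {k.lower(): str(v).lower() for k, v in test_tags.items()}
    run = parse_tags(tags_to_run)
    skip = parse_tags(tags_to_skip)

    # Every requested run tag must be present with the requested value.
    if any(normalized.get(k) != v for k, v in run.items()):
        return False
    # A single matching skip tag is enough to exclude the test.
    if any(normalized.get(k) == v for k, v in skip.items()):
        return False
    return True


# Hypothetical GPU pipeline settings after this change.
TAGS_TO_RUN = "gpu=true"
TAGS_TO_SKIP = "os=windows,NetworkIsolated=true"

plain_gpu_test = {"GPU": True}
isolated_gpu_test = {"GPU": True, "NetworkIsolated": True}

print(should_run(plain_gpu_test, TAGS_TO_RUN, TAGS_TO_SKIP))
print(should_run(isolated_gpu_test, TAGS_TO_RUN, TAGS_TO_SKIP))
```

Under these assumed semantics, a plain GPU test is still selected, while a test carrying both GPU and NetworkIsolated tags is excluded by the added skip tag.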

Test plan

  • GPU E2E pipeline no longer fails on NetworkIsolated tests
  • Existing GPU tests continue to run (they don't have NetworkIsolated: true)

The GPU E2E pipeline runs all tests with `gpu=true` but doesn't have
network-isolated cluster infrastructure. Tests tagged with both
`GPU=true` and `NetworkIsolated=true` fail because the pipeline
can't provision network-isolated clusters (private ACR, NSG rules, etc.).

Add `NetworkIsolated=true` to TAGS_TO_SKIP so these tests are skipped
in the GPU pipeline. They can still be run locally or in a dedicated
network-isolated pipeline.

Copilot AI left a comment

Pull request overview

Updates the GPU E2E Azure DevOps pipeline to skip network-isolated scenarios that the GPU pipeline infrastructure can’t currently provision (e.g., missing private ACR / restricted egress setup), preventing predictable CI failures while still running standard GPU-tagged tests.

Changes:

  • Extend TAGS_TO_SKIP in the GPU E2E pipeline to include NetworkIsolated=true.
  • Keep the GPU pipeline scope as TAGS_TO_RUN: gpu=true while excluding Windows and network-isolated scenarios.

@ganeshkumarashok

Closing: wrong approach. The test should run in the GPU pipeline, not be skipped.
