ci(mesh): add Atomesh accuracy and benchmark workflows#1159

Merged: valarLip merged 9 commits into main from zwan/feat-mesh-ci on Jun 15, 2026

@wanzhenchn commented on Jun 10, 2026

Motivation

This PR adds dedicated Atomesh CI coverage for mesh correctness, mocker-based PD routing performance, and dashboard visibility.

Atomesh has different validation needs from the full ATOM model CI. The goal is to keep mesh CI focused, lightweight, and actionable:

  • Validate Atomesh standalone entrypoint integration through a representative accuracy subset.
  • Continuously benchmark PD separation routing behavior with mocker-based workloads.
  • Publish benchmark history to a dedicated dashboard for performance tracking.
  • Avoid running unrelated ATOM, vLLM, and SGLang CI when a PR only changes mesh-specific code.

Summary

Atomesh accuracy validation

  • Adds Atomesh standalone accuracy validation using USE_ATOMESH_ENTRYPOINTS=1.
  • Keeps a small representative model subset, with levels defined locally in the Atomesh workflow, so ownership of the shared models_accuracy.json remains unchanged.
    • pr: Meta-Llama-3-8B-Instruct
    • main: DeepSeek-R1-0528
    • nightly: DeepSeek-V4-Pro MTP, gpt-oss-120b
  • The workflow expands coverage by trigger type:
    • Pull requests run only the pr models for fast entrypoint validation.
    • Pushes to main run the pr + main models.
    • Scheduled and manual runs execute pr + main + nightly models.
  • Publishes Atomesh accuracy results into the existing benchmark dashboard data path for dashboard reuse.

This keeps Atomesh standalone validation focused on USE_ATOMESH_ENTRYPOINTS=1 integration coverage, while the full ATOM accuracy workflows continue to own the complete model matrix.
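
The trigger-to-coverage expansion described above can be sketched in a few lines of Python. This is only an illustration: the dictionary names and the `select_models` helper are hypothetical, the event names are standard GitHub Actions event names assumed to map onto the triggers listed, and the branch filter for pushes is assumed to be handled by the workflow trigger itself. The PR's actual workflow logic is not shown here.

```python
# Hypothetical level table mirroring the model subset listed in this PR.
MODELS_BY_LEVEL = {
    "pr": ["Meta-Llama-3-8B-Instruct"],
    "main": ["DeepSeek-R1-0528"],
    "nightly": ["DeepSeek-V4-Pro MTP", "gpt-oss-120b"],
}

# Assumed mapping from GitHub Actions event name to the levels it covers.
LEVELS_BY_EVENT = {
    "pull_request": ["pr"],
    "push": ["pr", "main"],  # assumes the workflow only reacts to pushes on main
    "schedule": ["pr", "main", "nightly"],
    "workflow_dispatch": ["pr", "main", "nightly"],
}


def select_models(event_name: str) -> list[str]:
    """Return the models to validate for a given trigger, in level order."""
    levels = LEVELS_BY_EVENT.get(event_name)
    if levels is None:
        raise ValueError(f"unsupported event: {event_name}")
    return [model for level in levels for model in MODELS_BY_LEVEL[level]]


if __name__ == "__main__":
    for event in LEVELS_BY_EVENT:
        print(event, "->", select_models(event))
```

Keeping the level table next to the workflow, rather than in the shared models_accuracy.json, is what lets the Atomesh subset change without touching the full ATOM matrix.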

Atomesh mocker benchmark

  • Adds a mocker benchmark workflow focused on PD separation routing scenarios.
  • Runs representative topology coverage:
    • 1P1D
    • 2P1D
    • 3P1D
  • Runs each topology across consumer concurrency levels:
    • 1, 2, 4, 8, 16
  • Records request throughput, latency metrics, failed requests, request count, duration, and CI run metadata.
  • Aggregates per-cell results into a markdown summary and benchmark-action compatible JSON.
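
As a rough sketch of the matrix and aggregation step, the snippet below enumerates the topology and concurrency cells and converts per-cell results into the name/unit/value entry shape that benchmark-action's github-action-benchmark accepts for its custom tools. The result keys (`throughput`, `p99_ms`, `failed`, `requests`) and the helper names are assumptions made for illustration, not the PR's actual schema.

```python
import itertools
import json

TOPOLOGIES = ["1P1D", "2P1D", "3P1D"]
CONCURRENCIES = [1, 2, 4, 8, 16]


def benchmark_cells():
    """Enumerate every (topology, concurrency) cell of the matrix."""
    return list(itertools.product(TOPOLOGIES, CONCURRENCIES))


def to_benchmark_entries(cell_results):
    """Flatten per-cell results into name/unit/value entries.

    Each input dict uses hypothetical keys: topology, concurrency,
    throughput (req/s), p99_ms, failed, requests.
    """
    entries = []
    for result in cell_results:
        label = f"{result['topology']} c={result['concurrency']}"
        extra = f"requests={result['requests']} failed={result['failed']}"
        entries.append({
            "name": f"{label} request throughput",
            "unit": "req/s",
            "value": result["throughput"],
            "extra": extra,
        })
        entries.append({
            "name": f"{label} p99 latency",
            "unit": "ms",
            "value": result["p99_ms"],
            "extra": extra,
        })
    return entries


if __name__ == "__main__":
    # Made-up numbers, purely to show the output shape.
    sample = [{"topology": "1P1D", "concurrency": 4, "throughput": 120.0,
               "p99_ms": 35.0, "failed": 0, "requests": 1000}]
    print(len(benchmark_cells()), "cells")
    print(json.dumps(to_benchmark_entries(sample), indent=2))
```

Note that throughput is a bigger-is-better metric while latency is smaller-is-better, so a real pipeline would likely emit them into separate files or series.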

Mocker benchmark dashboard

Live: https://rocm.github.io/ATOM/atomesh-mocker-dashboard/

  • Adds a custom Atomesh mocker benchmark dashboard.
  • Provides Performance, Accuracy, and Trends views.
  • Performance view includes:
    • Request throughput chart
    • Combined latency chart for avg / p99 / p999 latency
    • Detailed performance table with configuration, concurrency, request count, commit, and CI run link
    • Download JSON support
  • Accuracy view reuses Atomesh accuracy dashboard data and displays GSM8K validation results.
  • Trends view shows historical benchmark changes by configuration and concurrency.
  • Publishes the custom dashboard HTML together with benchmark data to gh-pages.
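
For readers unfamiliar with the avg / p99 / p999 figures shown in the latency chart, here is a minimal sketch of how such values can be derived from raw per-request latencies using the nearest-rank method. How the PR's benchmark actually computes them is not stated, so the method and the function names are assumptions.

```python
def percentile_permille(ordered, permille):
    """Nearest-rank percentile on a pre-sorted list.

    `permille` is the percentile in parts per thousand (990 for p99,
    999 for p999), which keeps the rank computation in exact integers.
    """
    if not ordered:
        raise ValueError("no samples")
    rank = -(-permille * len(ordered) // 1000)  # integer ceiling division
    return ordered[max(rank, 1) - 1]


def latency_summary(samples_ms):
    """Summarize latency samples (milliseconds) as avg, p99, and p999."""
    ordered = sorted(samples_ms)
    return {
        "avg": sum(ordered) / len(ordered),
        "p99": percentile_permille(ordered, 990),
        "p999": percentile_permille(ordered, 999),
    }


if __name__ == "__main__":
    print(latency_summary(range(1, 1001)))
```

With few requests per cell, p999 collapses to the maximum sample, which is worth keeping in mind when reading low-concurrency cells.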

CI routing optimization

  • Adds mesh-only PR skip rules for unrelated CI workflows.
  • Mesh-specific changes no longer trigger:
    • ATOM Test
    • ATOM vLLM Test
    • ATOM SGLang Test
  • Atomesh-specific workflows still run for mesh source, Atomesh workflow, Atomesh script, and Atomesh dashboard changes.
  • Mixed PRs that touch mesh plus ATOM/vLLM/SGLang paths still trigger all relevant CI workflows.
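
The skip rule above amounts to a check on the changed-file list: unrelated workflows are skipped only when every changed path is mesh-specific. The sketch below illustrates that decision. The glob patterns are hypothetical placeholders, since the repository's real path filters are not shown in this PR description.

```python
from fnmatch import fnmatchcase

# Hypothetical mesh-specific path patterns; the repository's real filters may differ.
MESH_PATTERNS = [
    "mesh/*",
    ".github/workflows/atomesh-*",
    ".github/scripts/atomesh*",
    "docs/atomesh-mocker-dashboard/*",
]


def is_mesh_path(path: str) -> bool:
    """True if the path matches any mesh-specific pattern ('*' also spans '/')."""
    return any(fnmatchcase(path, pattern) for pattern in MESH_PATTERNS)


def is_mesh_only(changed_files) -> bool:
    """True only when there is at least one change and all changes are mesh-specific."""
    files = list(changed_files)
    return bool(files) and all(is_mesh_path(path) for path in files)


def should_run_general_ci(changed_files) -> bool:
    """ATOM / vLLM / SGLang tests run unless the PR is mesh-only."""
    return not is_mesh_only(changed_files)


if __name__ == "__main__":
    print(should_run_general_ci(["mesh/router/policy.rs"]))                 # mesh-only
    print(should_run_general_ci(["mesh/router/policy.rs", "atom/core.py"]))  # mixed
```

Treating an empty change list as not mesh-only is a conservative choice: when in doubt, the general CI still runs.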

@wanzhenchn force-pushed the zwan/feat-mesh-ci branch 10 times, most recently from cbac91e to f19a129, on June 12, 2026 07:15
@wanzhenchn requested a review from valarLip on June 12, 2026 07:32
wanzhenchn and others added 9 commits on June 15, 2026 03:27
- Validate standalone-mode accuracy via Atomesh entrypoints.
- Add a mocker benchmark for PD routing scenarios with a topology and consumer concurrency matrix.
- Add a custom dashboard for Atomesh mocker benchmark results.
- Show throughput, latency, detailed performance data, commit links, and CI run links.
- Align the benchmark matrix with 1P1D, 2P1D, and 3P1D topologies across consumer concurrency levels.
@valarLip merged commit 18b17f4 into main on Jun 15, 2026
52 of 66 checks passed
@valarLip deleted the zwan/feat-mesh-ci branch on June 15, 2026 07:50