feat(blog): GB300 vs GB200 NVL72 on DSv4-Pro — up to 2.83x throughput/GPU #391
Merged
functionstackx merged 3 commits on May 27, 2026
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 1448436.
…/GPU Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1448436 to f3a04a8

Summary
conc=3072, 28-GPU prefill, 32-GPU decode EP=16, 6,812 tok/s/GPU at 25.9 tok/s/user) that GB200 has no equivalent for in the 22–32 tok/s/user band.
latest_benchmarks via InferenceX MCP (every row in the per-conc tables maps 1:1 to a CSV row from the 2026-05-22 run, GHA 26306422380).
.claude/skills/write-inferencex-blog/iso_interactivity.py helper so the numbers match the live dashboard chart.
Test plan
/blog/gb300-nvl72-vs-gb200-nvl72-dsv4-pro-vllm-fp4 on the Vercel preview and verify both benchmark-{light,dark}.png and specs-radar-{light,dark}.png render
/blog, /feed.xml, /llms.txt, and /sitemap.xml
🤖 Generated with Claude Code
Note
Low Risk
Editorial MDX and a documentation link change only; no runtime, auth, or data-path changes. Review should focus on numeric claims and external links, not production risk.
Overview
Adds a new InferenceX blog post comparing GB300 NVL72 vs GB200 NVL72 on DeepSeek-V4-Pro (Dynamo+vLLM, FP4 8K/1K, disaggregated), with headline 2.83× throughput/GPU and 2.31× perf/$ at 27 tok/s/user, framed as HBM headroom unlocking wider prefill/decode recipes in the 22–32 tok/s/user band. The post includes dashboard CTAs, benchmark/specs figure paths, per-concurrency tables, an iso-interactivity table, acknowledgments, and FAQ JSON-LD.
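Since the review is asked to focus on numeric claims, a quick consistency check on the two headline ratios may help. The sketch below assumes perf/$ is defined as throughput per GPU divided by cost per GPU-hour, evaluated at the same 27 tok/s/user point; that definition and the function name `implied_cost_ratio` are assumptions for illustration, not taken from the post or the TCO model.

```python
def implied_cost_ratio(throughput_ratio: float, perf_per_dollar_ratio: float) -> float:
    """Return the GB300/GB200 cost-per-GPU-hour ratio implied by two headline ratios.

    Assumes perf/$ = (throughput per GPU) / (cost per GPU-hour), so that
    cost_ratio = throughput_ratio / perf_per_dollar_ratio.
    """
    if throughput_ratio <= 0 or perf_per_dollar_ratio <= 0:
        raise ValueError("ratios must be positive")
    return throughput_ratio / perf_per_dollar_ratio


# Headline figures quoted in the Overview (at 27 tok/s/user).
THROUGHPUT_RATIO = 2.83      # GB300 vs GB200 throughput per GPU
PERF_PER_DOLLAR_RATIO = 2.31  # GB300 vs GB200 perf/$

ratio = implied_cost_ratio(THROUGHPUT_RATIO, PERF_PER_DOLLAR_RATIO)
print(f"Implied GB300/GB200 cost per GPU-hour ratio: {ratio:.2f}x")
```

Under that assumption the two headline numbers imply a per-GPU-hour cost premium of roughly 1.2x for GB300, which a reviewer can compare against whatever cost inputs the post actually cites.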
Also updates the write-inferencex-blog skill so the canonical SemiAnalysis AI Cloud TCO Model link points to https://semianalysis.com/ai-cloud-tco-model/ instead of the newsletter URL.
Reviewed by Cursor Bugbot for commit f3a04a8. Bugbot is set up for automated code reviews on this repo.