fix(vmcp): /status uses live health monitor state#4135
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes vMCP /status to report backend health using the live health monitor state (matching /api/backends/health) instead of the static registry snapshot, addressing inconsistent health reporting (Fixes #4103).
Changes:
- Update
buildStatusResponse()to prefer health monitor runtime state when available. - Add unit tests ensuring
/statusreflects healthy/unhealthy transitions from the monitor and falls back to the registry when monitoring is disabled. - Add E2E coverage comparing
/statusand/api/backends/healthside-by-side in the circuit breaker lifecycle suite.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
pkg/vmcp/server/status.go |
Prefer live monitor health states when building /status output. |
pkg/vmcp/server/status_test.go |
Add unit tests covering monitor-driven /status behavior and fallback behavior. |
test/e2e/thv-operator/virtualmcp/helpers.go |
Add HTTP helpers + response structs for /status and /api/backends/health. |
test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go |
Add E2E assertion that both endpoints agree on backend health. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go
Outdated
Show resolved
Hide resolved
test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e699150f8e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go
Outdated
Show resolved
Hide resolved
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4135 +/- ##
==========================================
+ Coverage 68.13% 68.87% +0.74%
==========================================
Files 461 461
Lines 46704 46571 -133
==========================================
+ Hits 31821 32078 +257
+ Misses 12105 11988 -117
+ Partials 2778 2505 -273 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The /status endpoint was reading backend health from the static registry snapshot (set at discovery time and never updated), while /api/backends/health correctly read from the live health monitor. This caused the two endpoints to report inconsistent state for the same backend (issue #4103). Fix buildStatusResponse() to call GetAllBackendHealthStates() and prefer the monitor's runtime health over the registry's initial value. When health monitoring is disabled the registry value is used as before, preserving backwards compatibility. Add unit tests that assert /status reflects live monitor state for both healthy and unhealthy transitions, and an e2e It block in the circuit breaker lifecycle suite that compares both endpoints side-by-side once the unstable backend's circuit breaker opens. Closes: #4103
Summary
The /status endpoint was reading backend health from the static registry snapshot (set at discovery time and never updated), while /api/backends/health correctly read from the live health monitor. This caused the two endpoints to report inconsistent state for the same backend (issue #4103).
Fix buildStatusResponse() to call GetAllBackendHealthStates() and prefer the monitor's runtime health over the registry's initial value. When health monitoring is disabled the registry value is used as before, preserving backwards compatibility.
Add unit tests that assert /status reflects live monitor state for both healthy and unhealthy transitions, and an e2e It block in the circuit breaker lifecycle suite that compares both endpoints side-by-side once the unstable backend's circuit breaker opens.
Fixes #4103
Type of change
Test plan
task test)task test-e2e)task lint-fix)Changes
Does this introduce a user-facing change?
Special notes for reviewers