Skip to content

feat(mcp-gateway): validate and expose applied policy snapshots#298

Merged
Agent-Hellboy merged 5 commits into
mainfrom
mcp-gateway/policy_snapshot_validation
Jun 12, 2026
Merged

feat(mcp-gateway): validate and expose applied policy snapshots#298
Agent-Hellboy merged 5 commits into
mainfrom
mcp-gateway/policy_snapshot_validation

Conversation

@Agent-Hellboy

Copy link
Copy Markdown
Owner

Add document-level metadata (schema_version, deterministic revision, informational generated_at) to the rendered gateway policy contract, a shared pkg/policy.Validate used by both producer and consumer, and last-known-good snapshot activation in the gateway.

  • pkg/policy: schema version + ComputeRevision/Stamp (SHA-256 over canonical content, excluding revision/generated_at); strict, fail-closed Validate covering schema version, server identity, auth/policy/decision/ trust/side-effect values, unique names, and OAuth issuer requirements.
  • operator: stamp schema+revision in renderGatewayPolicy, validate before replacing the ConfigMap, and avoid reconcile churn by preserving the prior payload when the revision is unchanged.
  • mcp-gateway: validate after decode and before activating a snapshot; invalid updates retain the previous valid policy and record the error; expand the snapshot with revision/loaded-at/ready/last-error; add /ready, /config/status, and /metrics (reload counters, active revision, last success timestamp).
  • docs/runtime.md: explain schema version, revision, and last-known-good.

Closes #295

Add document-level metadata (schema_version, deterministic revision,
informational generated_at) to the rendered gateway policy contract, a
shared pkg/policy.Validate used by both producer and consumer, and
last-known-good snapshot activation in the gateway.

- pkg/policy: schema version + ComputeRevision/Stamp (SHA-256 over
  canonical content, excluding revision/generated_at); strict, fail-closed
  Validate covering schema version, server identity, auth/policy/decision/
  trust/side-effect values, unique names, and OAuth issuer requirements.
- operator: stamp schema+revision in renderGatewayPolicy, validate before
  replacing the ConfigMap, and avoid reconcile churn by preserving the prior
  payload when the revision is unchanged.
- mcp-gateway: validate after decode and before activating a snapshot;
  invalid updates retain the previous valid policy and record the error;
  expand the snapshot with revision/loaded-at/ready/last-error; add /ready,
  /config/status, and /metrics (reload counters, active revision, last
  success timestamp).
- docs/runtime.md: explain schema version, revision, and last-known-good.

Closes #295

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0186c2930e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +90 to +91
// Readiness: OK only after a valid policy snapshot has been activated.
mux.HandleFunc("/ready", srv.handleReady)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Wire Kubernetes readiness to /ready

Registering /ready does not affect pod readiness because the operator-generated gateway container still uses a TCP socket probe (internal/operator/deployment.go:532-535). Any serving gateway is therefore marked ready regardless of snapshot.Ready, so Kubernetes and MCPServer.status.gatewayReady cannot detect that no valid policy has been activated. Change the generated readiness probe to an HTTP GET on /ready.

Useful? React with 👍 / 👎.

Comment thread pkg/policy/validate.go
Comment on lines +21 to +23
if _, ok := supportedSchemaVersions[doc.SchemaVersion]; !ok {
return fmt.Errorf("policy: unsupported schema version %q", doc.SchemaVersion)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Verify revision matches policy content

When a file-backed policy has a supported schema but a missing, stale, or arbitrary revision, Validate accepts it without recomputing ComputeRevision. The gateway then activates the changed authorization body while /config/status and mcp_gateway_policy_active_revision_info report the supplied revision, defeating the new revision contract and potentially making a policy change appear unapplied. Require a nonempty revision and compare it with the computed digest before activation.

Useful? React with 👍 / 👎.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a robust "last-known-good" policy snapshot mechanism for the gateway. It introduces deterministic policy revision hashing, strict policy validation shared between the operator and gateway, and several observability endpoints including /ready, /config/status, and /metrics. Feedback suggests enhancing the policy validation logic to reject duplicate tool rules within a single grant to prevent conflicting policy evaluations.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread pkg/policy/validate.go
Comment on lines +114 to +124
for j, rule := range grant.ToolRules {
if strings.TrimSpace(string(rule.Name)) == "" {
return fmt.Errorf("policy: grant %q tool_rules[%d] name is required", grant.Name, j)
}
if !validDecision(rule.Decision) {
return fmt.Errorf("policy: grant %q tool rule %q has invalid decision %q", grant.Name, rule.Name, rule.Decision)
}
if !validTrust(rule.RequiredTrust) {
return fmt.Errorf("policy: grant %q tool rule %q has invalid required_trust %q", grant.Name, rule.Name, rule.RequiredTrust)
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent ambiguous or conflicting policy evaluations, consider validating that each tool rule within a single grant is unique. Adding a duplicate check for rule.Name ensures that a grant does not define multiple conflicting rules for the same tool.

		seenRules := make(map[ToolName]struct{}, len(grant.ToolRules))
		for j, rule := range grant.ToolRules {
			if strings.TrimSpace(string(rule.Name)) == "" {
				return fmt.Errorf("policy: grant %q tool_rules[%d] name is required", grant.Name, j)
			}
			if _, dup := seenRules[rule.Name]; dup {
				return fmt.Errorf("policy: grant %q has duplicate tool rule %q", grant.Name, rule.Name)
			}
			seenRules[rule.Name] = struct{}{}
			if !validDecision(rule.Decision) {
				return fmt.Errorf("policy: grant %q tool rule %q has invalid decision %q", grant.Name, rule.Name, rule.Decision)
			}
			if !validTrust(rule.RequiredTrust) {
				return fmt.Errorf("policy: grant %q tool rule %q has invalid required_trust %q", grant.Name, rule.Name, rule.RequiredTrust)
			}
		}

Agent-Hellboy and others added 4 commits June 11, 2026 00:35
CVE-2026-42504 affects the Go 1.26.3 stdlib; upgrading the multi-stage
builder to 1.26.4 (latest stable patch) removes the unfixed CVE so the
Trivy operator-image scan no longer fails.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- operator: change gateway readiness probe from TCP socket to HTTP GET
  /ready so Kubernetes pod readiness gates on actual policy activation
  (internal/operator/deployment.go)

- policy: require non-empty revision and verify it matches the computed
  SHA-256 digest before activation; prevents stale or arbitrary revision
  fields from masking undetected policy body changes (pkg/policy/validate.go)

- policy: detect duplicate tool rules within a single grant; a grant
  with two rules for the same tool produced ambiguous evaluations
  (pkg/policy/validate.go)

- tests: restamp documents after body mutations so revision reflects the
  mutated body and each rejection test fires the intended error path;
  add revision-missing and revision-mismatch cases (pkg/policy/validate_test.go)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The revision digest covers the document exactly as the operator
rendered it. Validating after applyPolicyDefaults mutated the document
(auth headers, server identity, policy mode) recomputed a different
digest and rejected every operator-stamped policy with
revision-mismatch. Validate file-backed documents before defaulting;
gateway-generated default documents are stamped and validated after
defaulting since their revision covers final content.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Combine last-known-good policy snapshot validation with main's scoped
gateway request metrics. Merge conflicts in mcp-gateway main, metrics,
and policy_cache retain /ready readiness, revision tracking, and both
metric families. Update policy reload test to use stamped policy files.

Co-authored-by: Cursor <cursoragent@cursor.com>
@Agent-Hellboy Agent-Hellboy merged commit c99e2c8 into main Jun 12, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(mcp-gateway): validate and expose applied policy snapshots

1 participant