Problem
AdCP storyboards and the compliance track exclusively test sell-side agents: sales-*, audience-sync, brand-rights, governance-*, measurement-verification, etc. (static/compliance/source/specialisms/). There is no storyboard coverage, no fixture harness, and no compliance track for the buyer/orchestrator half of the protocol.
Consequences:
Buyer and orchestrator agents cannot be certified against a shared behavioral bar.
Implementers have no reference suite for "does my orchestrator negotiate, reconcile, and recover correctly."
Protocol decisions (idempotency keys, webhook reconciliation, TERMS_REJECTED, pacing) are enforced on senders but untested on receivers/initiators.
We ship 3.0 with asymmetric test coverage: the sell side of the wire is tested and the buy side is not.
This RFC proposes an epic for 3.1 to close the gap.
Context
Storyboards today (storyboard-schema.yaml + SingleAgentClient runner) test inbound agent responses against a scripted caller: buyer says X, agent must respond Y.
Buyer/orchestrator storyboards invert the harness: the sell side is scripted and the agent under test is the buyer. Assertions are about outbound judgment and state management, not response schemas.
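To make the inversion concrete, here is a minimal sketch of a scripted seller that records what the buyer sends, so the assertion targets the buyer's outbound behavior. Everything in it is hypothetical: `ScriptedSeller`, `naive_buyer`, and the request/response fields are illustration-only names, not part of storyboard-schema.yaml or the SingleAgentClient runner.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptedSeller:
    """Hypothetical fixture: replays canned responses and records every call."""
    script: list[dict]                                 # non-empty, returned in order
    calls: list[dict] = field(default_factory=list)    # outbound trace from the buyer

    def handle(self, request: dict) -> dict:
        self.calls.append(request)
        # Once the script is exhausted, keep replaying the last response.
        index = min(len(self.calls), len(self.script)) - 1
        return self.script[index]


def naive_buyer(seller: ScriptedSeller, max_attempts: int = 3) -> str:
    """Toy agent under test: stops as soon as the seller rejects its terms."""
    for attempt in range(max_attempts):
        # "action" and "attempt" are made-up fields for this sketch.
        response = seller.handle({"action": "book", "attempt": attempt})
        if response.get("error") == "TERMS_REJECTED":
            return "stopped"
        if response.get("status") == "ok":
            return "booked"
    return "gave_up"


seller = ScriptedSeller(script=[{"error": "TERMS_REJECTED"}])
outcome = naive_buyer(seller)

# Inverted assertions: judge what the buyer did, not what the seller returned.
assert outcome == "stopped"
assert len(seller.calls) == 1   # no blind retry after a rejection
```

The recorded `calls` list is the point of the design: a sell-side storyboard checks a response, while a buyer-side storyboard checks a trace of decisions.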
Proposed shape
1. Harness additions
Fixture publisher agent — a reference sell-side implementation (HTTP + MCP + A2A) that replays canned responses keyed by scenario ID. Supports scripted edge cases: slow response, TERMS_REJECTED, webhook drop, stale digest, auth expiry, pacing divergence.
Buyer storyboard schema — probably an extension of storyboard-schema.yaml with a new role: "buyer" track. Steps describe publisher fixture state + expected buyer-agent action/decision, not request/response pairs.
Behavioral validators — beyond schema checks: did the agent retry, did it challenge, did it stop, did it reconcile. Some will be judgment-based (LLM-as-judge) and gated as SHOULD not MUST initially.
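As a rough illustration of how a buyer storyboard step and a deterministic behavioral validator could fit together, the sketch below pairs a step (fixture state plus expected buyer action) with a check over the recorded outbound trace. The field names (`scenario_id`, `fixture_state`, `expected_action`, `idempotency_key`) and the `VALIDATORS` registry are assumptions made for this example; the actual schema extension is still an open design question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuyerStep:
    """One step of a hypothetical role: "buyer" storyboard."""
    scenario_id: str        # selects the fixture publisher's canned behavior
    fixture_state: str      # e.g. "slow_response", "webhook_drop"
    expected_action: str    # e.g. "retry", "challenge", "stop", "reconcile"


def retried_with_same_key(trace: list[dict]) -> bool:
    """Deterministic check: at least one retry, every call sharing one idempotency key."""
    keys = [call.get("idempotency_key") for call in trace]
    return len(keys) >= 2 and keys[0] is not None and len(set(keys)) == 1


# Hypothetical registry: expected action -> deterministic validator.
# Judgment-based (LLM-as-judge) validators would be registered separately.
VALIDATORS = {"retry": retried_with_same_key}


def validate(step: BuyerStep, trace: list[dict]) -> bool:
    """Run the validator registered for the step's expected action on the trace."""
    return VALIDATORS[step.expected_action](trace)


step = BuyerStep("slow-response-01", "slow_response", "retry")
good = [{"idempotency_key": "k1"}, {"idempotency_key": "k1"}]
bad = [{"idempotency_key": "k1"}, {"idempotency_key": "k2"}]
print(validate(step, good), validate(step, bad))  # prints: True False
```

Keeping validators as plain functions over a trace makes the deterministic checks cheap to author and leaves room to gate judgment-based ones at a lower requirement level.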
2. Specialism set (initial)
Rough scenario spine:
buyer-discovery
buyer-planning
buyer-negotiation: TERMS_REJECTED handling, makegood acceptance
buyer-activation
buyer-monitoring
buyer-recovery
orchestrator-multi-agent
3. Compliance track
buyer-orchestrator compliance track (parallel to sell-side tracks).
Open questions
Judgment assertions: how much do we rely on LLM-as-judge vs. deterministic checks? Start deterministic, layer judgment for things like "did the challenge question make sense."
Fixture publisher scope: one reference agent or one-per-specialism? Single agent with scenario-keyed responses is simpler; per-specialism is easier to author.
Relationship to training agent (see embedded training agent work): the fixture publisher could double as the certification training foil.
Milestone sizing: full epic is likely bigger than 3.1. Minimum viable 3.1 deliverable: schema + fixture publisher + 2 specialisms (discovery, activation) + compliance track stub.
Not in scope
Changes to sell-side storyboards.
Buyer-side schema changes (this is testing infrastructure, not protocol).
Specific LLM judge prompts (separate deliverable).
Related
@adcp/client storyboard runner, static/compliance/source/