feat(tmp): Ed25519 request signing + HPKE TMPX exposure tokens#114

Merged
ohalushchak-exadel merged 13 commits into main from ohalushchak-exadel/adcp-ed25519-auth
May 12, 2026

Conversation

Collaborator

@ohalushchak-exadel ohalushchak-exadel commented May 8, 2026

Summary

Two TMP spec gaps closed end-to-end in the same branch, plus the v3.0.7 schema bump and the security/code-review hardening pass:

  1. Request authentication (spec §Request Authentication) — Ed25519 signing on every router→provider fan-out (X-AdCP-Signature / X-AdCP-Key-Id), per-provider binding, daily-epoch replay window. The previous tmproto/signing.go predated the current spec (wrong field set, no provider_endpoint_url binding, no JCS) and the router/reference-agents sent and accepted unsigned requests. The misleading "TMP no longer defines request-level signing" comment in router/router.go is removed.
  2. TMPX exposure tokens (spec §TMPX Exposure Tokens) — HPKE mode_base with the spec-fixed cipher suite (DHKEM(X25519, HKDF-SHA256) / HKDF-SHA256 / ChaCha20-Poly1305). Until now IdentityMatchResponse.Tmpx was defined in the schema but no agent in the repo populated it.
  3. AdCP schema bundle bumped to v3.0.7 (Sigstore-verified) — types_gen.go is the version-header bump; the wire-level change is in call-adcp-agent skill error-envelope semantics (property_name instead of field inside discriminator, hint / allowedValues reclassified as SDK-side enrichment).

What changed

Protocol layer (tmproto/)

  • signing.go — newline-joined signing input for context match; hex(SHA-256(JCS(canonical_object))) for identity match; per-provider URL binding; daily-epoch replay window (current + previous); revocation via revoked_at. Signer.PrivateKey is unexported — derived material flows only through PublicJWK().
  • jcs.go — stdlib-only RFC 8785 (JSON Canonicalization Scheme). Object keys sort by UTF-16 code units. Non-integer floats are rejected (Go's strconv.FormatFloat 'g' is not exact ECMA-262; loud failure prevents future drift).
  • verify_middleware.go provides VerifyContextMatchHandler / VerifyIdentityMatchHandler for reference providers. Decoder runs with DisallowUnknownFields so a future-protocol field can't silently disappear from the recomputed signing input.
  • keystore_remote.go — polls the router's /registry/snapshot for signing-key discovery, 5-min TTL. Hardened: https:// required by default (opt-in AllowInsecureScheme for dev), cross-origin redirects denied, body capped at 1 MB, transient empty snapshots retain cached keys, cross-property kid collisions warn-and-keep-first. Lifecycle is Refresh + Run (no double-Start).
  • tmpx.go — HPKE mode_base on stdlib (crypto/ecdh, crypto/hkdf, crypto/sha256) + golang.org/x/crypto/chacha20poly1305. No third-party HPKE framework. Binary-plaintext encoder (16-byte header + entries), TMPX type-ID registry, kid.base64url(enc||ct) wire format. labeledExpand validates length ≤ 0xffff.
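
The context-match signing flow described above (newline-joined input, per-provider URL binding, current plus previous daily epoch) can be sketched roughly as follows. This is a minimal illustration, not the code in tmproto/signing.go: the field order, the UTC day-count epoch, the base64url signature encoding, and every function name here are assumptions.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"sort"
	"strconv"
	"strings"
	"time"
)

// epochOf returns a daily epoch for t.
// Assumption: the epoch is the count of UTC days since the Unix epoch.
func epochOf(t time.Time) int64 { return t.UTC().Unix() / 86400 }

// signingInput joins the signed fields with newlines.
// The field order is hypothetical; the spec defines the real one.
func signingInput(placementID string, packageIDs []string, providerURL string, epoch int64) []byte {
	ids := append([]string(nil), packageIDs...)
	sort.Strings(ids)
	fields := []string{placementID, strings.Join(ids, ","), providerURL, strconv.FormatInt(epoch, 10)}
	return []byte(strings.Join(fields, "\n"))
}

// sign returns a base64url signature bound to one provider URL and one epoch.
func sign(priv ed25519.PrivateKey, placementID string, packageIDs []string, providerURL string, now time.Time) string {
	msg := signingInput(placementID, packageIDs, providerURL, epochOf(now))
	return base64.RawURLEncoding.EncodeToString(ed25519.Sign(priv, msg))
}

// verify recomputes the input with the verifier's own URL and accepts the
// current or the previous epoch.
func verify(pub ed25519.PublicKey, sig, placementID string, packageIDs []string, ownURL string, now time.Time) bool {
	raw, err := base64.RawURLEncoding.DecodeString(sig)
	if err != nil {
		return false
	}
	cur := epochOf(now)
	for _, e := range []int64{cur, cur - 1} {
		if ed25519.Verify(pub, signingInput(placementID, packageIDs, ownURL, e), raw) {
			return true
		}
	}
	return false
}

// roundTrip signs for signedURL and verifies as verifierURL, skewDays later.
func roundTrip(signedURL, verifierURL string, skewDays int) bool {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return false
	}
	now := time.Unix(1700000000, 0)
	sig := sign(priv, "slot-1", []string{"pkg-b", "pkg-a"}, signedURL, now)
	later := now.Add(time.Duration(skewDays) * 24 * time.Hour)
	return verify(pub, sig, "slot-1", []string{"pkg-a", "pkg-b"}, verifierURL, later)
}

func main() {
	const a, b = "https://provider-a.example/tmp", "https://provider-b.example/tmp"
	fmt.Println(roundTrip(a, a, 0)) // same provider, same epoch
	fmt.Println(roundTrip(a, b, 0)) // signature replayed to another provider
	fmt.Println(roundTrip(a, a, 1)) // inside the previous-epoch window
	fmt.Println(roundTrip(a, a, 2)) // stale epoch
}
```

Because the verifier rebuilds the input from its own endpoint URL, a signature captured on one provider's fan-out does not verify at another provider.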

Router (router/)

  • WithTMPSigner option. Signature headers attached on every fan-out, identity-match re-signed per provider.
  • New signing.go with a (placement_id, provider_endpoint_url, epoch) cache for context-match signatures.
  • RegistryProperty.SigningKeys, AttachSigningKey, kid-indexed LookupKey with first-seen-wins on cross-property kid collisions.
  • HandleContextMatch calls Artifact.StripAccess() before fan-out (spec MUST — was a TODO before this PR).
  • HandleIdentityMatch surfaces json.Marshal failure as 500 with request_id instead of fanning out a stale body.
  • Fixed misleading comment at lines 73-74.

Wiring

  • cmd/router/main.go — fail-closed unless TMP_ROUTER_SIGNING_DISABLED=true. Attaches public JWK to authorized property RIDs in the registry so providers can fetch it via /registry/snapshot.
  • Reference identity-agent: --registry-url, --allow-unsigned (default off — verification is required), --own-endpoint-url. TMPX flags --tmpx-kid, --tmpx-pubkey-path, --tmpx-country plus required ack TMP_IDENTITY_TMPX_REFERENCE_STUB_ACK=1 (the SHA-512 binary-token stub is non-interoperable with real buyer masters and operators must acknowledge before the agent will start).
  • Reference context-agent: same flag set, same default-on signature verification.
  • Both agents honor flag > env > default precedence (was env-can-set-but-not-clear before).
  • RemoteKeyStore.Run is driven by a process-lifetime context; previous double-Start race fixed.
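
One way to get flag > env > default precedence with the standard flag package is to check whether the flag was passed explicitly, so that `--allow-unsigned=false` can clear a value set in the environment. The sketch below is an assumption about the approach, not the agents' actual code; `resolveBool` and `parseAllowUnsigned` are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"strconv"
)

// resolveBool applies flag > env > default precedence to one setting.
// flagSet reports whether the flag was passed explicitly, even as =false.
func resolveBool(flagSet, flagVal bool, envVal string, def bool) bool {
	if flagSet {
		return flagVal
	}
	if v, err := strconv.ParseBool(envVal); err == nil {
		return v
	}
	return def
}

// parseAllowUnsigned parses args and resolves the setting against env,
// the raw value of an environment variable such as
// TMP_IDENTITY_ALLOW_UNSIGNED.
func parseAllowUnsigned(args []string, env string) (bool, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	allow := fs.Bool("allow-unsigned", false, "skip signature verification")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	explicit := false
	// Visit walks only the flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "allow-unsigned" {
			explicit = true
		}
	})
	return resolveBool(explicit, *allow, env, false), nil
}

func main() {
	// The explicit flag clears the value the environment tried to set.
	v, err := parseAllowUnsigned([]string{"--allow-unsigned=false"}, "1")
	fmt.Println(v, err)
}
```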

Docs

  • docs/network-surface.md:
    • Ed25519 section: X-AdCP-* header names, signed-input shapes, JCS for identity, per-provider binding, revocation grace (now stated explicitly as ~24h, via the previous-epoch window).
    • TMPX section: cipher suite, wire format, plaintext layout, env vars, the reference-impl stub callout.
    • Crypto agility note: one suite at a time; upgrade requires editing constants and dispatching by kid prefix or JWK fields.
    • Keystore hardening (HTTPS-default, redirect denial, body cap, kid-collision policy).

Decisions

  • Replaced tmproto/signing.go API rather than versioning. Only callers were the context-agent benchmarks; updated in this PR.
  • Hand-rolled HPKE rather than pull a framework. Spec fixes one cipher suite; the implementation is ~250 lines, stdlib-only beyond chacha20poly1305. Validated against the RFC 9180 §A.3 test vector byte-exact at every KDF stage (enc, shared_secret, secret, key, base_nonce, ciphertext).
  • Registry-based key distribution per spec direction. Router publishes its public JWK on the property records it's authorized to sign for; agents fetch via the existing /registry/snapshot endpoint with a 5-minute TTL.
  • Reference agents default to fail-closed on signature verification. --allow-unsigned is the explicit opt-out for migration windows. The TEE-bound reference impl shouldn't normalize "no auth" as the path of least resistance.
  • TMPX SHA-512 stub gated behind TMP_IDENTITY_TMPX_REFERENCE_STUB_ACK=1. Real buyer deployments decode UID2 / RampID / MAID per the source graph's encoding; the stub produces tokens no buyer master can decode. Acknowledgment env var prevents accidental production wiring with zero buyer match-rate.
  • No sample-rate verification — verifier runs at 100% in v1. Spec says SHOULD; revisit if perf demands.

Test plan

  • go test ./... clean across tmproto, router, cmd/router, reference/context-agent, reference/identity-agent, e2e, adcp, bench, targeting/*.
  • RFC 9180 §A.3 vector validated byte-exact at every KDF stage.
  • HPKE Seal/Open roundtrip across many random plaintexts.
  • Signing roundtrip: signer → middleware verifier across context match and identity match.
  • Wrong provider_endpoint_url rejected (per-provider binding).
  • Stale epoch rejected (>1 day).
  • Revoked key rejected.
  • Identity dedup + sort produce deterministic signing inputs regardless of input order.
  • Identity-match signatures differ across providers (per-provider re-signing).
  • Context-match signature cache reuses across requests in the same epoch.
  • TMPX plaintext header layout (version, ts, country, nonce, count) verified at the byte level.
  • TMPX type-ID registry sizes (32 / 48 / 16 bytes) match spec.
  • Unknown TMPX type IDs rejected at encode time; unmappable uid_type values dropped at the agent (forward-compat).
  • Artifact.StripAccess runs before context-match fan-out — bearer tokens never reach providers (test asserts on the forwarded bytes).
  • RemoteKeyStore rejects non-HTTPS URLs by default, denies cross-origin redirects, retains cached keys on transient empty snapshots.
  • Cross-property kid collisions warn and keep first-seen entry in both the registry index and the remote keystore parser.
  • Run returns context.Canceled when its context is canceled (no goroutine leak).
  • JCS rejects non-integer floats (test asserts on 1.5); accepts integer floats.
  • Lint clean across tmproto, router, cmd/router, reference/{identity,context}-agent, e2e.

🤖 Generated with Claude Code

@ohalushchak-exadel ohalushchak-exadel marked this pull request as draft May 8, 2026 12:33
@ohalushchak-exadel ohalushchak-exadel changed the title from "feat(tmp): implement spec-conformant Ed25519 request signing" to "feat(tmp): Ed25519 request signing + HPKE TMPX exposure tokens" May 8, 2026
ohalushchak-exadel added a commit that referenced this pull request May 8, 2026
Hardening pass following the in-tree code review of PR #114. No spec or
wire-format changes — every commit-level diff is plumbing, defaults,
or correctness inside the existing surface.

Lifecycle / concurrency
- RemoteKeyStore: split Start into Refresh + Run so the reference agents
  no longer launch two refresh-loop goroutines (one bound to a 10s
  timeout context that fires while the goroutine is still scheduling).
  buildKeyStore in identity-agent and context-agent now does one
  synchronous Refresh and a single Run goroutine driven by a process-
  lifetime context that's deferred-cancelled at main exit.
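
The Refresh + Run split can be pictured as one synchronous fetch followed by a single polling goroutine that exits when its context is cancelled. The type below is a hypothetical stand-in for RemoteKeyStore that only counts refreshes; it is a sketch of the lifecycle shape, not the real implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// keyStore is a hypothetical stand-in for RemoteKeyStore.
type keyStore struct {
	mu        sync.Mutex
	refreshes int
	interval  time.Duration
}

// Refresh performs one synchronous fetch. Here it only counts calls.
func (k *keyStore) Refresh(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	k.mu.Lock()
	k.refreshes++
	k.mu.Unlock()
	return nil
}

// Run refreshes on every tick until ctx is done, then returns ctx.Err().
func (k *keyStore) Run(ctx context.Context) error {
	t := time.NewTicker(k.interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-t.C:
			_ = k.Refresh(ctx) // a failed poll leaves the cached keys in place
		}
	}
}

// runAndCancel does one Refresh, starts exactly one Run goroutine, cancels
// it, and reports whether Run returned context.Canceled.
func runAndCancel() bool {
	ks := &keyStore{interval: time.Millisecond}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if err := ks.Refresh(ctx); err != nil {
		return false
	}
	done := make(chan error, 1)
	go func() { done <- ks.Run(ctx) }()
	cancel()
	return errors.Is(<-done, context.Canceled)
}

func main() {
	fmt.Println(runAndCancel())
}
```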

Network hardening
- RemoteKeyStore: validate URL scheme — https:// by default; http://
  requires AllowInsecureScheme. Reject file://, ftp://, etc.
- Default HTTP client denies cross-origin redirects (the SSRF /
  key-substitution path). Drops snapshot body cap from 10 MB to 1 MB.
- Empty snapshot now retains cached keys rather than wiping the agent
  into 401-everything during a publisher's mid-deploy churn.
- Cross-property kid collision keeps the first-seen entry and warns
  (router/registry.go and the remote keystore parser) so a malicious
  property record can't shadow another tenant's signing-key namespace.
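
A rough sketch of the three network checks listed above (scheme validation, cross-origin redirect denial, body cap), using only net/http and net/url. The helper names are hypothetical, and the real keystore may define "same origin" differently; here it means equal scheme and host, port included.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

const maxSnapshotBytes = 1 << 20 // 1 MB cap, as in the commit message

// checkScheme allows https, and http only when allowInsecure is set.
func checkScheme(raw string, allowInsecure bool) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme == "https" || (u.Scheme == "http" && allowInsecure) {
		return nil
	}
	return fmt.Errorf("unsupported scheme %q", u.Scheme)
}

// sameOrigin compares scheme and host (including port).
func sameOrigin(a, b *url.URL) bool { return a.Scheme == b.Scheme && a.Host == b.Host }

// redirectAllowed reports whether a redirect from one URL to another
// stays on the same origin.
func redirectAllowed(from, to string) bool {
	f, err1 := url.Parse(from)
	t, err2 := url.Parse(to)
	return err1 == nil && err2 == nil && sameOrigin(f, t)
}

// newClient returns a client that refuses redirects leaving the origin of
// the first request.
func newClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if len(via) >= 10 {
				return errors.New("stopped after 10 redirects")
			}
			if len(via) > 0 && !sameOrigin(req.URL, via[0].URL) {
				return errors.New("cross-origin redirect denied")
			}
			return nil
		},
	}
}

// readCapped reads at most maxSnapshotBytes and fails on larger bodies.
func readCapped(r io.Reader) ([]byte, error) {
	b, err := io.ReadAll(io.LimitReader(r, maxSnapshotBytes+1))
	if err != nil {
		return nil, err
	}
	if len(b) > maxSnapshotBytes {
		return nil, errors.New("snapshot body exceeds 1 MB")
	}
	return b, nil
}

func main() {
	fmt.Println(checkScheme("https://router.example/registry/snapshot", false))
	fmt.Println(checkScheme("file:///etc/passwd", true))
	fmt.Println(redirectAllowed("https://router.example/a", "https://evil.example/a"))
	_ = newClient()
	body, err := readCapped(strings.NewReader("{}"))
	fmt.Println(len(body), err)
}
```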

Verifier strictness
- verify_middleware re-parses bodies with DisallowUnknownFields. A
  future-protocol field would otherwise be dropped from the recomputed
  signing input and silently break verification; loud failure is the
  correct posture.
- Dead statusForVerifyError switch removed (every branch returned 401).
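
For illustration, strict decoding with encoding/json looks roughly like this. The request struct is a hypothetical, trimmed-down shape rather than the real tmproto type.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// contextMatchRequest is a hypothetical, trimmed-down request shape.
type contextMatchRequest struct {
	PlacementID string   `json:"placement_id"`
	PackageIDs  []string `json:"package_ids"`
}

// strictDecode fails on any field the struct does not declare, so an
// unknown field cannot be dropped silently before the signing input is
// recomputed.
func strictDecode(body []byte) (contextMatchRequest, error) {
	var req contextMatchRequest
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.DisallowUnknownFields()
	err := dec.Decode(&req)
	return req, err
}

// accepts reports whether body decodes under the strict decoder.
func accepts(body string) bool {
	_, err := strictDecode([]byte(body))
	return err == nil
}

func main() {
	fmt.Println(accepts(`{"placement_id":"slot-1","package_ids":["a"]}`))
	fmt.Println(accepts(`{"placement_id":"slot-1","future_field":true}`))
}
```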

Reference-agent posture (TEE-bound impls default to fail-closed)
- identity-agent and context-agent now require signature verification
  by default. --allow-unsigned (or TMP_{IDENTITY,CONTEXT}_ALLOW_UNSIGNED=1)
  is required to opt out; the previous flag --require-signature is
  removed.
- Flag > env > default precedence is now honored (was env-can-set-but-
  not-clear before).
- TMPX generation is gated behind TMP_IDENTITY_TMPX_REFERENCE_STUB_ACK=1
  because the SHA-512 stub for string→binary token decoding is not
  interoperable with any real buyer master. Operators must acknowledge.

Router error paths
- HandleIdentityMatch now surfaces json.Marshal failure as a 500 with
  request_id rather than fanning out a stale body.
- ContextMatch fan-out now calls Artifact.StripAccess() before
  serializing — spec MUST that was previously a TODO. New test asserts
  bearer tokens never reach providers.

JCS / HPKE / signer
- JCS rejects non-integer floats rather than approximating ECMA-262
  Number.toString. TMP signing inputs only carry integers today; loud
  rejection prevents future drift.
- labeledExpand validates length <= 0xffff and drops the gosec
  suppression — validate, don't silence.
- Signer.PrivateKey is unexported. PublicJWK() remains the only
  public path to derived key material.
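
The JCS number policy can be sketched as a small function that serializes integer-valued floats and rejects everything else. The name `canonicalNumber` and the 2^53 bound are assumptions made for this example; the bound is the range in which every integer is exactly representable as a float64.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// maxSafeInteger is 2^53. The bound is an assumption of this sketch.
const maxSafeInteger = 1 << 53

// canonicalNumber serializes f only when it is an integer in the safe
// range. Anything else is rejected instead of approximating ECMA-262
// Number.toString.
func canonicalNumber(f float64) (string, error) {
	if math.IsNaN(f) || math.IsInf(f, 0) {
		return "", errors.New("jcs: NaN and Infinity are not valid JSON numbers")
	}
	if f != math.Trunc(f) {
		return "", fmt.Errorf("jcs: non-integer number %v rejected", f)
	}
	if math.Abs(f) > maxSafeInteger {
		return "", fmt.Errorf("jcs: integer %v is outside the safe range", f)
	}
	return strconv.FormatInt(int64(f), 10), nil
}

func main() {
	fmt.Println(canonicalNumber(2.0))
	fmt.Println(canonicalNumber(1.5))
}
```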

Docs
- network-surface.md: revocation grace ~24h via the previous-epoch
  window is now explicit. Crypto agility note added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ohalushchak-exadel ohalushchak-exadel marked this pull request as ready for review May 8, 2026 18:58
Comment thread router/signing.go
ohalushchak-exadel added a commit that referenced this pull request May 11, 2026
PR #114 review caught a real bug: contextSignatureCacheKey was
(placement_id, provider_endpoint_url, epoch), but the Ed25519 signing
input also covers sorted package_ids. The spec mandates that
package_ids is constant per placement — under spec-compliant traffic
the cache is correct — but package_ids is publisher-controlled, so
violating that invariant turns into a signature/body mismatch the
provider has to reject, with no obvious upstream cause.

Add packageIDsKey (sorted, comma-joined — same shape the signing
input uses) to the cache key. Now distinct package sets get distinct
cache entries, and two requests differing only in package_id order
share one entry (the signing input sorts them anyway).
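
A minimal sketch of the corrected cache key, assuming a comparable struct used as a map key. The struct layout and `newCacheKey` are hypothetical; `packageIDsKey` follows the sorted, comma-joined shape described above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cacheKey is a hypothetical comparable key for the signature cache.
type cacheKey struct {
	placementID string
	providerURL string
	epoch       int64
	packageIDs  string
}

// packageIDsKey sorts a copy of ids and joins it with commas, so the key
// does not depend on the order in which the caller listed the packages.
func packageIDsKey(ids []string) string {
	sorted := append([]string(nil), ids...)
	sort.Strings(sorted)
	return strings.Join(sorted, ",")
}

// newCacheKey builds the key from the same fields the signing input covers.
func newCacheKey(placementID, providerURL string, epoch int64, ids []string) cacheKey {
	return cacheKey{placementID, providerURL, epoch, packageIDsKey(ids)}
}

func main() {
	const u = "https://provider.example/tmp"
	ab := newCacheKey("slot-1", u, 20000, []string{"a", "b"})
	ba := newCacheKey("slot-1", u, 20000, []string{"b", "a"})
	a := newCacheKey("slot-1", u, 20000, []string{"a"})
	fmt.Println(ab == ba) // reordered package set
	fmt.Println(ab == a)  // different package set
}
```

Note that a comma-joined key is only unambiguous if package IDs cannot contain commas, which this sketch takes for granted.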

Tests assert (a) different package sets yield different signatures
and a sig minted for set A fails verification on set B, (b)
order-independent package sets share a cache entry, (c) the existing
same-input cache reuse and epoch separation still hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ohalushchak-exadel and others added 11 commits May 11, 2026 20:00
The TMP spec mandates Ed25519 request authentication on every router→provider
fan-out (`X-AdCP-Signature` / `X-AdCP-Key-Id`), but the router and reference
agents in this repo were sending and accepting unsigned requests. The existing
tmproto/signing.go also predated the current spec — wrong field set, no
provider_endpoint_url binding, no JCS for identity match. This change wires
the spec envelope end-to-end and removes the misleading "TMP no longer
defines request-level signing" comment in router.go.

Highlights:

- `tmproto/signing.go` rewritten to spec: newline-joined input for context
  match, hex(SHA-256(JCS(canonical_object))) for identity match, daily-epoch
  replay window with previous-epoch tolerance, per-provider URL binding,
  revocation honoring via `revoked_at`.
- `tmproto/jcs.go` — small RFC 8785 JSON Canonicalization Scheme
  implementation, stdlib-only (preserves the no-deps invariant).
- `tmproto/verify_middleware.go` — `VerifyContextMatchHandler` /
  `VerifyIdentityMatchHandler` for reference providers. Reads body once,
  parses, verifies, replays body to the inner handler.
- `tmproto/keystore_remote.go` — `RemoteKeyStore` polls the router's
  `/registry/snapshot` for signing keys (5-min TTL per spec recommendation).
- `router/signing.go` — per-provider signing helpers, `(placement_id,
  provider_endpoint_url, epoch)` cache for context-match signatures.
- `router/router.go` — `WithTMPSigner` option, signature headers attached on
  every fan-out, identity-match re-signed per provider for URL binding.
  Misleading comment at lines 73-74 replaced with accurate spec reference.
- `router/registry.go` — `RegistryProperty.SigningKeys`, `AttachSigningKey`,
  and a kid-indexed `LookupKey` so the registry doubles as a `KeyStore`.
- `cmd/router/main.go` — fail-closed when no key configured (unless
  `TMP_ROUTER_SIGNING_DISABLED=true`); attaches public JWK to authorized
  property RIDs in the registry.
- Reference identity-agent and context-agent — `--registry-url`,
  `--require-signature`, `--own-endpoint-url` flags plumbed through the
  middleware. Default is permissive (warns on unsigned) for migration.
- `docs/network-surface.md` — section rewritten to match the spec exactly:
  X-AdCP-* header names, signed-input shapes, JCS for identity, per-provider
  binding, revocation. Env-var table updated.

Test coverage: roundtrip, wrong-endpoint rejection, stale-epoch rejection,
revoked-key rejection, malformed-signature rejection, per-provider binding,
identity dedup/sort, JCS sorted keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds the TMPX exposure-token wire format defined in
docs/trusted-match/specification.mdx §"TMPX Exposure Tokens" — until now
the field existed in the schema but no agent in the repo populated it.

tmproto/tmpx.go implements HPKE mode_base for the spec's fixed cipher
suite (DHKEM(X25519, HKDF-SHA256) / HKDF-SHA256 / ChaCha20-Poly1305) on
top of stdlib (crypto/ecdh, crypto/hkdf, crypto/sha256) and
golang.org/x/crypto/chacha20poly1305. The seal + binary plaintext
encoder validate against the RFC 9180 §A.3 test vector
(enc/shared_secret/secret/key/base_nonce/ct all byte-exact).

The reference identity-agent gains --tmpx-kid, --tmpx-pubkey-path, and
--tmpx-country flags. When configured, every identity-match response
with at least one eligible package carries a TMPX token sealed under the
buyer-cluster X25519 public key. The string-to-binary token conversion
is a documented reference stub — production buyers replace it with
type-specific decoding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Split the const block in tmpx.go so the typed TmpxFormatVersion isn't
  hiding an untyped tmpxKidMaxLen alongside it (SA9004).
- Annotate the count-byte conversion with //nolint:gosec — len(entries)
  is bounds-checked to ≤255 immediately above (G115).
- Convert through int instead of byte in the test assertion (G115).
- main.go: ReadFile receives the operator-configured TMPX public key
  path, exactly the contract gosec flags as G304. Annotate.
- tmpx_test.go: UserToken values are test fixtures, not real credentials.
  Shorten the strings and annotate the two G101 sites.

The TMPX HPKE implementation in tmproto/ pulled in
golang.org/x/crypto/chacha20poly1305. Sub-modules need their own
go.sum entries for the transitive dependency — cmd/router,
reference/context-agent, and reference/identity-agent were tidied,
e2e was missed.

Re-ran adcp/schemas/download.sh 3.0.7 (Sigstore-verified) and
generate.py. Drift lint clean.

Wire-level changes are confined to the call-adcp-agent skill: error
envelope renames `field` → `property_name` inside `discriminator`
entries and reclassifies `hint` / `allowedValues` as SDK-side
enrichment rather than wire fields. types_gen.go diff is the version
header only; no Go struct field changes in this release.

ecdh.X25519().GenerateKey calls randutil.MaybeReadByte before its
32-byte read, which consumes a single byte with ~50% probability to
defeat callers that depend on a deterministic rand stream. The
fixedKeyReader fixture in TestHPKERFC9180A3Vector held exactly 32
bytes; when MaybeReadByte fired (it did on CI but not on a local Mac),
the key read could fetch only 31 bytes from the buffer and then got
(0, nil) past its end, so io.ReadFull looped forever and the test
timed out at 10m.

Refactored hpkeSealBase to take *ecdh.PrivateKey directly. SealTmpx
generates the ephemeral key from rand.Reader before calling. Tests
construct the ephemeral key via NewPrivateKey, sidestepping
MaybeReadByte entirely. The unused fixedKeyReader is removed.
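
Building the key from fixed bytes with `NewPrivateKey` involves no random source at all, which is what makes the test deterministic. A small sketch, with hypothetical helper names and an arbitrary seed rather than the RFC vector:

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"encoding/hex"
	"fmt"
)

// ephemeralFromSeed builds an X25519 private key from exactly 32 bytes.
// NewPrivateKey reads no randomness, so the result is deterministic.
func ephemeralFromSeed(seed []byte) (*ecdh.PrivateKey, error) {
	return ecdh.X25519().NewPrivateKey(seed)
}

// publicHex derives the public key for a seed made of 32 copies of b.
// It returns "" if the key cannot be constructed.
func publicHex(b byte) string {
	k, err := ephemeralFromSeed(bytes.Repeat([]byte{b}, 32))
	if err != nil {
		return ""
	}
	return hex.EncodeToString(k.PublicKey().Bytes())
}

func main() {
	fmt.Println(publicHex(7) == publicHex(7)) // same seed, same public key
	fmt.Println(publicHex(7) == publicHex(8)) // different seed
}
```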
main bumped x/net to v0.54.0 and x/crypto to v0.51.0 (urlutil); the
sub-modules' go.sum needs the same indirect-dep entries to satisfy
'missing go.sum entry' checks under go test ./...
@ohalushchak-exadel ohalushchak-exadel force-pushed the ohalushchak-exadel/adcp-ed25519-auth branch from 5cd71b3 to f51dc1d Compare May 11, 2026 18:02
Comment thread docs/network-surface.md Outdated
Comment thread docs/network-surface.md Outdated
Comment thread reference/identity-agent/cmd/identity-agent/main.go
Comment thread reference/identity-agent/cmd/identity-agent/main.go Outdated
ohalushchak-exadel and others added 2 commits May 12, 2026 12:59
Addresses two of the spec gaps flagged on PR #114:

1. Replaces the static --tmpx-kid + --tmpx-pubkey-path config with
   --tmpx-encrypt-jwks-url + --tmpx-encrypt-jwks-ttl. The agent polls a
   buyer-published JWKS endpoint (e.g.
   api.staging.interchange.io/.well-known/jwks.json), filters by
   adcp_use=tmpx-encrypt, validates kty=OKP/crv=X25519/alg=HPKE-DHKEM-
   X25519-HKDF-SHA256, and uses the entry with the newest iat for each
   seal — so buyer-side key rotation propagates within the TTL window
   without operator intervention. Same JWKS file can publish
   adcp_use=request-signing keys; the store indexes both.

2. Adds --tmpx-priority for spec-conformant truncation: the comma-
   separated UID type ordering determines which identities survive when
   the 255-byte wire budget would otherwise be exceeded. Without it, an
   over-budget identity set returns a loud error — the spec is explicit
   that "default implementation MUST NOT truncate arbitrarily."
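
Priority-based truncation might look like the sketch below. Everything here is illustrative: the entry shape, the UID type names and their sizes, and the rule that selection stops at the first entry that does not fit are assumptions, not the agent's actual selectTmpxEntries logic.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// entry is a hypothetical identity entry: a UID type and its encoded size.
type entry struct {
	uidType string
	size    int
}

// selectEntries keeps everything when it fits. Otherwise it requires an
// explicit priority order and keeps the highest-priority prefix that fits.
func selectEntries(entries []entry, priority []string, budget int) ([]entry, error) {
	total := 0
	for _, e := range entries {
		total += e.size
	}
	if total <= budget {
		return entries, nil
	}
	if len(priority) == 0 {
		return nil, errors.New("identity set exceeds budget and no priority is configured")
	}
	rank := make(map[string]int, len(priority))
	for i, t := range priority {
		rank[t] = i
	}
	ordered := append([]entry(nil), entries...)
	sort.SliceStable(ordered, func(i, j int) bool {
		ri, oki := rank[ordered[i].uidType]
		rj, okj := rank[ordered[j].uidType]
		if oki != okj {
			return oki // listed types sort before unlisted ones
		}
		return oki && ri < rj
	})
	var kept []entry
	used := 0
	for _, e := range ordered {
		if used+e.size > budget {
			break // assumption: stop at the first entry that does not fit
		}
		kept = append(kept, e)
		used += e.size
	}
	return kept, nil
}

// types joins the UID types of es for display.
func types(es []entry) string {
	names := make([]string, len(es))
	for i, e := range es {
		names[i] = e.uidType
	}
	return strings.Join(names, ",")
}

func main() {
	es := []entry{{"maid", 16}, {"uid2", 32}, {"rampid", 48}}
	kept, err := selectEntries(es, []string{"uid2", "rampid", "maid"}, 80)
	fmt.Println(types(kept), err)
	_, err = selectEntries(es, nil, 80)
	fmt.Println(err)
}
```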

Wire-size math is pre-computed via tmproto.TmpxWireSize so the encoder
never produces a token that would have to be rejected post-seal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

selectTmpxEntries computed the wire-size budget against the kid
currently advertised by the JWKS, so a buyer rotation from a short kid
(e.g. 1 char) to the spec-max 8-char kid could push a previously-
fitting prefix above 255 bytes. The next seal at the new kid width
would silently overflow.

Budget against tmproto.TmpxMaxKidLen instead — always plan for the
worst-case prefix length the spec permits. Cost: ~5 fewer entry bytes
when the live kid is shorter; correctness: rotation between any kid
widths in [1, 8] is always safe.
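
The budgeting argument can be checked with a little arithmetic. The sketch below assumes a token of the form kid + "." + unpadded base64url(enc || ct), a 32-byte enc, a 16-byte plaintext header, and a 16-byte AEAD tag; `wireSize` and `maxEntryBytes` are hypothetical helpers, not tmproto.TmpxWireSize.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

const (
	maxKidLen  = 8   // spec-max kid width, per the commit message
	encLen     = 32  // X25519 encapsulated key
	headerLen  = 16  // TMPX plaintext header
	tagLen     = 16  // ChaCha20-Poly1305 authentication tag
	wireBudget = 255 // wire budget in bytes
)

// wireSize returns the token length for a kid of kidLen characters and
// entryBytes bytes of entries. Assumption: unpadded base64url encoding.
func wireSize(kidLen, entryBytes int) int {
	raw := encLen + headerLen + entryBytes + tagLen
	return kidLen + 1 + base64.RawURLEncoding.EncodedLen(raw)
}

// maxEntryBytes returns the largest entry payload that fits the budget
// when planning for a kid of kidLen characters.
func maxEntryBytes(kidLen int) int {
	n := 0
	for wireSize(kidLen, n+1) <= wireBudget {
		n++
	}
	return n
}

func main() {
	short, worst := maxEntryBytes(1), maxEntryBytes(maxKidLen)
	fmt.Println(short, worst, short-worst)
	// A payload planned against a 1-char kid overflows after rotation
	// to an 8-char kid.
	fmt.Println(wireSize(maxKidLen, short) > wireBudget)
}
```

Under these assumptions, planning for the 8-character kid costs 5 bytes of entry payload compared with a 1-character kid, and a payload sized for the short kid would exceed the budget after rotation.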

tmpxKidMaxLen is exported as TmpxMaxKidLen so callers can use it in
size calculations without re-deriving the constant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread tmproto/keystore_jwks.go
@ohalushchak-exadel ohalushchak-exadel merged commit c996655 into main May 12, 2026
13 checks passed