Skip to content

fix(adagents): inline-resolution path for publisher_properties selectors + publisher_domains[] fan-out#750

Open
bokelley wants to merge 2 commits into
mainfrom
claude/issue-749-publisher-properties-inline-resolution
Open

fix(adagents): inline-resolution path for publisher_properties selectors + publisher_domains[] fan-out#750
bokelley wants to merge 2 commits into
mainfrom
claude/issue-749-publisher-properties-inline-resolution

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

@bokelley bokelley commented May 20, 2026

Summary

Closes Part 1 of #749. Layers inline-resolution on top of PR #753's publisher_domains[] fan-out + revoked_publisher_domains[] infrastructure (already merged to main).

_resolve_agent_properties now returns resolved property dicts (not selector dicts) for publisher_properties authorization_type, sourced from the parent file's top-level properties[] indexed by publisher_domain.

What changed

Concern Before After
publisher_properties return shape Selector dicts Resolved property dicts (inline-resolved against parent's properties[])
Per-domain lookup O(N) linear scan inside selector loop O(1) via pre-built domain → [property] index
Unknown selection_type Permissive — returned all candidates Fail-closed — returns []
Empty property_tags / property_ids Permissive — returned all candidates Fail-closed — returns []
Property missing property_id Silent collapse via dedup Fail-fast — raises AdagentsValidationError
filter_revoked_selectors post-pass in get_properties_by_agent Operated on selector dicts Removed (structurally redundant — revoked_top_level pre-filter handles it via the index)

Revocation

revoked_publisher_domains[] is honored transparently: the existing revoked_top_level pre-filter at the caller strips revoked-domain properties before they enter the per-domain index, so inline resolution naturally skips them. Two pre-existing tests (TestRevokedPublisherDomains::test_revocation_filters_compact_form_selectors, test_revocation_filters_singular_selectors) updated to assert on resolved property dicts under the new contract.

Performance

Cafemedia/interchange.io scale fixture: 6,843 inline properties × 6,800 child domains × selection_type=by_tag + property_tags=["raptive_managed"]. Test is wall-clock-bounded (time.perf_counter() < 2.0s) to catch O(N×M) regressions. Without the domain index, the equivalent linear scan would do ~46M comparisons.

Federated fallback

When inline resolution returns no match for a selector's domain, the selector is silently skipped — the resolver is sync and cannot perform HTTP fetches. The async federated fallback path lives in companion PR #752's helpers (fetch_agent_authorizations_from_directory, detect_publisher_properties_divergence).

Test plan

  • ruff check src/ — clean
  • mypy src/adcp/adagents.py — clean
  • pytest tests/test_adagents.py tests/test_public_api.py182 passed
  • Cafemedia scale fixture finishes in < 2s wall-clock
  • Pre-commit hooks (black/ruff/mypy/bandit) pass

Spec references

  • adcp#4827 — Resolution-paths inline rule (parent-file properties[] indexed by publisher_domain)
  • adcp#4504 / adcp#753 — publisher_domains[] compact form (already on main)
  • CLAUDE.md "no fallbacks" — authorization-decision paths fail-closed on unknown / empty inputs

Companion work

bokelley added a commit that referenced this pull request May 21, 2026
Tightens authorization-decision semantics in `_resolve_agent_properties`:

- Dedup key is now `(publisher_domain, property_id)` instead of bare `property_id`,
  so the same id under different parent domains stays distinct and missing-id
  properties no longer collapse to a single entry. Missing `property_id` now
  raises `AdagentsValidationError` (fail-fast per CLAUDE.md).
- New `_validate_publisher_properties_selector` rejects two illegal selector
  shapes per adcp#4827: mixing scalar `publisher_domain` with array
  `publisher_domains`, and pairing `selection_type='by_id'` with the compact
  `publisher_domains[]` fan-out form (no defined per-domain id partitioning).
- `_resolve_inline` fails closed on unknown `selection_type` (returns `[]`
  instead of all candidates) and on `selection_type='by_tag'` with empty
  `property_tags` — both are authorization decisions where a permissive
  fallback would silently grant access on schema drift.
- Documents that parent-file `revoked_publisher_domains[]` (when implemented)
  is honored transparently because revoked domains are pre-filtered out of the
  per-domain index before resolution.

Adds tests covering each rejection path, the new fail-closed behaviors, and
the missing-property_id fail-fast. Adds a 2.0s wall-clock bound to the
cafemedia scale test to catch O(N×M) regressions without depending on
pytest-timeout (not currently a project dep).
@bokelley bokelley force-pushed the claude/issue-749-publisher-properties-inline-resolution branch from 0b70d33 to 02f76c3 Compare May 21, 2026 13:10
@bokelley bokelley marked this pull request as ready for review May 21, 2026 13:52
bokelley added a commit that referenced this pull request May 21, 2026
… 2+3 of #749)

Squashed onto main to resolve conflicts with #753 (publisher_domains compact
form, revoked_publisher_domains) and main HEAD. Final state of this PR:

- fetch_agent_authorizations_from_directory: async wrapper for GET /v1/agents/
  {agent_url}/publishers (per adcp#4823 / adcp#4828). Returns AgentDirectoryLookup
  with spec-conformant envelope (agent_url, directory_indexed_at, publishers,
  next_cursor). Accepts include=["properties"] to request per-publisher
  property_ids[] (per adcp#4894).
- detect_publisher_properties_divergence: full (publisher_domain, property_id)
  set-diff when directory returns property_ids[]; falls back to count-only
  comparison against directories that haven't deployed adcp#4894 yet.
  max_concurrency=20 default semaphore caps concurrent fetches at managed-
  network scale. sample_size=200 default opt-in for full sweep.
- AgentPublisherEntry / AgentDirectoryLookup / PublisherDivergence /
  DivergenceReport dataclasses; DiscoveryMethod / PublisherStatus Literal enums.
- AAO_PUBLISHER_DIRECTORY_URL env var supported (matches salesagent convention).
- Tests use respx.mock against spec-shaped JSON payloads.

Closes #749 Parts 2 and 3. Companion to #750 (Part 1: inline-resolution).
aao-ipr-bot[bot]
aao-ipr-bot Bot previously approved these changes May 21, 2026
Copy link
Copy Markdown

@aao-ipr-bot aao-ipr-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right shape. Resolved-property dicts are what adcp#4827 actually requires from get_properties_by_agent; selector dicts were intermediate state. Fail-closed on unknown selection_type and empty selector criteria matches CLAUDE.md "no fallbacks" in authorization paths.

Things I checked

  • "Breaking wire shape?" — 352d1bb6 (PR #753, which introduced the selector-dict return) is on main but not in v5.6.0. The selector-dict shape has never been released, so feat(adagents): rather than feat!: is fine: no adopter ever saw it on the wire. release-please will bundle #753 + #750 in the same minor and adopters land directly on the resolved-property contract.
  • Public-API surface (src/adcp/__init__.py:26-29, 837, 839): exports unchanged, signatures unchanged, tests/fixtures/public_api_snapshot.json:355 not touched. Return-shape change only.
  • Revocation pre-filter at src/adcp/adagents.py:1453-1462 and :1541-1550 strips revoked-domain properties before the per-domain index is built, so revocation is honored by construction — the removed filter_revoked_selectors post-pass was structurally redundant.
  • Fail-closed on every selection_type arm: selector.get("property_tags", []) or [] collapses None, empty list, and missing key to the same empty set, then if not wanted_tags: continue (src/adcp/adagents.py:1324, 1333) bails uniformly across by_tag and by_id.
  • Dedup key (domain, property_id) is scoped per-agent — two agents authorized for the same property each get one copy each, which is correct.
  • ad-tech-protocol-expert: sound — selection_type ∈ {all, by_tag, by_id} is the complete spec enum from schemas/cache/3.0/core/publisher-property-selector.json; test_unknown_selection_type_returns_empty("by_category") is a synthetic unknown, not a spec value being silently dropped.
  • security-reviewer: clean — no authorization-bypass paths; dedup-key scoping correct; revocation pre-filter blocks revoked domains by construction; mid-iteration AdagentsValidationError on missing property_id is correct fail-fast posture, not a DoS vector.
  • Three pre-existing tests updated to the new contract (test_resolve_compact_form_via_get_properties_by_agent, test_revocation_filters_compact_form_selectors, test_revocation_filters_singular_selectorstests/test_adagents.py:3224-3431). Comments at L3401-L3403 and L3427-L3429 spell out why the new assertions are equivalent under the new contract, which is the right amount of explanation for future readers.

Follow-ups (non-blocking — file as issues)

  • Cite adcp#4827 in the resolver docstring, not just the commit message. _resolve_publisher_property_selectors docstring (src/adcp/adagents.py:1269-1292) lays out the rule without linking the spec issue that authorizes it. The commit message and PR description both name it; the code itself should too.
  • Type-guard property_tags / property_ids selector values. src/adcp/adagents.py:1323, 1332selector.get("property_tags", []) or [] falls through if the value is non-list-truthy (string "ctv" iterates to {"c","t","v"}). Structural validation lives at validate_publisher_properties_item, so the resolver inherits the loose typing intentionally, but an isinstance(..., list) guard ahead of the set comprehension fails closed for the malformed case and matches the resolver's overall typing discipline. (Same pattern exists in the pre-existing property_tags / property_ids authorization branches at L1230 and L1239 — fixing it consistently is the larger move.)
  • Domain index is rebuilt per-agent inside get_all_properties's loop. src/adcp/adagents.py:1475 calls _resolve_agent_properties per agent; every publisher_properties agent reconstructs the same top_level_properties → domain index. At N-agents × M-properties this wastes work. Lifting the index build up to get_all_properties (build once, thread through _resolve_agent_properties) closes the N-agent regression vector. The scale test only exercises one agent, so it doesn't catch this.
  • Document the "no federated fallback in this resolver" behavior on the public docstrings. get_properties_by_agent (src/adcp/adagents.py:1509-1528) and get_all_properties (L1427-L1441) now silently resolve to [] for any publisher_properties selector whose target domain isn't inlined in the parent file. A downstream sales-agent wiring this as the sole authorization check will under-authorize when authorization is federated-only. One sentence pointing at companion PR #752's fetch_agent_authorizations_from_directory is enough.

Minor nits (non-blocking)

  1. 2.0s wall-clock budget on the scale test. tests/test_adagents.py:1818 — indexed path runs in milliseconds; budget catches O(N×M) regressions which take seconds. Margin is comfortable but it's the only timing assertion in the file, and shared GitHub runners can stall. Bumping to 5.0s costs nothing and preserves the O(N×M) gate.
  2. Asymmetry between fan-out and resolver is defensible but worth pinning. _fanout_publisher_properties (src/adcp/adagents.py:1353-1386) drops malformed entries silently — structural validation lives at validate_publisher_properties_item. _resolve_publisher_property_selectors (:1339-1344) raises on a matching property missing property_id — authorization integrity demands fail-fast. The two policies are correct, but a one-line comment at the raise site noting "structural validation is upstream; authorization integrity is fail-fast here" stops the next reader from reading it as inconsistency.

LGTM. Follow-ups noted below.

…tors (#749 Part 1)

Layers inline-resolution on top of PR #753's publisher_domains[] fan-out
and revoked_publisher_domains[] support. _resolve_agent_properties now
returns resolved property dicts (not selector dicts) for publisher_properties
authorization_type, sourced from the parent file's top-level properties[]
indexed by publisher_domain.

- Pre-builds domain → properties index once per call so per-domain lookups
  are O(1) — avoids O(N×M) at cafemedia scale (6,843 properties × 6,800
  domains = ~46M comparisons under the old linear scan).
- Inline resolution honors revoked_publisher_domains[] transparently via
  the existing revoked_top_level pre-filter (revoked domains never enter
  the per-domain index, so they're skipped by construction).
- Fail-closed on unknown selection_type and empty selector criteria
  (property_tags=[] / property_ids=[]) per CLAUDE.md "no fallbacks" in
  authorization-decision paths. Fail-fast on properties missing the
  required property_id field.
- Cafemedia/interchange.io scale test (6,843 properties × 6,800 domains)
  is wall-clock-bounded to catch O(N×M) regressions.
- Two pre-existing TestRevokedPublisherDomains tests + the
  TestPublisherDomainsCompactForm resolution test updated to assert on
  resolved property dicts (the new contract) instead of selector dicts.
- Dead filter_revoked_selectors post-pass removed from
  get_properties_by_agent and get_all_properties — revocation is enforced
  upstream via the index, so the post-filter was structurally redundant.

Closes #749 Part 1. Federated fallback (when inline yields no match for a
domain) lives in companion PR #752's async helpers.
@bokelley bokelley force-pushed the claude/issue-749-publisher-properties-inline-resolution branch from 02f76c3 to d3a9ce2 Compare May 21, 2026 14:17
@bokelley bokelley enabled auto-merge (squash) May 21, 2026 14:17
@bokelley bokelley disabled auto-merge May 21, 2026 14:41
- Cite adcp#4827 in _resolve_publisher_property_selectors docstring.
- isinstance(list) guard on property_tags / property_ids in both
  publisher_properties resolution and the pre-existing property_tags /
  property_ids authorization branches — string-iterating-as-tags is now
  caught and resolves to [].
- Lift domain index build out of per-agent loop; _build_domain_index
  module helper used by both get_all_properties and get_properties_by_agent,
  index passed through _resolve_agent_properties → _resolve_publisher_
  property_selectors. O(N+M) instead of O(agents × N).
- Document "no federated fallback" on get_properties_by_agent and
  get_all_properties docstrings; point at PR #752's federated helpers.
- Cafemedia scale test wall-clock budget 2.0s → 5.0s for shared-runner
  margin.

Bot review: #750
Copy link
Copy Markdown

@aao-ipr-bot aao-ipr-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Follow-ups noted below. Inline-resolution wired through the right seam — revoked_top_level feeds _build_domain_index, so revocation is a structural property of the index rather than a post-pass, and filter_revoked_selectors falls out as redundant by construction.

Things I checked

  • _resolve_publisher_property_selectors (src/adcp/adagents.py:1299-1385) faithfully implements adcp#4827 §Resolution-paths inline path: lookup by publisher_domain, dispatch on selection_type ∈ {all, by_tag, by_id}, fail-closed on anything else. Empty property_tags=[] / property_ids=[] is consistent with the schema's minItems: 1 upstream — defense in depth, not divergence.
  • _fanout_publisher_properties (src/adcp/adagents.py:1388) implements adcp#4504's "logically equivalent to repeating the entry once per listed domain" — strips publisher_domains, preserves every other key.
  • Revocation transparency: revoked_top_level (adagents.py:1500-1508, 1604-1612) strips revoked-domain properties before index build, so the per-domain lookup in _resolve_publisher_property_selectors cannot resolve revoked properties. Matches adcp#4827's "resolves to the empty set for that domain."
  • Dedup key (publisher_domain, property_id) at adagents.py:1338,1380 — correct because property_id is publisher-scoped per publisher-property-selector.json's by_id description.
  • _build_domain_index is built once per public call — verified by test_get_all_properties_builds_index_once at tests/test_adagents.py:1944. Index threaded as parameter through _resolve_agent_properties so per-agent calls reuse it.
  • New isinstance(list) guards at adagents.py:1237-1240, 1249-1252 close the previously-latent "string iterated char-by-char as tags" footgun in the top-level property_ids / property_tags branches too — not just the publisher_properties resolver. test_malformed_property_tags_value_resolves_empty covers the by_tag selector case.
  • 6,843-property scale fixture (tests/test_adagents.py test_get_properties_by_agent_cafemedia_scale) — wall-clock-bounded at 5.0s. Catches gross O(N×M) regressions but is loose; see follow-ups.
  • code-reviewer: sound-with-caveats — no blockers. ad-tech-protocol-expert: sound-with-caveats — single named divergence on property_id optionality (follow-up below).

Follow-ups (non-blocking — file as issues)

  • property_id fail-fast vs. schema optionality. static/schemas/source/core/property.json does not put property_id in required — it's documented as optional. The raise at adagents.py:1375-1379 will collapse a spec-conformant file with selection_type: "all" matching ID-less properties. Two reasonable paths: tighten adcp's property.json to require property_id when referenced via publisher_properties (cleaner) or downgrade the SDK to skip-with-warning on ID-less properties (safer). The current SDK opinion is defensible under CLAUDE.md "no fallbacks" but goes beyond the spec — worth a tracking issue so we don't surprise an early adopter.
  • filter_revoked_selectors semantic drift. Still exported from adcp/__init__.py:31,840, but the resolved-property dicts that get_properties_by_agent now returns are no longer selector-shaped (they carry publisher_domain as a real property field, so the filter still mechanically works as a no-op-or-filter, but the docstring at adagents.py:1445-1459 reads as if it's meant for selector pipelines). Either deprecate or rewrite the docstring to pin the intended caller (_fanout_publisher_properties output, not public-API return values).
  • AdagentsValidationError propagation. The raise inside _resolve_publisher_property_selectors now flows out through verify_agent_for_property and fetch_agent_authorizations. Confirm the "one malformed property collapses the whole publisher" behavior is intended, and add a test asserting fetch_agent_authorizations skips a publisher with a malformed inline property rather than crashing the whole batch.
  • get_properties_by_agent lacks a _build_domain_index once-call assertion. test_get_all_properties_builds_index_once only covers get_all_properties. A symmetric test prevents a future refactor from re-introducing per-agent index rebuilds in the single-agent path.

Minor nits (non-blocking)

  1. Scale budget mismatch. PR body says time.perf_counter() < 2.0s; tests/test_adagents.py test_get_properties_by_agent_cafemedia_scale asserts < 5.0. The 5.0s margin survives slow shared runners but is loose enough that a partial O(N×M) regression could still pass. Either tighten to the originally-claimed 2.0s with retries on cold cache, or update the PR body to reflect the 5.0s reality.
  2. property_ids negative-path coverage gap. tests/test_adagents.py has test_malformed_property_tags_value_resolves_empty but no analog for authorization_type: "property_ids" with a string value (the new guard at adagents.py:1238) or for selection_type: "by_id" with property_ids: "abc". One-line tests close the regression window.
  3. Dedup docstring at adagents.py:1330 could note why: "overlapping by_tag/by_id selectors on the same publisher_domain can match the same property." Worth two extra words.
  4. Conventional-commit wording. Squash-merge title is fix(adagents): but this adds new resolution semantics on top of unreleased PR #753 — closer to feat:. Not load-bearing since nothing here is in a tagged release (git tag --contains 352d1bb6 is empty), so the major-version question doesn't arise. Worth a notable mention because release-please will read whatever lands.

Approving on the strength of the spec-correct fail-closed semantics plus the once-per-file index pattern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant