RFC: Deterministic property fingerprints for registry-free identification #2195

@bokelley

Summary

Add a property fingerprint (property_fid) — a deterministic identifier any agent can compute independently from a property's identifier type and normalized value, without calling the registry.

Inspired by USBN (Universal Standard Book Number), which solved an analogous problem: 60 million pre-ISBN book editions with no central registry. Their approach — deterministic identifiers computed from the content itself — removes the registry from the critical path.

Problem

The property catalog assigns property_rid (UUID v7) to every addressable property. This works for the top 100K properties that get crawled, resolved, and enriched — but the addressable internet has millions of properties, and the long tail never registers.

Small podcasters, local DOOH venues, community radio stations — these properties exist and take advertising, but nobody crawls their adagents.json (they probably don't have one), no member has resolved them, and no seed dataset covers them. Today, these properties can't participate in TMP because they have no property_rid. The registry is a gatekeeper whether we intend it to be or not.

Proposal

property_fid = "F" + crockford_base32(blake2s(identifier_type + ":" + normalized_value)[0:60bits])

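Result: 13 characters (the F prefix plus 12 Crockford base32 characters, 5 bits each, covering the 60-bit truncated hash). The F prefix distinguishes fids from other identifier formats. Any agent can compute it locally with zero network calls.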
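A minimal Python sketch of the formula above, for illustration only. It assumes the input is already normalized, that the string is UTF-8 encoded before hashing, that BLAKE2s runs unkeyed with its default 32-byte digest, and that "[0:60bits]" means the leading 60 bits read big-endian. None of those details are fixed by this issue; the full spec may choose differently. The name `compute_fid` is hypothetical.

```python
import hashlib

# Crockford base32 alphabet: digits plus letters, excluding I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def compute_fid(identifier_type: str, normalized_value: str) -> str:
    """Return a 13-character property_fid for an already-normalized identifier."""
    payload = f"{identifier_type}:{normalized_value}".encode("utf-8")
    digest = hashlib.blake2s(payload).digest()  # unkeyed, 32-byte digest

    # Assumption: keep the leading 60 bits of the digest (big-endian).
    top60 = int.from_bytes(digest[:8], "big") >> 4

    # 60 bits / 5 bits per symbol = 12 base32 characters, most significant first.
    chars = [CROCKFORD[(top60 >> shift) & 0x1F] for shift in range(55, -1, -5)]
    return "F" + "".join(chars)


fid = compute_fid("venue_id", "jcdecaux-nyc-4521")
print(fid, len(fid))
```

Two agents that agree on the normalization and on the assumptions above would derive the same fid without talking to each other or to the registry.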

Examples

| Input | Normalized | property_fid |
| --- | --- | --- |
| (domain, CNN.com) | domain:cnn.com | F4R7KN2PW8X3M |
| (ios_bundle, com.CNN.iphone) | ios_bundle:com.cnn.iphone | FQ9BT5JMVA6YH |
| (station_id, WCBS-FM) | station_id:wcbs-fm | FG3NR8DK2C7WP |
| (venue_id, JCDecaux-NYC-4521) | venue_id:jcdecaux-nyc-4521 | FXM5HT9QA4J2R |

(Values illustrative — real hashes would differ.)

How it works

Registry-free path (long tail)

A buyer agent encounters venue_id:jcdecaux-nyc-4521 in a bid request. No property_rid exists.

  1. Agent computes property_fid locally: FXM5HT9QA4J2R
  2. Agent uses the fid for targeting, frequency capping, reporting
  3. If the agent later calls POST /api/registry/resolve, the catalog assigns a property_rid and records the fid as a computed alias
  4. All historical data keyed on the fid maps forward to the rid

The property participated in transactions before the registry knew about it.
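The "maps forward" step can be pictured with a small in-memory sketch. This is not the catalog's implementation: the `FidAliasStore` class, its method names, and the placeholder rid string are all hypothetical, and a real system would persist this state rather than hold it in dictionaries.

```python
from collections import defaultdict


class FidAliasStore:
    """Toy store: events are keyed by fid until a rid is linked, then by rid."""

    def __init__(self):
        self.fid_to_rid = {}              # computed alias: fid -> rid
        self.events = defaultdict(list)   # fid or rid -> recorded events

    def record(self, fid, event):
        # Write under the rid if the fid has already been resolved.
        key = self.fid_to_rid.get(fid, fid)
        self.events[key].append(event)

    def link(self, fid, rid):
        # Called once the registry assigns a rid: move history forward.
        self.fid_to_rid[fid] = rid
        self.events[rid].extend(self.events.pop(fid, []))

    def history(self, key):
        # Accepts either a fid or a rid.
        resolved = self.fid_to_rid.get(key, key)
        return list(self.events.get(resolved, []))


store = FidAliasStore()
store.record("FXM5HT9QA4J2R", "impression")        # before the registry knows
store.link("FXM5HT9QA4J2R", "rid-placeholder")     # hypothetical rid value
store.record("FXM5HT9QA4J2R", "click")             # after resolution
print(store.history("rid-placeholder"))
```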

TMP integration

{
  "request_id": "req_abc",
  "property_rid": null,
  "property_fid": "FXM5HT9QA4J2R",
  "property_type": "dooh",
  "placement_id": "screen_1"
}
  • If property_rid is present, use it (no change)
  • If only property_fid is present, match against fid-indexed packages
  • Responses include fid_only: true so buyers know the property is unverified
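The three rules above could be sketched as follows. Only `property_rid`, `property_fid`, `request_id`, and `fid_only` come from this issue; the `match_packages` function, the two index dictionaries, the `packages` response key, and the choice to send `fid_only: false` on the rid path are assumptions made for illustration.

```python
def match_packages(request: dict, rid_index: dict, fid_index: dict) -> dict:
    """Prefer property_rid; fall back to property_fid and flag the response."""
    rid = request.get("property_rid")
    fid = request.get("property_fid")

    if rid is not None:
        # Authoritative path: behaviour is unchanged from today.
        packages, fid_only = rid_index.get(rid, []), False
    elif fid is not None:
        # Registry-free path: the property is unverified.
        packages, fid_only = fid_index.get(fid, []), True
    else:
        packages, fid_only = [], False

    return {
        "request_id": request["request_id"],
        "packages": packages,
        "fid_only": fid_only,
    }


request = {
    "request_id": "req_abc",
    "property_rid": None,
    "property_fid": "FXM5HT9QA4J2R",
    "property_type": "dooh",
    "placement_id": "screen_1",
}
print(match_packages(request, {}, {"FXM5HT9QA4J2R": ["pkg_dooh_nyc"]}))
```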

What this does NOT do

  • Replace property_rid — rid remains authoritative. Fid supplements when the catalog hasn't caught up.
  • Solve identity resolution — linking multiple identifiers to the same property is still the catalog's job.
  • Provide trust or authorization — fid says nothing about who can sell. Authorization still requires adagents.json.
  • Work for wildcards: *.example.com is a matching rule, not a specific property.

Bonus: Normalization test vectors

Independent of fingerprints, the catalog's normalization pipeline should publish a canonical test vector table. Any implementation that normalizes identifiers should be able to verify correctness.

| Input Type | Raw Value | Normalized |
| --- | --- | --- |
| domain | HTTPS://WWW.Example.COM/ | example.com |
| domain | www.example.com | example.com |
| ios_bundle | com.CNN.iPhone | com.cnn.iphone |
| rss_url | HTTPS://Feeds.Example.COM/podcast.xml | https://feeds.example.com/podcast.xml |
| station_id | WCBS-FM | wcbs-fm |
| facility_id | FCC:73953 | fcc:73953 |

This should ship as a JSON fixture alongside the schema.
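To show how an implementation might check itself against such a fixture, here is a rough Python sketch. The normalization rules in it are inferred from the six vectors above and are assumptions, not the catalog's actual pipeline: domains drop the scheme, path, and a leading "www."; RSS URLs lowercase only the scheme and host; everything else is lowercased. The `normalize` function and `TEST_VECTORS` name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(identifier_type: str, raw: str) -> str:
    """Normalize a raw identifier using rules inferred from the test vectors."""
    value = raw.strip()
    if identifier_type == "domain":
        value = value.lower()
        if "://" in value:                 # drop the scheme, if any
            value = value.split("://", 1)[1]
        value = value.split("/", 1)[0]     # drop path and trailing slash
        if value.startswith("www."):       # drop a leading "www."
            value = value[4:]
        return value
    if identifier_type == "rss_url":
        parts = urlsplit(value)
        # Lowercase scheme and host only; the path is left untouched.
        return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                           parts.path, parts.query, parts.fragment))
    return value.lower()


# The vectors from the table above, as (type, raw, expected) tuples.
TEST_VECTORS = [
    ("domain", "HTTPS://WWW.Example.COM/", "example.com"),
    ("domain", "www.example.com", "example.com"),
    ("ios_bundle", "com.CNN.iPhone", "com.cnn.iphone"),
    ("rss_url", "HTTPS://Feeds.Example.COM/podcast.xml",
     "https://feeds.example.com/podcast.xml"),
    ("station_id", "WCBS-FM", "wcbs-fm"),
    ("facility_id", "FCC:73953", "fcc:73953"),
]

failures = [(t, raw) for t, raw, want in TEST_VECTORS if normalize(t, raw) != want]
print("failures:", failures)
```

A published fixture matters for fingerprints in particular: two agents that normalize differently would hash different strings and derive different fids for the same property.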

Open questions

  1. Should the resolve endpoint accept fids? Probably yes — agents with only a fid should be able to ask "do you know this one?"
  2. Should fids appear in the change feed? Adds ~13 bytes per event but lets non-computing consumers index by fid.
  3. Collision handling: 60 bits → birthday-bound collision probability of ~4.3×10⁻⁵ at 10M properties. When detected, log it and force resolution via rid.
  4. Should @adcp/client ship a computeFid(type, value) utility? I think yes — that's the whole point.
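The collision estimate in question 3 can be reproduced with the standard birthday approximation, P ≈ 1 − exp(−n(n−1) / (2·2^bits)), assuming fids behave like uniformly random 60-bit values. The helper name `collision_probability` is hypothetical.

```python
import math


def collision_probability(n: int, bits: int = 60) -> float:
    """Approximate probability of at least one collision among n random fids."""
    space = 2 ** bits
    exponent = n * (n - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-exponent)


p = collision_probability(10_000_000)
print(f"{p:.2e}")
```

At 10M properties this lands at roughly 4.3×10⁻⁵, matching the figure quoted above.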

Full RFC

Full spec in specs/property-fingerprints.md — includes algorithm details, schema changes, wire format, and design rationale.
