feat(registry): add source column to discovered_properties for property reconciliation #4125
Draft
Resolves the known limitation in hosted-property-sync where removed
properties persisted indefinitely. Adds a `source` column
('crawler' | 'aao_hosted') to `discovered_properties` (migration 467) so
the hosted-property sync can identify which rows it wrote and reconcile
(delete) them when the publisher removes properties from their manifest.
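As a rough illustration of how the new column's default might surface in application code, the sketch below models the two `source` values and the fallback to `'crawler'`. The `UpsertPropertyInput` shape, its field names, and the `resolveSource` helper are hypothetical and not taken from the repository; only the value set and the `'crawler'` default come from this PR.

```typescript
// The two values this PR names for discovered_properties.source.
type PropertySource = 'crawler' | 'aao_hosted';

// Hypothetical input shape; the real upsert interface may differ.
interface UpsertPropertyInput {
  domain: string;
  name: string;
  propertyType: string;
  source?: PropertySource; // optional, so existing callers stay valid
}

// Mirrors the column default: callers that omit `source` count as crawler writes.
function resolveSource(input: UpsertPropertyInput): PropertySource {
  return input.source ?? 'crawler';
}

// An existing caller that never passes `source`.
const legacyCall = resolveSource({
  domain: 'example.com',
  name: 'Home',
  propertyType: 'website',
});

// The hosted-property sync labelling its own write.
const hostedCall = resolveSource({
  domain: 'example.com',
  name: 'Home',
  propertyType: 'website',
  source: 'aao_hosted',
});

console.log(legacyCall, hostedCall);
```

Keeping the field optional on the caller side while the column itself is `NOT NULL` with a default is what lets existing crawler code paths compile and run unchanged.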
Key design decisions:
- The hosted manifest is authoritative for the publisher's property list;
any row not in the current manifest is deleted, regardless of source.
- On conflict with a crawler-attested row (source='crawler'), the crawler's
source label, identifiers, and tags are preserved — origin-verified facts
take precedence over hosted-manifest values.
- Property upserts and the reconcile DELETE run inside a single transaction
guarded by a domain-scoped advisory lock to prevent concurrent-sync races.
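The first two decisions above can be sketched as a pure in-memory function. This is a minimal model under stated assumptions, not the sync's actual code: the row shape, the use of plain string arrays for `identifiers` and `tags`, and the names `reconcile` and `keyOf` are all hypothetical, and the transaction and advisory lock are deliberately left out. Rows are keyed on (name, property_type), as described in the PR summary.

```typescript
type PropertySource = 'crawler' | 'aao_hosted';

// Hypothetical row shape for one domain's discovered_properties rows.
interface PropertyRow {
  name: string;
  propertyType: string;
  identifiers: string[];
  tags: string[];
  source: PropertySource;
}

// Manifest entries carry no source; rows written by the sync are stamped 'aao_hosted'.
type ManifestProperty = Omit<PropertyRow, 'source'>;

// Key on (name, property_type) so a reclassified property counts as a different row.
function keyOf(p: { name: string; propertyType: string }): string {
  return JSON.stringify([p.name, p.propertyType]);
}

// Returns the rows the domain should hold after one sync.
function reconcile(existing: PropertyRow[], manifest: ManifestProperty[]): PropertyRow[] {
  const byKey = new Map<string, PropertyRow>();
  for (const row of existing) byKey.set(keyOf(row), row);

  // Only manifest keys are emitted, so any existing row whose key is absent
  // from the manifest is dropped (deleted), regardless of its source.
  return manifest.map((entry): PropertyRow => {
    const current = byKey.get(keyOf(entry));
    // Conflict with a crawler-attested row: keep its identifiers, tags, and source.
    if (current !== undefined && current.source === 'crawler') return current;
    return { ...entry, source: 'aao_hosted' };
  });
}

const after = reconcile(
  [
    { name: 'Home', propertyType: 'website', identifiers: ['crawled-id'], tags: ['verified'], source: 'crawler' },
    { name: 'Shop', propertyType: 'website', identifiers: [], tags: [], source: 'crawler' },
    { name: 'Old', propertyType: 'website', identifiers: [], tags: [], source: 'aao_hosted' },
  ],
  [
    { name: 'Home', propertyType: 'website', identifiers: ['hosted-id'], tags: [] },
    { name: 'Shop', propertyType: 'mobile_app', identifiers: [], tags: [] },
  ],
);

console.log(after.map((r) => `${r.name}/${r.propertyType}/${r.source}`));
```

In this example `Home` keeps its crawler identifiers and label, `Old` disappears because the manifest no longer lists it, and `Shop` is replaced by a new `mobile_app` row while the stale `website` row is removed even though the crawler wrote it.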
Closes #4111
https://claude.ai/code/session_01P6jrAAgREAnLNQodRxEyfQ
Closes #4111
Summary
Resolves the known limitation in `hosted-property-sync.ts` where properties removed from an AAO-hosted manifest persisted in the registry until manually cleared. Adds a `source` column (`'crawler' | 'aao_hosted'`) to `discovered_properties` (migration 467) and upgrades the sync from additive-only to full reconcile.
Key design decisions:
- The hosted manifest is authoritative: any `discovered_properties` row for the domain not in the current manifest is deleted, regardless of `source`. A publisher who removes a property sees it gone on next sync.
- On conflict with a row where `source='crawler'`, the crawler's `identifiers`, `tags`, and `source` label are preserved; origin-verified facts take precedence over hosted-manifest values.
- Property upserts and the reconcile DELETE run inside a single transaction guarded by `pg_advisory_xact_lock(hashtext('dp:'||domain))` to prevent concurrent-sync interleave races (same pattern as `org-intake-lock.ts`).
- `(name, property_type)` keying: the DELETE uses `unnest`-based tuple matching so a property reclassified to a different type is correctly removed rather than silently left as a ghost row.
Also corrects the factual contradiction in the `hosted-property-fed-index-sync.md` changeset, which previously described `discovered_properties` as additive-only.
Non-breaking justification
Adds an optional `source TEXT NOT NULL DEFAULT 'crawler'` column to `discovered_properties` with a safe default; existing crawler rows backfill as `'crawler'` with no rewrite. The `upsertProperty` interface gains an optional `source` field; all existing callers that don't pass it get `'crawler'`. The registry API response shape is unchanged. The changeset is `--empty` (server-side sync logic, not the published AdCP protocol spec).
Pre-PR review
- `statement_timeout` added, `identifiers`/`tags` guard fixed, `synced` counter after-await confirmed, throw condition analyzed and correct (gated on `errors > 0`).
- `(name, property_type)` keying correct, source-scope removal from DELETE correct (crawler rows for in-manifest properties survive via `NOT EXISTS` subquery), behavioral commitment documented in code comments.
Nits (not fixed, noted for reviewers):
- `source='adagents_json'` rows (distinct from `'crawler'`) would be overwritten to `'aao_hosted'` on conflict; this is a pre-existing enum gap in the schema and out of scope for this PR.
Session: https://claude.ai/code/session_01P6jrAAgREAnLNQodRxEyfQ
Generated by Claude Code