docs(sitemap-author): schema v1.1 — 12 patches from twitter+hackernews PoC#1822
Cross-validated against two PoCs (twitter 12 files / hackernews 10 files).
v1.1 changelog at top of file. 12 patches in 3 groups:
Group 1 — Scope/boundary (6 clarifications):
- §1.1 CJK token-per-char 30-50% higher than English; split sub-file rather
than relaxing 800-token limit (which would drift).
- §2.1 auth_strategy = primary strategy, not union; per-page contract_strength
expresses exceptions.
- §2.5 pitfalls.md is task-executor-level only; adapter-internal pitfalls
(queryId parsing, envelope unwrap) move to ~/.opencli/sites/<site>/notes.md.
- §2.5 pitfall id / trigger / workaround written from task-executor 1st-person
view ("when agent does X, ..."), not adapter-implementer view.
- §2.4 apis.md entry adds optional `notes:` field for GraphQL queryId path and
other meta info (still no URL / method / params / response — those stay in
endpoints.json).
- §2.2 page Linked APIs may be empty when endpoints.json is still being
collected; do not insert fake placeholder ids.
Group 2 — Reuse/compactness (3 structural):
- §2.2 + §4 partial pages: `page_id` with `_` prefix and `url_patterns: []`
for cross-page UI (e.g. _tweet_card.md). Referenced by other pages via the
existing `action:<id> in pages/_<name>.md` form. Eliminates duplication and
arbitrary "which page owns the like button" calls.
- §3 introduces Form B compact YAML for actions (~80 token each vs Form A
markdown ~250). Both forms remain valid; Form B is recommended when page
density would otherwise blow the 800-token budget.
- §3 drops action-level `verified_at` and `source` — file-level frontmatter
already covers both, repeated copies just drift.
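
The partial-page convention above is mechanical enough to lint. The following is a minimal sketch, assuming page frontmatter has already been parsed into a dict and that the `page_id` itself (not only the filename) carries the `_` prefix. The function name `check_partial_page` is hypothetical, and the converse rule (a page with empty `url_patterns` should be a partial) is an assumption that the schema text does not state.

```python
def check_partial_page(frontmatter):
    """Return a list of problems for one page's parsed frontmatter dict."""
    page_id = frontmatter.get("page_id", "")
    patterns = frontmatter.get("url_patterns")
    problems = []
    is_partial = page_id.startswith("_")
    # Stated in the patch: partials carry `url_patterns: []`.
    if is_partial and patterns != []:
        problems.append("partial page must declare url_patterns: []")
    # Assumed converse, not stated in the schema text.
    if not is_partial and patterns == []:
        problems.append("empty url_patterns expects a '_' page_id prefix")
    return problems
```

For example, `check_partial_page({"page_id": "_tweet_card", "url_patterns": []})` returns an empty list, while a `_`-prefixed page that still lists URL patterns is flagged.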
Group 3 — Execution health/anchors (3 action-level):
- §3.3 cross-page UI primitive actions (the kind that live in partials)
may write Best/Fallback inline as adapter-first + DOM fallback within a
single action, rather than being forced up into a workflow Best/Fallback
pair. Decouples UI-primitive routing from task-level routing.
- §3.4 Recovery may include `adapter_health_update: <adapter> -> suspect`
directive. Consumption skill (opencli-browser-sitemap) writes the matching
workflow's adapter_health on the local overlay so the next agent skips the
broken Best path instead of re-running it. Write-side closure for the
failure → next-agent-avoidance loop.
- §2.2 testid marked optional; selector_pattern promoted to first-class
anchor with 5 acceptable shapes (id-anchored / sibling traversal / attribute
boundary / form name / ARIA) and explicit discouraged-anchor list
(nth-child, single-class grabs, text-content selectors). Old sites without
testid (HN, forums) are no longer second-class.
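
Two of the Group 3 items lend themselves to small checks. The sketch below is illustrative only: the directive regex follows the `adapter_health_update: <adapter> -> suspect` wording above, the overlay is simplified to a plain dict keyed by adapter name (the real overlay is per workflow and its format is not specified here), and the discouraged-anchor patterns are a rough guess at how nth-child, single-class, and text-content selectors might be detected. All function and variable names are hypothetical.

```python
import re

# Assumed directive shape, taken from the 3.4 wording above.
DIRECTIVE = re.compile(r"^adapter_health_update:\s*(\S+)\s*->\s*(\w+)$")

def apply_health_directive(line, overlay):
    """Record the state named by a Recovery directive; return True if one was found."""
    match = DIRECTIVE.match(line.strip())
    if match is None:
        return False
    adapter, state = match.groups()
    overlay[adapter] = state  # simplified: keyed by adapter, not by workflow
    return True

# Hypothetical detectors for the discouraged anchors listed in 2.2.
DISCOURAGED = {
    "nth-child": re.compile(r":nth-child\("),
    "single-class": re.compile(r"^\.[\w-]+$"),
    "text-content": re.compile(r":contains\(|:has-text\(|^text="),
}

def discouraged_anchors(selector):
    """Return the names of discouraged anchor styles that a selector uses."""
    cleaned = selector.strip()
    return [name for name, pattern in DISCOURAGED.items() if pattern.search(cleaned)]
```

A consumer could call `apply_health_directive` on each Recovery line after a failed Best path, and an author-side lint could reject any `selector_pattern` for which `discouraged_anchors` returns a non-empty list.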
No code changes — pure schema reference. Both PoCs remain local; promotion to
references/site-memory/{twitter,hackernews}/sitemap/ comes once this lands.
- Form B delimiter table (`|` enum / `||` fallback / `;` sequential) to disambiguate `do:` and `recover:` parsing.
- §3.3 like_tweet example updated to `||` fallback form.
- §3.4 explicit note: adapter_health recovery (suspect → healthy) is read side, deferred to opencli-browser-sitemap skill spec.
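
To show why the three delimiters disambiguate parsing, here is a minimal sketch of a splitter for a `do:` or `recover:` string. It assumes `;` binds loosest, then `||`, then `|`; that precedence is an assumption, as are the function name `parse_steps` and the step tokens in the usage example. The schema's delimiter table remains the authority.

```python
def parse_steps(expr):
    """Split a Form B string into steps -> fallback alternatives -> enum options."""
    steps = []
    for step in expr.split(";"):          # `;` separates sequential steps
        if not step.strip():
            continue
        alternatives = []
        for alt in step.split("||"):      # `||` separates fallback alternatives
            # `|` separates enum options inside one alternative
            options = [opt.strip() for opt in alt.split("|") if opt.strip()]
            alternatives.append(options)
        steps.append(alternatives)
    return steps
```

With hypothetical tokens, `parse_steps("adapter_like || dom_click_like; confirm_liked | confirm_already_liked")` yields two sequential steps: the first has two fallback alternatives, and the second has one alternative with two enum options. Splitting on `||` before `|` is what keeps a fallback from being misread as an enum.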
This was referenced Jun 1, 2026
Summary
Sitemap schema reference v1.1 based on twitter (12 files, @opencli-user) + hackernews (10 files, @opencli-质量官) parallel PoCs. 12 patches in 3 groups, all surfaced as findings during PoC content writing and cross-reviewed in thread.
Group 1: Scope/boundary (6 clarifications, no format change)
- `auth_strategy` = primary, not union
- `pitfalls.md` task-executor-level only; adapter-internal → `notes.md`
- `apis.md` entry adds optional `notes:` field
- `Linked APIs` may be empty when endpoints.json incomplete
Group 2: Reuse/compactness (3 structural)
- `page_id` with `_` prefix + `url_patterns: []`
- `verified_at` / `source` (inherit file-level): `last_verified` already covers
Group 3: Execution health/anchors (3 action-level semantics)
- `_tweet_card.md` `like_tweet`: forcing UI-primitive to workflow level makes "click like" its own workflow, mismatched granularity
- `adapter_health_update: <adapter> -> suspect` directive in Recovery + consumption skill writes workflow health: `_tweet_card.md` Recovery had this as text; formalize as directive so next agent sees the suspect mark and skips broken Best path
- `testid` optional; `selector_pattern` first-class with 5 forms + discouraged list: `nth-child(<rank>)` instability validated the discouraged-anchor list
v1 → v1.1 migration
Both PoCs are currently local at `~/.opencli/sites/{twitter,hackernews}/sitemap/` and written in v1 markdown form. After this PR lands, they are promoted to `skills/opencli-sitemap-author/references/site-memory/{twitter,hackernews}/sitemap/`.
Open question for PR review
@opencli-user proposed a two-tier file size limit (hard 800 for simple sites / soft 2000 for dense sites) — should this land in v1.1, or stick with hard 800 + the CJK split-file note? My current take: hard 800 is correct, soft tiers drift. But happy to revisit.
Reviewers
Per agreed split: @opencli-user + @codex-coder cross-review (this PR consolidates content from both their findings).
Test plan
- `npm run docs:build` clean