Add provider-normalized context views #16
Closed
fivetran-davefowler wants to merge 4 commits into
- scope to the surfaces information_schema already has (TABLES, COLUMNS); drop
RELATIONSHIPS/METRICS/ENTITIES and their provider views (deferred; relationships
should later extend REFERENTIAL_CONSTRAINTS/KEY_COLUMN_USAGE, not a custom view)
- AGENTS.TABLES/COLUMNS use SELECT t.* over information_schema as the spine
(no hardcoded native column list; inherits whatever the account exposes)
- generic identity merge: left join every discovered {provider}_tables/_columns
view, enrichment columns appended under a <provider>_ prefix, aggregated to one
row per identity to prevent fanout; no hardcoded providers
- remove hardcoded memories_count/warnings_count; memory participates later by
publishing its own *_TABLES/*_COLUMNS view and is picked up automatically
- view creation is fail-soft (warn, never break ingestion)
- align osi_columns identity parsing with osi_tables; rename helper to _relation_identity_sql
- update SPEC, proposal (v1 scope + resolved decisions), root entries, and tests
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- comment that <provider>_ prefixing keeps t.* from colliding with enrichment
- is_time_dimension only true for dimension_group with type time (not duration)
- document name-based, case-folded identity matching in _merge_on
- README: note AGENTS.TABLES/COLUMNS are enriched information_schema and are per-database (point workflows at the data's database)
- proposal: reconcile Duplicate And Merge Policy with the v1 prefixed-merge model
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
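The fanout concern in the commit notes above can be illustrated with a small, self-contained sketch. It is not the PR's implementation: SQLite stands in for Snowflake, `spine_tables` stands in for `INFORMATION_SCHEMA.TABLES`, and the `dbt_tables` rows, the `*_key` aliases, and the choice of `MAX` as the aggregate are all assumptions made for illustration. It shows why aggregating a provider relation to one row per case-folded identity before the left join keeps the merged view at one row per table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for INFORMATION_SCHEMA.TABLES (hypothetical rows).
CREATE TABLE spine_tables (table_schema TEXT, table_name TEXT);
INSERT INTO spine_tables VALUES ('SALES', 'ORDERS'), ('SALES', 'CUSTOMERS');

-- Hypothetical provider relation with two rows for the same table,
-- differing only in identifier case.
CREATE TABLE dbt_tables (table_schema TEXT, table_name TEXT, description TEXT);
INSERT INTO dbt_tables VALUES
  ('sales', 'orders', 'One row per order'),
  ('SALES', 'ORDERS', 'One row per order');
""")

# Naive join: the duplicated provider rows fan out the ORDERS spine row.
naive = conn.execute("""
SELECT t.*, p.description AS dbt_description
FROM spine_tables t
LEFT JOIN dbt_tables p
  ON UPPER(p.table_schema) = UPPER(t.table_schema)
 AND UPPER(p.table_name) = UPPER(t.table_name)
""").fetchall()

# Aggregated join: collapse the provider relation to one row per
# case-folded identity first, then join on that identity.
merged = conn.execute("""
SELECT t.*, p.dbt_description
FROM spine_tables t
LEFT JOIN (
  SELECT UPPER(table_schema) AS schema_key,
         UPPER(table_name) AS name_key,
         MAX(description) AS dbt_description
  FROM dbt_tables
  GROUP BY UPPER(table_schema), UPPER(table_name)
) p
  ON p.schema_key = UPPER(t.table_schema)
 AND p.name_key = UPPER(t.table_name)
ORDER BY t.table_name
""").fetchall()

print(len(naive), len(merged))
```

With these sample rows the naive join returns more rows than the spine has, while the aggregated join returns exactly one row per spine table, with a NULL enrichment value where the provider has no match.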
75aee6d to 269c237
Summary
Introduces an INFORMATION_SCHEMA-compatible context-view convention for Agents Schema.
The core views are:
- AGENTS.SCHEMATA
- AGENTS.TABLES
- AGENTS.COLUMNS
Each one uses the matching INFORMATION_SCHEMA view as its row spine via SELECT t.*, then joins provider-normalized relations by object identity. That makes these views familiar to agents and existing SQL snippets that already know INFORMATION_SCHEMA, while adding provider context in prefixed columns.
Why
This adds a small amount of convention to the spec: providers that want generic integration can expose schema-, table-, and column-grain relations with predictable suffixes. The tradeoff is worth it for three reasons:
- AI adoption: agents already have strong priors around INFORMATION_SCHEMA. Teaching them to use AGENTS.SCHEMATA, AGENTS.TABLES, and AGENTS.COLUMNS instead of INFORMATION_SCHEMA.SCHEMATA, INFORMATION_SCHEMA.TABLES, and INFORMATION_SCHEMA.COLUMNS when available is a small mapping, not a new navigation model. That should make Agents Schema easier for agents to use and faster for teams to adopt.
- Provider integration clarity: providers get a clear convention for joining the common surface without giving up their own source tables or implementation details. They can publish normalized relations at the schema, table, or column grain and let the generic views handle discovery, prefixing, and object-identity joins.
- Adopter utility: assuming the main use case is schema-based warehouse work, most questions start with schemas, tables, and columns: what exists, what matters, what a field means, and what context different tools have attached. These views make the common path useful immediately while still letting provider-specific detail live in provider-owned relations.
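The discovery and prefixing steps mentioned above could look roughly like the following sketch. It assumes provider relations follow the `<PROVIDER>_<GRAIN>` naming this PR proposes; the function names `discover_providers` and `prefixed_column`, and the sample relation names, are hypothetical and not taken from the PR's code.

```python
def discover_providers(relation_names, grain):
    """Map provider name -> relation name for one grain.

    `relation_names` is assumed to be the list of tables and views found
    in the AGENTS schema; `grain` is SCHEMATA, TABLES, or COLUMNS.
    Matching is case-folded, and the bare core view (for example TABLES)
    is skipped because it has no provider prefix.
    """
    suffix = f"_{grain.upper()}"
    found = {}
    for name in relation_names:
        upper = name.upper()
        if upper.endswith(suffix) and len(upper) > len(suffix):
            provider = upper[: -len(suffix)].lower()
            found[provider] = upper
    return found


def prefixed_column(provider, column):
    """Name an enrichment column so it cannot collide with t.* columns."""
    return f"{provider}_{column.lower()}"


# Hypothetical contents of the AGENTS schema.
relations = ["TABLES", "COLUMNS", "DBT_TABLES", "LOOKML_TABLES", "OSI_COLUMNS"]

table_providers = discover_providers(relations, "TABLES")
column_providers = discover_providers(relations, "COLUMNS")
dbt_column = prefixed_column("dbt", "DESCRIPTION")
print(table_providers, column_providers, dbt_column)
```

Because discovery keys only on the suffix, a new provider would be picked up without any change to the core views, which is the property the convention is aiming for.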
Provider convention
Providers can participate by publishing one or more normalized relations, as tables or views, named:
- AGENTS.<PROVIDER>_SCHEMATA
- AGENTS.<PROVIDER>_TABLES
- AGENTS.<PROVIDER>_COLUMNS
The generic views discover those provider-normalized relations and append their fields under a <provider>_ prefix, such as dbt_description, lookml_ai_context, or osi_source_object_id.
This gives providers a suggested integration convention without making the core views know about every provider source table. Benefits:
- SCHEMATA/TABLES/COLUMNS entrypoints
Scope
v1 deliberately sticks to the surfaces INFORMATION_SCHEMA already has: SCHEMATA, TABLES, and COLUMNS.
It does not add generic RELATIONSHIPS, METRICS, or ENTITIES views. Those are different object types, and adding them here would make this more of a semantic-model convention than an information-schema extension.
Known limitation
INFORMATION_SCHEMA is per-database, so AGENTS.SCHEMATA / AGENTS.TABLES / AGENTS.COLUMNS cover the database that holds the AGENTS schema. Multi-database coverage via account-level metadata can be handled separately.
Verification
Note: tests assert generated SQL structure; they do not execute against Snowflake.
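As a rough illustration of what a structure-only test might look like, the sketch below builds view SQL as a string and asserts on its shape without connecting to a warehouse. The generator `build_view_sql`, the `IDENTITY_KEYS` mapping, the `p_<provider>` aliases, and the use of `MAX` as the collapsing aggregate are assumptions for this example, not the PR's actual helpers.

```python
# Identity columns per grain, following standard INFORMATION_SCHEMA names.
IDENTITY_KEYS = {
    "SCHEMATA": ("catalog_name", "schema_name"),
    "TABLES": ("table_catalog", "table_schema", "table_name"),
    "COLUMNS": ("table_catalog", "table_schema", "table_name", "column_name"),
}


def build_view_sql(grain, providers):
    """Build CREATE VIEW SQL for one grain.

    `providers` maps a provider name to the (non-empty) list of enrichment
    columns in its AGENTS.<PROVIDER>_<GRAIN> relation.
    """
    keys = IDENTITY_KEYS[grain]
    select_items = ["t.*"]
    joins = []
    for provider in sorted(providers):
        alias = f"p_{provider}"
        columns = providers[provider]
        key_select = ", ".join(f"UPPER({k}) AS {k}_key" for k in keys)
        enrich = ", ".join(f"MAX({c}) AS {provider}_{c}" for c in columns)
        group_by = ", ".join(f"UPPER({k})" for k in keys)
        on = " AND ".join(f"{alias}.{k}_key = UPPER(t.{k})" for k in keys)
        # One aggregated subquery per provider keeps one row per identity.
        joins.append(
            f"LEFT JOIN (SELECT {key_select}, {enrich} "
            f"FROM AGENTS.{provider.upper()}_{grain} "
            f"GROUP BY {group_by}) {alias} ON {on}"
        )
        select_items += [f"{alias}.{provider}_{c}" for c in columns]
    return "\n".join([
        f"CREATE OR REPLACE VIEW AGENTS.{grain} AS",
        "SELECT " + ", ".join(select_items),
        f"FROM INFORMATION_SCHEMA.{grain} t",
        *joins,
    ])


sql = build_view_sql("TABLES", {"dbt": ["description"], "lookml": ["ai_context"]})

# Structural assertions only: nothing here executes against a warehouse.
assert sql.startswith("CREATE OR REPLACE VIEW AGENTS.TABLES AS")
assert "FROM INFORMATION_SCHEMA.TABLES t" in sql
assert sql.count("LEFT JOIN") == 2
```

Checks like these catch regressions in the generated shape (spine, prefixing, one join per provider) but cannot confirm that the SQL is valid on Snowflake, which matches the limitation stated in the note above.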