docs(lakebase-autoscale): canonical psycopg_pool + OAuthConnection pattern #488
dgokeeffe wants to merge 3 commits into databricks-solutions:main
Conversation
…nection pattern

Restructure connection-patterns.md to match the official Databricks tutorial and databricks-ai-bridge reference implementation:

- Pattern 1 (canonical, new): psycopg_pool.ConnectionPool + OAuthConnection subclass + max_lifetime=2700. Zero background threads; rotation happens via pool recycling. This is what docs.databricks.com's Lakebase Apps tutorial uses.
- Pattern 2: SQLAlchemy do_connect event (was previously presented as the production pattern — now demoted to "alternative for apps already using SQLAlchemy async", with an explicit note that it adds unnecessary complexity).
- Pattern 3: Direct psycopg.connect for scripts/notebooks.
- Pattern 4: Static URL for local dev.

New explicit warnings:

- config.token / oauth_token().access_token is WORKSPACE-scoped and will fail at Postgres login. Must use w.postgres.generate_database_credential().
- max_lifetime=3600 (the default) creates a race condition; use 2700 so the pool recycles 15 min before the 1-hour token expiry.
- The ENDPOINT_NAME env var must be set manually — Databricks auto-injects PGHOST/PGPORT/PGDATABASE/PGUSER/PGSSLMODE but NOT the endpoint path.

Canonical sources cited:

- docs.databricks.com/aws/en/oltp/projects/tutorial-databricks-apps-autoscaling
- docs.databricks.com/aws/en/oltp/projects/external-apps-connect
- github.com/databricks/databricks-ai-bridge (src/databricks_ai_bridge/lakebase.py)

Co-authored-by: Isaac
…oss-language table

The existing overview jumped straight into features. Readers arriving from "how do I use Lakebase from Python?" needed two things made explicit:

1. There is no separate Lakebase SDK for Python. You use databricks-sdk only for minting OAuth credentials; a standard Postgres driver does the actual queries. (This was implicit in the connection patterns doc but not called out up-front.)
2. Node/TypeScript has a convenience wrapper: @databricks/lakebase (re-exported by @databricks/appkit). Autoscaling-only, not Provisioned. Worth mentioning so JS/TS readers know it exists.

Also added a cross-language summary table and an explicit "What NOT to do" list — most importantly flagging that WorkspaceClient().config.token is workspace-scoped and will be rejected at Postgres login. This is a trap several of us have fallen into.

Co-authored-by: Isaac
Correcting my earlier comment — I cited

1. Auto-injected env vars: 6, not 5
2.
3. Sharpen the
4. One-line comment on the psycopg3 pin: the Autoscaling tutorial uses psycopg3 throughout. Calling out why helps readers with psycopg2 muscle memory who'd otherwise try to swap drivers.
5. Pattern 2: Retracted

Apologies for the noise in the first pass.
- Fix PGAPPNAME omission: 6 env vars auto-injected, not 5; note multi-resource caveat
- Add psycopg3 pin comment explaining why psycopg2 won't work (no connection_class hook)
- Strengthen open=False rationale: deprecated for AsyncConnectionPool, errors in psycopg 4.0
- Clarify @databricks/lakebase scope in cross-language table (Autoscaling only)

Co-authored-by: Isaac
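A sketch of the env-var handling these fixes describe, assuming the six auto-injected names listed in this PR (PGHOST, PGPORT, PGDATABASE, PGUSER, PGSSLMODE, PGAPPNAME) and the manually-set ENDPOINT_NAME; `lakebase_conninfo` is an illustrative helper, not a library function:

```python
import os

# The six vars Databricks Apps auto-injects; ENDPOINT_NAME is NOT one of them
# and must be set manually in the app config.
AUTO_INJECTED = ("PGHOST", "PGPORT", "PGDATABASE", "PGUSER", "PGSSLMODE", "PGAPPNAME")


def lakebase_conninfo() -> str:
    """Build a libpq conninfo string from the injected env vars, failing loudly."""
    missing = [v for v in AUTO_INJECTED if v not in os.environ]
    if missing:
        raise RuntimeError(f"expected auto-injected env vars missing: {missing}")
    if "ENDPOINT_NAME" not in os.environ:
        raise RuntimeError("ENDPOINT_NAME is not auto-injected; set it manually")
    parts = {
        "host": os.environ["PGHOST"],
        "port": os.environ["PGPORT"],
        "dbname": os.environ["PGDATABASE"],
        "user": os.environ["PGUSER"],
        "sslmode": os.environ["PGSSLMODE"],
        "application_name": os.environ["PGAPPNAME"],
    }
    return " ".join(f"{k}={v}" for k, v in parts.items())
```

Failing at startup when ENDPOINT_NAME is absent is the point of the warning above: the five PG* vars being present makes it easy to assume the endpoint path was injected too.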
Summary
Restructures the databricks-lakebase-autoscale skill to lead with the canonical connection pattern from the official Databricks Apps + Lakebase tutorial, and adds an explicit framing of how the Python ecosystem fits together.

What changed
connection-patterns.md — reordered and expanded:

- Pattern 1: psycopg_pool.ConnectionPool + OAuthConnection subclass + max_lifetime=2700. Matches the official tutorial, the external app SDK guide, and databricks-ai-bridge. Zero background threads — rotation happens transparently via pool recycling.
- Pattern 2: the do_connect + asyncio.Task refresh pattern is now marked "alternative for apps already using SQLAlchemy async", with a note that it adds unnecessary operational complexity for the common case.
- Patterns 3 and 4: psycopg.connect (scripts only) and static URL (local dev only) — unchanged in spirit, trimmed.
- (open=False + explicit lifespan).

SKILL.md — new up-front overview section:

- @databricks/lakebase as the Node/TS convenience wrapper (Autoscaling-only).
- WorkspaceClient().config.token is workspace-scoped and will fail at Postgres login. Must use generate_database_credential() for a Lakebase-scoped token.

Why
- connection-patterns.md led with a SQLAlchemy + background-refresh loop, which works but is not what the official tutorial or reference implementations use.
- The config.token vs generate_database_credential() distinction was buried; it's the #1 cause of "my connection works locally but fails in prod" bugs.
- max_lifetime=2700 vs the 3600 default was implicit; the new doc explains why the default creates a race condition.

Test plan
- max_lifetime=2700 rationale (15-min buffer before 1-hour expiry)
- @databricks/lakebase README

This pull request and its description were written by Isaac.
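The max_lifetime=2700 rationale checked in the test plan is simple arithmetic:

```python
# Why max_lifetime=2700 and not the 3600 default: Lakebase OAuth tokens live
# about 3600 s. Recycling pooled connections at 2700 s leaves a 900 s
# (15-minute) buffer; with max_lifetime=3600 a pooled connection can outlive
# its token, which is the race condition the doc warns about.
TOKEN_TTL_S = 3600      # 1-hour token expiry cited in this PR
MAX_LIFETIME_S = 2700   # pool recycling interval recommended by the doc

buffer_s = TOKEN_TTL_S - MAX_LIFETIME_S
print(f"recycle buffer: {buffer_s} s ({buffer_s // 60} min)")  # 900 s (15 min)
```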