
refactor(skills): Replace MCP tools with CLI/SDK and add test infrastructure #464

Draft
QuentinAmbard wants to merge 59 commits into main from simplify-skills-remove-mcp

Conversation

@QuentinAmbard
Collaborator

Summary

This PR simplifies the databricks-skills by replacing MCP tool references with standard CLI commands and SDK patterns. This makes the skills more portable and consistent across different AI coding assistants.

Changes

MCP to CLI/SDK Migration (10 commits):

  • Replaced all MCP tool calls with Databricks CLI commands and Python SDK patterns
  • Skills now use databricks CLI commands instead of MCP tools
  • SQL queries use databricks sql execute command
  • Python SDK patterns use WorkspaceClient directly
  • Preserved legitimate MCP references for External MCP Server feature in agent-bricks

Skills Updated:

  • databricks-agent-bricks (supervisor agents)
  • databricks-aibi-dashboards (AI/BI dashboards)
  • databricks-app-python (Databricks Apps)
  • databricks-dbsql (SQL features)
  • databricks-genie (conversation API)
  • databricks-jobs (task types)
  • databricks-lakebase (Lakebase instances)
  • databricks-model-serving (all deployment patterns)
  • databricks-spark-declarative-pipelines
  • databricks-unity-catalog
  • databricks-unstructured-pdf-generation
  • databricks-vector-search
  • databricks-zerobus-ingest

Test Infrastructure (1 commit):

  • Added .tests/ folder with integration tests for Python scripts
  • Tests for databricks-agent-bricks/manager.py (MAS operations)
  • Tests for databricks-genie/conversation.py (Genie API)
  • Test runner script with HTML/XML report generation
  • All 11 unit tests pass

Test plan

  • Unit tests pass: python databricks-skills/.tests/run_tests.py --unit
  • Integration tests require Databricks connection with Agent Bricks and Genie enabled
  • Verify skill markdown files render correctly

🤖 Generated with Claude Code

@calreynolds calreynolds marked this pull request as draft April 13, 2026 14:14
Quentin Ambard and others added 29 commits April 15, 2026 10:48
Adds a release channel selection during installation allowing users to
choose between stable (default) and experimental branches.

When experimental is selected:
- Displays feedback request with links to issues/discussions
- Re-downloads install.sh from the experimental branch
- Re-executes with --experimental flag (preserving other args)

Features:
- New --experimental flag and DEVKIT_CHANNEL env var
- Interactive radio selector for channel choice
- Channel shown in summary and completion messages
- Feedback reminder at end of experimental installs

Closes #468

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Automates releases while ensuring the experimental branch stays in sync:

- Triggers on VERSION file changes on main
- Checks if experimental is behind main
- Creates sync PR (main → experimental) if needed
- Auto-merges if no conflicts, blocks release if conflicts exist
- Clear error messages with PR links when blocked
- Creates git tag and GitHub Release when sync is complete

Part of #468

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
When release is blocked due to conflicts between main and experimental,
the error message now includes:
- Step-by-step instructions for resolution
- A ready-to-use Claude Code prompt that:
  - First analyzes commits in experimental to understand intent
  - Reviews conflicted files from both sides
  - Resolves by keeping both changes when possible
  - Asks for human confirmation when resolution isn't obvious

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…apps skills

- databricks-agent-bricks: Use CLI for KA/Genie, add manager.py for MAS operations
- databricks-aibi-dashboards: Use databricks lakeview CLI commands
- databricks-app-python: Update to use CLI-based deployment

This is part of the effort to simplify skills by removing MCP tool dependencies
and using Databricks CLI directly where possible.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add conversation.py script for Genie Conversation API (ask_genie)
- Update SKILL.md to use databricks genie CLI commands
- Update spaces.md with CLI-based export/import/migration workflows
- Update conversation.md to use conversation.py script

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- databricks-lakebase-autoscale: Remove MCP section, expand CLI commands
- databricks-lakebase-provisioned: Remove MCP section, expand CLI commands

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…and dbsql skills

- databricks-model-serving: Use databricks CLI for endpoints and workspace ops
- databricks-unity-catalog: Use databricks fs CLI for volume operations
- databricks-dbsql: Update guideline to use CLI instead of MCP

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove MCP Tools section from SKILL.md (manage_vs_endpoint, manage_vs_index, query_vs_index, manage_vs_data)
- Update Common Issues to remove MCP-specific truncation issue
- Update Notes section to reference CLI/SDK instead of MCP
- Update end-to-end-rag.md: replace MCP tools table with CLI commands
- Update troubleshooting-and-operations.md: replace MCP tool references with CLI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…mands

- Rename Option C from "MCP Tools" to "CLI" approach
- Replace references/2-mcp-approach.md with 2-cli-approach.md (full rewrite)
- Update Post-Run Validation section to use `databricks pipelines` CLI
- Update all workflow references from MCP to CLI/SDK
- Update 1-project-initialization.md reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- databricks-config: Rewrite to use `databricks auth` CLI commands
- databricks-docs: Update references from MCP to CLI/SDK
- databricks-metric-views: Replace MCP tools with SQL CREATE/DESCRIBE commands
- databricks-execution-compute: Replace MCP tools with CLI job commands
- databricks-unity-catalog/6-volumes: Replace MCP tools with `databricks fs` CLI
- databricks-unity-catalog/7-data-profiling: Replace MCP tools with SQL QUALITY MONITOR

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- 5-development-testing.md: Update workflow from MCP to CLI
- 8-querying-endpoints.md: Replace MCP tools section with CLI commands
- SKILL.md: Update reference table descriptions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- README.md: Update description and diagram to reference CLI/SDK
- install_skills.sh: Update comment describing skills
- databricks-app-python: Rename 6-mcp-approach.md to 6-cli-approach.md
- databricks-jobs/task-types.md: Remove MCP tool note
- databricks-model-serving: Replace MCP tools with CLI commands
  - 1-classical-ml.md: CLI for querying endpoints
  - 3-genai-agents.md: CLI for testing and querying
  - 6-logging-registration.md: CLI for running scripts
  - 7-deployment.md: CLI for job creation and management
  - 9-package-requirements.md: Notebook commands instead of MCP
- databricks-unstructured-pdf-generation: Python script pattern
- databricks-zerobus-ingest: CLI workflow instead of MCP execute_code

Note: MCP references in databricks-agent-bricks (External MCP Server
feature) and databricks-mlflow-evaluation (MLflow MCP server) are
legitimate product features and remain unchanged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Added test infrastructure for Python scripts in databricks-skills:

- .tests/conftest.py: Pytest fixtures for Databricks connection
  - workspace_client: Session-scoped WorkspaceClient
  - warehouse_id: Finds running SQL warehouse
  - Custom markers for integration tests

- .tests/test_agent_bricks_manager.py: Tests for supervisor agent CLI
  - Unit tests for _build_agent_list helper (all agent types)
  - Integration tests for MAS lifecycle (list, find, get)

- .tests/test_genie_conversation.py: Tests for Genie conversation CLI
  - Unit tests with mocks for ask_genie function
  - Tests for timeout, failure handling, conversation tracking
  - Integration tests for live Genie Space queries

- .tests/run_tests.py: Test runner script
  - Supports --unit and --integration flags
  - HTML and JUnit XML report generation
  - Colored terminal output with summary

Tests cover the remaining Python scripts in skills:
- databricks-agent-bricks/manager.py
- databricks-genie/conversation.py

All 11 unit tests pass. Integration tests require Databricks connection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…xample queue support

Changes:
- Renamed manager.py → mas_manager.py for clearer naming
- Added example question management functions:
  - add_examples(): Add examples to ONLINE MAS
  - add_examples_queued(): Queue examples for when MAS becomes ONLINE
  - list_examples(): List all examples for a MAS
- Integrated with TileExampleQueue from databricks-tools-core
- Updated all documentation references to use mas_manager.py
- Updated test imports to use mas_manager module

This allows users to add example questions immediately after creating a MAS,
even before it finishes provisioning. Examples are automatically added when
the endpoint becomes ONLINE.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add installation section with uv (preferred) and pip fallback
for installing databricks-tools-core library.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- mas_manager.py: Inline all agent_bricks functionality, use raw HTTP
  with WorkspaceClient for auth only (no core imports)
- pdf_generator.py: New self-contained script using CLI for uploads
  (databricks fs cp) instead of SDK-based volume operations
- Update SKILL.md files to reflect self-contained scripts
- Update tests to work with new modules

Skills now only require:
- databricks-sdk (for auth in mas_manager)
- requests (for HTTP in mas_manager)
- plutoprint (for PDF generation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Move mas_manager.py to databricks-agent-bricks/scripts/
- Move conversation.py to databricks-genie/scripts/
- Move pdf_generator.py to databricks-unstructured-pdf-generation/scripts/
- Update all markdown references to use scripts/ path

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Use --json syntax for creating UC objects (catalogs, schemas, volumes)
- Document correct JSON format for each create operation
- Add SQL execution alternative for creating objects
- Fix incorrect positional args syntax in multiple skill files

The --json syntax is the most reliable across CLI versions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Use --json syntax for catalogs, schemas, volumes create commands
- Remove incorrect positional argument examples
- Simplify volume example (remove external variant)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
DatabricksEnv does not exist in current databricks-connect versions.
Updated all skills to use:
- DatabricksSession.builder.serverless(True).getOrCreate()
- Local dependency installation via uv/pip

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add Post-Generation Validation section with CLI SQL examples
- Update troubleshooting.md with CLI-based validation queries
- Remove in-script .show() calls from generate_synthetic_data.py
- Validate data using `databricks sql execute` instead of DataFrame API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove Python import patterns (not usable by agent)
- Focus on CLI: write HTML to temp file, run script
- Remove redundant sections and patterns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add field format requirements: all items need unique 32-char hex UUID id
- Document that question/sql/content fields must be arrays of strings
- Add example showing correct format
- Add trash-space command for deleting spaces

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Document correct serialized_space format with ID requirements
- All items require 32-char hex UUID id field (uuid.uuid4().hex)
- Text fields (question, sql, content) must be arrays, not strings
- Fix CLI syntax: use title (not display_name), serialized_space (not table_identifiers)
- Add trash-space command documentation
- Remove redundant spaces.md file

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
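The serialized_space field rules above (unique 32-char hex ids, text fields as arrays of strings) can be sketched as a small helper. This is an illustrative shape only; the helper name and the exact set of sibling keys in a real space document are assumptions, not part of the Genie API surface.

```python
import json
import uuid

def make_example_sql(question: str, sql: str) -> dict:
    """Build one example item in the documented shape: a unique
    32-char hex id (uuid.uuid4().hex), and question/sql as arrays
    of strings, not bare strings."""
    return {
        "id": uuid.uuid4().hex,   # 32 lowercase hex chars, no dashes
        "question": [question],   # arrays, even for a single line
        "sql": [sql],
    }

item = make_example_sql(
    "How many trips were taken last week?",
    "SELECT COUNT(*) FROM trips WHERE pickup_date >= current_date() - 7",
)
print(json.dumps(item, indent=2))
```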
- Create standalone compute.py with all logic inlined (no external deps)
- Filter clusters to UI/API sources only (interactive, human-created)
- Add page_size=100 for faster cluster listing
- Use proper SDK types (JobEnvironment, Environment, timedelta)
- Add integration tests for compute.py CLI
- Merge Genie conversation.md into SKILL.md
- Fix CLI commands in SKILL.md (databricks warehouses)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add CRITICAL widget version requirements table
- Document mandatory validation workflow (test queries before deploy)
- Fix CLI commands: discover-schema requires CATALOG.SCHEMA.TABLE format
- Fix lakeview create: use --display-name, --warehouse-id, --serialized-dashboard
- Add Genie space linking via uiSettings.genieSpace
- Add design best practices section
- Remove duplicate 3-examples.md (content in 4-examples.md)
- Update file references to match correct numbering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add Quick Reference table at top for common CLI commands
- Add Step 4 design phase with filter-to-dataset mapping
- Add filter scope rule to checklist (filters only affect datasets with field)
- Clarify percentage format (0-1 vs 0-100) with fix options
- Add data variance guidance for trend charts
- Condense expression examples using [option|option] notation
- Remove redundant ASCII workflow diagram (steps below are clearer)
- Link dataset parameters to filter widget documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Documents how to run unit and integration tests for skill scripts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Quentin Ambard and others added 10 commits April 15, 2026 14:39
- Fix status table: NOT_READY instead of PROVISIONING
- Expand Quick Reference with complete working example
- Add note about running from skill folder
- Include ID lookup commands in Quick Reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
SKILL.md: Quick Reference only (commands for KA, MAS)
- Clarify Genie is in databricks-genie skill
- Add explanation of what Agent Bricks are
- Remove Genie section (handled by separate skill)

1-knowledge-assistants.md: Source types + troubleshooting only
2-supervisor-agents.md: UC functions, MCP, descriptions, examples, troubleshooting

Removed 500+ lines of duplicate content.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Tested by creating skywest_ops_analytics pipeline. Found and fixed:

1. library type: Use `file` not `notebook` for raw SQL files
   - {"notebook": ...} → {"file": ...}
   - Fixes "Only SQL, Scala and Python notebooks are supported"

2. CLI commands:
   - `list` → `list-pipelines`
   - `--pipeline-id ID` → positional `PIPELINE_ID` for all commands
   - `workspace ls` → `workspace list`

3. Validation: Use `discover-schema` instead of manual SQL
   - Returns schema, row counts, sample data, null counts in one call
   - Much better than running COUNT(*) queries

4. Added troubleshooting entry for file vs notebook error

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
PDF Generator:
- Rewrite to separate conversion from upload (local HTML→PDF, then databricks fs cp)
- Add parallel folder conversion with ThreadPool (4 workers)
- Smart skip: only reconvert if HTML newer than PDF
- Remove --json flag, simplify CLI
- Add pdf_eval_questions.json format for KA testing
- Update tests for new API

CLI Fixes (across all skills):
- databricks sql execute → databricks experimental aitools tools query
- workspace ls → workspace list
- All positional pipeline args (not --pipeline-id)

Knowledge Assistants:
- Add evaluation questions section referencing pdf_eval_questions.json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
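The "smart skip" rule above (only reconvert when the HTML is newer than the PDF) comes down to an mtime comparison. A minimal sketch, assuming the real pdf_generator.py pairs each source HTML with an output PDF path:

```python
from pathlib import Path

def needs_convert(html: Path, pdf: Path) -> bool:
    """Reconvert only when the PDF is missing or older than its
    source HTML; otherwise skip the (expensive) HTML->PDF step."""
    if not pdf.exists():
        return True
    return html.stat().st_mtime > pdf.stat().st_mtime
```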
- Remove catalog from all query examples (use schema.table format)
- Add --dataset-catalog and --dataset-schema CLI options to create command
- Update documentation to explain default catalog/schema approach

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Comprehensive CLI audit fixes:
- Fix positional arguments vs flags (postgres, database, system-schemas,
  knowledge-assistants, storage-credentials)
- Add cluster/warehouse create examples with tags to execution-compute
- Add --cluster-sources UI,API filter to exclude job clusters (faster)
- Fix genie export/import commands (use get-space --include-serialized-space)
- Standardize tag instructions: "include" for inline JSON, "after creation"
  for workspace-entity-tag-assignments

Resources with tags:
- Jobs, Pipelines: inline "tags" in create JSON
- Clusters: inline "custom_tags" in create JSON
- Warehouses: inline "tags.custom_tags" array in create JSON
- Dashboards, Apps, Genie: workspace-entity-tag-assignments API
- Serving Endpoints: patch API with add_tags

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Restore databricks-mcp-server/ and databricks-tools-core/ directories
- Make MCP server installation optional in install.sh (default: skip)
- Add --mcp and --mcp-path CLI options for non-interactive install
- Add DEVKIT_INSTALL_MCP and DEVKIT_MCP_PATH env vars
- Skills-only install is faster (no venv setup required)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Save installation config (tools, profile, scope, skills, MCP) to .install-config
- On reinstall, show recap of previous settings with option to reuse or reconfigure
- Use hash-based schema validation: auto-detects when new config fields are added
- Silent/non-interactive modes auto-apply previous config when available
- Config file stored in scope-appropriate location (project or global)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Rename .install-config to .ai-dev-kit-install-config
- Remove old .skills-profile mechanism (now unified in config file)
- Add HAS_PREVIOUS_CONFIG flag for pre-selection mode
- Pre-select all prompts from saved values when reconfiguring:
  - Tools: shows "previous" hint on saved selections
  - Databricks profile: pre-selects saved profile
  - Scope: pre-selects project/global
  - Skills: pre-selects skill profiles
  - MCP: pre-selects install option
- Simplify "Keep this configuration? (Y/n)" prompt
- Make header/prerequisites more compact (single line)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove version check (always reinstall)
- Remove extra blank line after experimental download message
- Add "previous" hint to all pre-selected options from saved config:
  - Scope, MCP install, skill profiles

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@QuentinAmbard QuentinAmbard force-pushed the simplify-skills-remove-mcp branch from 3f0a479 to 23a2495 Compare April 15, 2026 13:10
Quentin Ambard and others added 19 commits April 15, 2026 15:19
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Quote all values in save_config to handle spaces correctly
- Replace source with grep-based parsing (no code execution risk)
- Any config error silently falls back to fresh install

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
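The "grep-based parsing, no code execution risk" idea above can be sketched in Python: extract KEY="value" pairs with a regex instead of sourcing the file as shell, so a tampered config line is ignored rather than executed. The key names below are examples, not the install script's actual field list.

```python
import re

# Only lines of the exact form KEY="value" are accepted; anything
# else (including shell command substitutions) is silently skipped,
# mirroring the commit's "fall back to fresh install" behavior.
CONFIG_LINE = re.compile(r'^([A-Z_]+)="([^"]*)"$')

def parse_config(text: str) -> dict:
    config = {}
    for line in text.splitlines():
        m = CONFIG_LINE.match(line.strip())
        if m:
            config[m.group(1)] = m.group(2)
    return config

sample = 'SCOPE="project"\nPROFILE="my profile"\n$(rm -rf /) # never executed\n'
print(parse_config(sample))  # {'SCOPE': 'project', 'PROFILE': 'my profile'}
```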
… 3.12

- Remove duplicate --mcp-path case in argument parser
- Remove dead INSTALL_MCP=true assignment (was immediately overwritten)
- Remove duplicate MCP server line in summary
- Remove redundant install_mcp_server call (setup_mcp already handles it)
- Use Python 3.12 instead of 3.11 for venv creation
- Add --allow-existing to install_mcp_server venv creation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…pport

Cleanup (~200 lines removed):
- Remove dead functions: install_mcp_server(), check_sdk_version(), prompt_mcp_path()
- Refactor prompt_scope() and prompt_mcp_install() to use radio_select()

New feature - Claude profile env:
- Add write_claude_env() to set DATABRICKS_CONFIG_PROFILE in .claude/settings.json
- Only prompt for profile when Claude + project scope (not global)
- Reorder flow: tools → scope → profile (so we know scope before asking profile)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The --dataset-catalog and --dataset-schema CLI flags only fill in
missing parts of a query — they do NOT override catalog/schema
hardcoded in the FROM clause. Dashboard queries must use bare
table names only (e.g., "FROM trips", not "FROM nyctaxi.trips").

- SKILL.md: rewrite note with ✅/❌ examples and a "why" explanation
- 4-examples.md: update example queries to use bare table names
- 3-filters.md: update example query to use bare table name

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
SKILL.md becomes a dense hub with one worked CLI example per concept
(projects, branches, endpoints, credentials, reverse ETL). Deep-dive
subfiles cover internals, limits, and advanced CLI, with an "SDK
equivalents" section at the bottom of each. connection-patterns.md
stays SDK-based since in-process OAuth token refresh is the one
legitimate runtime SDK use case.

Also fixes CLI bugs found during live testing: create-project and
generate-database-credential take positional args (not flags); default
endpoint is named primary (not ep-primary); duration fields use
suspend_timeout_duration / history_retention_duration (not _seconds).

Co-authored-by: Isaac
SKILL.md and references/2-serverless-job.md now lead with
databricks jobs submit (the one-shot create+run CLI primitive) instead
of the defunct MCP execute_code wrapper that the reference file used
to point at. Full flow documented: upload → submit → poll get-run →
fetch get-run-output, including the non-obvious gotcha that
get-run-output takes the task run_id (.tasks[0].run_id), not the
parent run_id from submit.

scripts/compute.py gains --environments flag with dict-or-typed
normalization so the standalone script can install pip dependencies
(previously impossible from CLI — "client": "4" deps had no path).

Interactive cluster section reduced to an "avoid by default" callout
in SKILL.md; the raw-CLI cluster list and create patterns move into
references/3-interactive-cluster.md alongside the existing script
wrappers.

SQL Warehouses section in SKILL.md expanded from create-only to the
full CRUD surface (create, list, find, get, start, stop, edit,
delete) with live-verified min_num_clusters/max_num_clusters and
--no-wait gotchas.

Co-authored-by: Isaac
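The run_id gotcha above can be shown with a trimmed get-run response: get-run-output wants the task-level run_id (.tasks[0].run_id), not the parent run_id that jobs submit returned. The JSON below keeps only the fields involved; real responses carry many more.

```python
import json

get_run_response = json.loads("""
{
  "run_id": 1000,
  "tasks": [{"task_key": "main", "run_id": 1001}]
}
""")

parent_run_id = get_run_response["run_id"]            # what submit returned
task_run_id = get_run_response["tasks"][0]["run_id"]  # what get-run-output needs
print(parent_run_id, task_run_id)  # 1000 1001
```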
job_extra_params={"environments": [...]} was broken both ways: passing
dicts (the documented shape) crashed in the SDK's jobs.submit because
it serializes each list element via .as_dict(); passing typed
JobEnvironment crashed earlier trying to read environment_key with
.get(). Neither path worked.

Normalize extra["environments"] to List[JobEnvironment] once at the
top of the submit path: dicts get wrapped (nested spec dict → typed
Environment), typed objects pass through, anything else raises
TypeError before hitting the SDK. env_key for the task binding is
read off the canonical typed object.

Adds TestServerlessJobExtraParams integration-test class covering the
four cases: dict input, typed input, no environments (default path),
malformed entry. Previously there was zero coverage of job_extra_params,
which is how the bug landed. All four pass live (≈110 s for the class).

Co-authored-by: Isaac
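The normalization described above can be sketched with stand-in dataclasses. Note these are hypothetical stand-ins for illustration only; the real fix uses the SDK's typed classes (databricks.sdk JobEnvironment/Environment), whose exact constructors may differ.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

@dataclass
class Environment:        # stand-in for the SDK's environment spec type
    client: str
    dependencies: Optional[List[str]] = None

@dataclass
class JobEnvironment:     # stand-in for the SDK's job environment type
    environment_key: str
    spec: Optional[Environment] = None

def normalize_environments(envs: List[Any]) -> List[JobEnvironment]:
    """Coerce dict or typed entries to the canonical typed shape once,
    at the top of the submit path, so downstream code never has to
    branch on the input shape (and bad entries fail fast)."""
    out: List[JobEnvironment] = []
    for e in envs:
        if isinstance(e, JobEnvironment):
            out.append(e)  # typed objects pass through unchanged
        elif isinstance(e, dict):
            spec = e.get("spec")
            out.append(JobEnvironment(
                environment_key=e["environment_key"],
                spec=Environment(**spec) if isinstance(spec, dict) else spec,
            ))
        else:
            raise TypeError(f"unsupported environments entry: {e!r}")
    return out
```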
The --dataset-catalog / --dataset-schema guidance tells you what to do
but not why. Clarify that bare table names exist so the serialized
dashboard can be re-installed on a different catalog.schema without
rewriting queries.

Co-authored-by: Isaac
Three skills previously documented the dead get_table_stats_and_schema
MCP function or had related gaps:

- databricks-metric-views: swap the MCP call for
  `databricks experimental aitools tools discover-schema` and note that
  deeper distribution probes go through the `query` subcommand.
- databricks-genie: same replacement in Step 1 "Understand the Data",
  plus delete the bogus `databricks sql exec` calls (no such
  subcommand exists) in favor of `query`.
- databricks-aibi-dashboards: expand Step 2 exploration guidance so
  the design decisions (widget vs. table, KPI vs. trend chart, trend
  granularity, filter options) are explicitly tied to what to probe
  (cardinality, top values, numeric distribution, trend viability).
  Keeps the skill conceptual rather than prescribing SQL the agent
  can already write.

Co-authored-by: Isaac
databricks fs on CLI v0.296 requires the dbfs: scheme prefix for UC
Volume paths. Without it the CLI treats the path as local filesystem
and errors with `no such directory`. Fix every fs example pointed at
/Volumes/... in the PDF, UC, and SDP skills; also tighten the UC
examples to use -r and --overwrite consistently, and clarify that -r
copies the source directory's contents (not the directory itself).

databricks workspace import-dir on redeploys silently skips files that
already exist, so updates never reach the workspace and the app keeps
running the old version. Add --overwrite to every import-dir example
in the app skill's 4-deployment.md and 6-cli-approach.md. Also flag
the first-ever-deploy gotcha on the redeployment recipe (the workspace
delete line errors when the target dir doesn't exist yet).

Fix the PDF skill's volumes-create troubleshooting row — it passed a
single dotted arg (`catalog.schema.volume`) where the CLI wants four
positional args (`CATALOG SCHEMA NAME MANAGED`).

All corrected forms live-verified against the workspace once.

Co-authored-by: Isaac
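The dbfs: scheme rule above is mechanical enough to capture in a tiny helper. The function name is illustrative, not part of any skill script:

```python
def cli_volume_path(path: str) -> str:
    """databricks fs (CLI v0.296) needs the dbfs: scheme on UC Volume
    paths; a bare /Volumes/... is treated as a local path and fails
    with "no such directory"."""
    if path.startswith("dbfs:"):
        return path
    if path.startswith("/Volumes/"):
        return "dbfs:" + path
    return path  # leave genuinely local paths alone

print(cli_volume_path("/Volumes/main/default/docs"))  # dbfs:/Volumes/main/default/docs
```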
scripts/conversation.py was a 171-line Python glue wrapper around
client.genie.start_conversation_and_wait and client.genie.get_message
with manual polling. The CLI now exposes all the primitives directly
(start-conversation, create-message, get-message,
get-message-attachment-query-result), and start-conversation has a
built-in --no-wait / --timeout LRO flag. Document the three-command
flow end-to-end and delete the script. No external Python callers
(only SKILL.md pointed at it).

Also in this commit:

- Fix the Export/Import quoting inconsistency: genie_space.json on
  disk is now a parsed object (not a JSON-string blob). Export unwraps
  with `jq '.serialized_space | fromjson'`; import and update both
  stringify consistently with `jq -c '.' | jq -Rs '.'`.
- Add two troubleshooting rows: slow answers / query timeouts
  (warehouse sizing) and wrong/empty answers (example_question_sqls +
  text_instructions).
- Drop the redundant serialized_space "Structure" skeleton — its
  information is a strict subset of the Complete Example, now renamed
  "Example" with the top-level keys called out in the lead-in.

All three primitives live-verified against a real Genie Space on the
workspace (NordWind Fleet Analytics): start-conversation → poll
get-message → get-message-attachment-query-result (columns + rows) →
create-message for follow-up.

Co-authored-by: Isaac
SDP skill had three concrete bugs that bit an agent running a real
pipeline update end-to-end:

1. references/2-cli-approach.md claimed "file" libraries could point
   to a directory. They can't — the API errors with "Paths must end
   with .py or .sql". The correct shape for a folder is
   {"glob": {"include": "<dir>/**"}}. Fixed the example and added a
   troubleshooting row for the exact error string. Live-verified with
   a pipelines update round-trip.

2. No documented flow for editing an existing pipeline. Added a
   dense "Updating a Pipeline" section covering re-upload +
   start-update. Key gotcha: pipelines consume raw FILE entries, so
   re-imports need --format RAW --overwrite. --format SOURCE
   --language SQL|PYTHON creates a workspace NOTEBOOK (deprecated for
   pipelines) and fails on an existing FILE path with
   "type mismatch (asked: NOTEBOOK, actual: FILE)". Live-verified
   both failure and success modes. Added troubleshooting row.

3. Contradictory streaming read guidance — SKILL.md said
   FROM stream(table), 4-dlt-migration.md showed FROM STREAM table.
   Both parse, but the function form is the canonical one. Reworked
   the troubleshooting row to spell out when each form applies and
   flag FROM STREAM table as legacy DLT compatibility.

Bonus: pipelines list-pipeline-events returns a bare array, not
{"events": [...]} — skill previously showed the raw command with no
output shape hint. Replaced with a ready jq pattern that surfaces
just ERROR/WARN entries; agent had written two failing Python
one-liners trying to guess the shape.

Also simplified databricks-unity-catalog SKILL.md to show the
positional form for schemas create and volumes create (what the help
text documents as canonical) instead of the --json form that was
redundant with the positional CLI.

Co-authored-by: Isaac
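The events-filtering pattern above, in Python rather than jq: the response is a bare JSON array, so it loads straight into a list. The "level" and "message" field names are assumptions about the event shape based on the jq pattern described:

```python
import json

raw = """[
  {"level": "INFO",  "message": "Update started"},
  {"level": "WARN",  "message": "Table schema drifted"},
  {"level": "ERROR", "message": "Paths must end with .py or .sql"}
]"""

events = json.loads(raw)  # a list, not {"events": [...]}
problems = [e for e in events if e.get("level") in ("ERROR", "WARN")]
for e in problems:
    print(e["level"], "-", e["message"])
```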
Script invocations in SKILL.md and references (python scripts/X)
previously assumed the reader was running from the skill's install
folder. Agents running from an arbitrary project cwd hit
"No such file or directory" errors — the agent-bricks, execution-compute,
and pdf-generation skills all trip the same way.

Switch to the <SKILL_ROOT> literal token for every script invocation
and add a one-line convention note at the top of each affected
SKILL.md and reference file:

  > <SKILL_ROOT> = the directory containing this SKILL.md; resolve to
  > the absolute install path (e.g. ~/.claude/skills/<skill-name>).

Rewrote:
- python scripts/compute.py ...         → python <SKILL_ROOT>/scripts/compute.py ...
- python scripts/pdf_generator.py ...   → python <SKILL_ROOT>/scripts/pdf_generator.py ...

Also fixed a stale markdown link in the SDP skill whose display text
said "examples/exploration_notebook.py" but whose path was "scripts/...".

databricks-agent-bricks script references come in a separate commit.

Co-authored-by: Isaac
The skill told readers to call create-knowledge-source with four
positional args (PARENT DISPLAY_NAME DESCRIPTION SOURCE_TYPE) alongside
--json. The CLI rejects that combination:

  Error: when --json flag is specified, provide only PARENT as
  positional arguments. Provide 'display_name', 'description',
  'source_type' in your JSON input.

Only two forms actually work (verified live on the workspace):
  1. PARENT + --json '{display_name, description, source_type, files|index|...}'
  2. positional-only (no --json) — but then there's nowhere to pass
     files.path / index.index_name, so this form only works for source
     types that need no extra body, which today is none.

Updated SKILL.md and 1-knowledge-assistants.md to show the single
working shape: PARENT positional + everything else in --json. Added
the display_name / description fields inside each example body.
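An illustrative --json body for form 1, passed alongside the single PARENT positional (all field values here are placeholders; the per-source-type key such as files or index varies by source type):

```json
{
  "display_name": "docs-knowledge-source",
  "description": "Product documentation corpus",
  "source_type": "FILES",
  "files": { "path": "/Volumes/main/default/docs" }
}
```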

Co-authored-by: Isaac
Three concrete bugs in scripts/mas_manager.py triggered by a real
agent session:

1. get_mas (L481) and update_mas (L531) read instructions from
   mas_data.get("instructions") — wrong nested level, always empty.
   The GET response nests it on tile: mas_data.tile.instructions.
   Consequence: update_mas(tile_id, name="...") without an explicit
   instructions= arg wiped the existing instructions on every call.
   Verified the correct path live: "instructions_len: 232" vs 0 before.
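A minimal sketch of the corrected read, with the response shape reduced to the relevant field:

```python
def read_instructions(mas_data: dict) -> str:
    """Read instructions from the GET response.

    The field nests under "tile", not at the top level; reading
    mas_data.get("instructions") silently returns nothing and then
    clobbers the real value on the next update.
    """
    return mas_data.get("tile", {}).get("instructions", "") or ""

# The buggy top-level read vs. the correct nested read:
resp = {"tile": {"instructions": "Route billing questions to agent B."}}
assert resp.get("instructions") is None       # old bug: always empty
print(read_instructions(resp))
```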

2. add_examples_queued spun up an in-process daemon thread that
   polled get_endpoint_status every 30s. When the CLI process exited,
   the thread died and examples were never added — silent data loss.
   Removed add_examples_queued, TileExampleQueue, get_tile_example_queue,
   the _tile_example_queue singleton, and the now-unused threading /
   Tuple imports.

3. Replaced the broken queue with a wait_for_online flag on
   add_examples (CLI: --wait). Blocks and polls every 30s for up to
   15 min (covers the ~10 min NOT_READY -> ONLINE wait after create_mas
   or a big update_mas, with headroom). No background queue — the
   caller process must stay alive for the wait.
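The wait loop can be sketched as below; get_status stands in for the real endpoint-status call, and sleep/clock are injectable so the loop is testable without real waits (that injection is an illustration choice, not necessarily how mas_manager.py structures it):

```python
import time

def wait_for_online(get_status, poll_s: float = 30.0, timeout_s: float = 900.0,
                    sleep=time.sleep, clock=time.monotonic) -> bool:
    """Block and poll every poll_s for up to timeout_s (15 min default
    covers the ~10 min NOT_READY -> ONLINE window with headroom).
    No background queue: the caller process must stay alive.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if get_status() == "ONLINE":
            return True
        sleep(poll_s)
    return False  # timed out still NOT_READY

# Fake endpoint that comes online on the third poll:
states = iter(["NOT_READY", "NOT_READY", "ONLINE"])
assert wait_for_online(lambda: next(states), sleep=lambda _: None)
```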

Also live-verified that the MAS PATCH endpoint is NOT partial:
missing `name` returns 400 Missing required field, missing `agents`
returns 400 "At least one BaseAgent must be provided". update_mas
already handles this internally (fetches existing + merges), so the
full-replace reality stays an internal detail of the HTTP layer —
callers see a partial-update-shaped API.

Skill doc updates:
- SKILL.md: reorder list_mas to the top of the check/manage block
  with a one-liner describing the return shape.
- SKILL.md: flag the ~10min NOT_READY wait on add_examples with --wait.
- SKILL.md: fix status legend from "(2-5 min)" to "up to ~10 min".
- 2-supervisor-agents.md: replace the dual add_examples / add_examples_wait
  block with a single add_examples [--wait] example.
- SKILL.md also includes the KA create-knowledge-source fix from the
  previous commit's companion page (PARENT + everything-in-json).

Co-authored-by: Isaac
Lakebase Autoscaling is the canonical path for all new Lakebase work
(autoscaling, branching, scale-to-zero, point-in-time restore). The
Provisioned skill covers the predecessor fixed-capacity tier; keeping
both causes agents to spend time deciding between them or picking the
older one. Delete the Provisioned skill and point everything at
autoscale.

Files deleted:
- databricks-skills/databricks-lakebase-provisioned/SKILL.md
- databricks-skills/databricks-lakebase-provisioned/connection-patterns.md
- databricks-skills/databricks-lakebase-provisioned/reverse-etl.md

Cross-references updated:
- install_skills.sh: drop from DATABRICKS_SKILLS list, description map,
  and reference-files map.
- README.md: replace the Provisioned bullet with a Lakebase Autoscale
  bullet under the same Development & Deployment section.
- databricks-python-sdk/SKILL.md, databricks-app-python/SKILL.md:
  redirect the "Related Skills" link to databricks-lakebase-autoscale.
- databricks-lakebase-autoscale/SKILL.md: drop the now-meaningless
  "Provisioned vs Autoscaling" comparison table and the predecessor
  link. Keep the one prose mention in computes.md explaining CU RAM
  sizing context — that's justification, not a link.

Co-authored-by: Isaac
Three independent doc improvements surfaced while exercising the
skills end-to-end:

databricks-aibi-dashboards SKILL.md
  Add a dense Statement Execution API fan-out snippet to Step 2 so
  multiple discovery probes (cardinality, top values, distribution,
  trend viability) run in parallel instead of serializing through
  `tools query`. Submit with `wait_timeout:"0s"` returns a
  `statement_id` immediately; `databricks api get
  /api/2.0/sql/statements/$SID` polls for state ∈
  PENDING|RUNNING|SUCCEEDED|FAILED|CANCELED|CLOSED. Live-verified
  against TPC-H samples — 5 probes in 17s wall time.
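The fan-out pattern (submit everything first, then poll all statement IDs to a terminal state) can be sketched with the API calls stubbed out; submit and poll here are stand-ins for the Statement Execution API submit-with-wait_timeout-"0s" and status-poll calls, not real client methods:

```python
import time

TERMINAL = {"SUCCEEDED", "FAILED", "CANCELED", "CLOSED"}

def run_probes(submit, poll, probes, poll_s=0.5, sleep=time.sleep):
    """Submit all probes up front (each submit returns a statement_id
    immediately), then poll every in-flight statement until each one
    reaches a terminal state. Returns {probe_name: final_state}.
    """
    ids = {name: submit(sql) for name, sql in probes.items()}  # all in flight
    results = {}
    while len(results) < len(ids):
        for name, sid in ids.items():
            if name not in results:
                state = poll(sid)
                if state in TERMINAL:
                    results[name] = state
        if len(results) < len(ids):
            sleep(poll_s)
    return results

# Fakes: each statement succeeds on its second poll.
n = {"v": 0}
def fake_submit(sql):
    n["v"] += 1
    return f"sid-{n['v']}"
seen = {}
def fake_poll(sid):
    seen[sid] = seen.get(sid, 0) + 1
    return "SUCCEEDED" if seen[sid] >= 2 else "RUNNING"

res = run_probes(fake_submit, fake_poll,
                 {"cardinality": "SELECT ...", "top_values": "SELECT ..."},
                 sleep=lambda _: None)
print(res)
```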

databricks-genie SKILL.md
  Same parallel-probe snippet in Step 1 (gated on "if you don't
  already know the data" so it's skippable). Then add two missing
  serialized_space rules to Field Format Requirements that
  previously caused 3+ retry rounds on space creation:
   - IDs must be unique across all three lists combined
     (text_instructions / example_question_sqls / sample_questions);
     the API rejects cross-list duplicates with "Duplicate
     instruction ID '...': first seen in ..., duplicated in ...".
   - data_sources.tables must be sorted by identifier;
     example_question_sqls and text_instructions must be sorted by
     id. (sample_questions is silently re-sorted server-side, so
     ordering isn't enforced for that list.)
  Plus a simple ID scheme that satisfies both rules in one go:
  per-list prefix + monotonic counter (1…0001 for sample_questions,
  2…0001 for example_question_sqls, 3…0001 for text_instructions).
  Authoring order = sort order, no collisions. Live-verified.
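A sketch of that ID scheme; the exact prefix values and zero-padding width are assumptions for illustration, since the source only fixes the shape (fixed per-list prefix plus monotonic counter):

```python
def make_ids(counts: dict[str, int]) -> dict[str, list[str]]:
    """Per-list prefix + zero-padded counter: IDs are unique across all
    lists combined (distinct prefixes) and each list is already sorted
    by id, so authoring order equals sort order.
    """
    prefixes = {"sample_questions": "1",
                "example_question_sqls": "2",
                "text_instructions": "3"}
    return {name: [f"{prefixes[name]}{i:04d}" for i in range(1, n + 1)]
            for name, n in counts.items()}

ids = make_ids({"sample_questions": 2, "example_question_sqls": 3,
                "text_instructions": 1})
flat = [i for lst in ids.values() for i in lst]
assert len(flat) == len(set(flat))                      # unique across lists
assert all(lst == sorted(lst) for lst in ids.values())  # already sorted by id
print(ids)
```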

databricks-spark-declarative-pipelines SKILL.md
  Step 1 now spells out the first-run flow (start-update on a
  freshly created pipeline, latest_updates is null until then) and
  gives a null-safe jq snippet so polling doesn't crash on
  never-run pipelines: (.latest_updates // [{}])[0]. The Updating
  section drops its duplicate start-update line and points back to
  Step 1 for the canonical command.
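The same null-safe access in Python, assuming only that latest_updates is null or absent until the first run:

```python
def latest_update(pipeline: dict) -> dict:
    """Null-safe first element of latest_updates, the Python analogue of
    jq's (.latest_updates // [{}])[0]. Never-run pipelines have a null
    or missing field, so fall back to a single empty dict."""
    return (pipeline.get("latest_updates") or [{}])[0]

assert latest_update({"latest_updates": None}) == {}   # never-run pipeline
assert latest_update({}) == {}                         # field absent entirely
print(latest_update({"latest_updates": [{"state": "COMPLETED"}]}))
```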

All three changes verified live on the workspace.

Co-authored-by: Isaac
Brings in 15 commits from main:
- Genie Code workspace installer (#88a8470, #3337c3b, #900d965)
- Builder-app MCP gateway (#16b9cd7, #c0e41c0, #1522c5a)
- Installer additions: Windsurf (#000c834), OpenCode (#a33bebb),
  interactive-mode detection (#5939e3d), description fix (#61a8c48)
- Dashboards 12-column grid migration (#02aac8c)
- Security dep bumps (#281d9ac)
- AI-functions v2 (#9205f42)
- Workspace upload directory exclusions (#ed571a0)
- VERSION bump (#1f51d8e)

Conflict resolutions:

databricks-aibi-dashboards/3-examples.md:
  Kept our deletion. The file on main still uses the legacy MCP API
  (get_table_stats_and_schema, execute_sql) we removed earlier this
  session. 4-examples.md auto-merged with main's 12-col grid tweaks.

databricks-aibi-dashboards/SKILL.md:
  Took main's 12-column widget sizes table and the GRID_V1 quality
  checklist line. Kept our "tested via CLI" wording (rejected main's
  regression to "tested via execute_sql"). Updated the layout-grid
  diagram from 6-col to 12-col while keeping the same three-row
  shape ([w=12] / [w=4][w=4][w=4] / [w=6][w=6]).

install.sh:
  Standardized on main's is_interactive() helper everywhere. Kept
  our previous-config persistence layer (HAS_PREVIOUS_CONFIG,
  SAVED_TOOLS, SAVED_PROFILE, SAVED_SCOPE, SAVED_SKILLS_PROFILE).
  Merged Windsurf + OpenCode into the saved-tools branch so prior
  selections for those tools are honored on re-runs. Took main's
  prompt_mcp_path function verbatim. Kept our extra "skip if not
  Claude or global scope" guard in prompt_auth.

databricks-app-python/4-deployment.md:
  Auto-merged cleanly. Main's "Excluded directories" subsection sits
  above our --overwrite paragraph; both stay.

bash -n install.sh passes. No conflict markers remain.

Co-authored-by: Isaac