Skip to content

[Bug] databricks-aibi-dashboards: queryLines array elements fuse without separator → invalid SQL deployed #501

@dbalistair

Description

@dbalistair

Summary

When building an AI/BI dashboard via the databricks-aibi-dashboards skill, dataset SQL is constructed as an array of strings under datasets[].queryLines. Lakeview concatenates those array elements with empty string (no newline, no space) when it executes the dataset. If any element ends with an identifier and the next element begins with a SQL keyword, the result is invalid SQL with fused tokens.

The skill's reference example (4-examples.md) happens to use trailing spaces on most lines, which sidesteps the problem cosmetically — but the convention is never explained as load-bearing. A reader picks up the structure and naturally writes lines without trailing whitespace, then ships broken SQL.

Reproduction

Build a dataset whose queryLines looks like this:

"queryLines": [
  "SELECT a.Id, a.Name, a.OwnerId",
  "FROM some_catalog.some_schema.some_table a",
  "JOIN other_table h ON a.OwnerId = h.user_id",
  "LEFT JOIN third_table p ON p.user_id = h.user_id",
  "LEFT JOIN fourth_table c ON c.account_id = a.Id",
  "WHERE COALESCE(a.IsDeleted, FALSE) = FALSE"
]

After deployment, fetching the dashboard back via manage_dashboard get and concatenating queryLines with empty string (which is what Lakeview does at execution) yields:

... ON a.OwnerId = h.user_idLEFT JOIN third_table p ON p.user_id = h.user_idLEFT JOIN fourth_table c ON c.account_id = a.IdWHERE COALESCE(a.IsDeleted, FALSE) = FALSE

user_idLEFT, IdWHERE — fused tokens. Dashboard renders with a SQL parse error in every widget that references the dataset.

Lines that happen to end with ,, (, or ) are immune (those characters are token delimiters in SQL). Lines that end with an identifier (alias, column, table) and are followed by a line starting with a keyword (LEFT, WHERE, JOIN, ORDER, GROUP, HAVING, UNION, etc.) are the failure case.

Expected behavior

Either:

  • Lakeview should join queryLines with newlines, OR
  • The skill should prevent broken array shapes from being deployed.

Actual behavior

Skill happily deploys an array of strings whose empty-string concatenation produces invalid SQL. The post-deploy widget error is generic ("Invalid widget definition" / SQL parse error), so the cause isn't obvious from the symptom.

The MANDATORY VALIDATION WORKFLOW in SKILL.md says to test every dataset query via execute_sql before deployment. That step passes — the SQL is valid when assembled in a SQL editor — but the queryLines array form post-join is what actually executes, and that form is never tested.

Suggested fixes (in order of effort)

  1. Docs warning, easy — add an explicit note in SKILL.md and 1-widget-specifications.md:

    WARNING: queryLines array elements are concatenated with empty string. Each element must end with whitespace, a newline, or a token delimiter (, ( )), OR the next element must begin with whitespace. Otherwise identifiers will fuse with adjacent keywords. Safest: pass the entire SQL as a single multi-line string in a one-element array with \n newlines preserved.

  2. Skill validation, medium — in the validation workflow, after building the dataset JSON, do an empty-string concat of queryLines and run that exact string through execute_sql. Catches the fusion case automatically.

  3. MCP tool normalization, robust — in manage_dashboard create_or_update, detect adjacent queryLines elements where last-char + first-char would fuse identifiers (regex: \w$ followed by ^\w) and either:

    • Auto-insert \n between them, OR
    • Always rewrite multi-element queryLines to a single-element array internally.

    This is the most resilient fix because no skill reader has to know the trap exists.

Workaround

Pass the entire SQL as one multi-line string in a one-element queryLines array:

"queryLines": ["WITH cte AS (\n  SELECT ...\n)\nSELECT ...\nFROM cte"]

Lakeview then splits on newlines for storage but preserves them, so the empty-string concat at execution time still yields valid SQL.

Environment

  • Skill: databricks-aibi-dashboards
  • MCP tool: manage_dashboard action create_or_update
  • Workflow followed: get_table_stats_and_schemaexecute_sql (passed) → manage_dashboard create_or_update (deployed broken SQL silently)
  • Detected by inspecting the deployed dashboard via manage_dashboard get and concatenating queryLines manually

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions