Skip to content

feat(appkit): Standard Schema body validation in Plugin.route()#344

Draft
atilafassina wants to merge 14 commits intomainfrom
validate-req-body
Draft

feat(appkit): Standard Schema body validation in Plugin.route()#344
atilafassina wants to merge 14 commits intomainfrom
validate-req-body

Conversation

@atilafassina
Copy link
Copy Markdown
Contributor

@atilafassina atilafassina commented May 5, 2026

Summary

  • Add Standard Schema body validation to Plugin.route(). Routes can declare a body schema (Zod, Valibot, or any spec-compliant validator); requests that fail the schema get a canonical 400 with { error, code, requestId, issues } in dev/override mode and a redacted body in production. Validation runs before plugin-level authentication, so the 400 payload is reachable by anonymous callers — schema disclosure is opt-in via exposeValidationErrors.
  • Apply validation to built-in plugins: analytics (/query/:key), vector-search (filter + query bodies), files (mkdir). Serving validation is deferred via JSDoc until typed schemas are generated.
  • Shared resolveRequestId(req) helper unifies the wide-event logger and the validation wrapper on a single header lookup (x-request-id, x-correlation-id, x-amzn-trace-id) plus a WeakMap-memoized fallback, so the response requestId and the wide-event log's request_id always agree per request — including in the headerless case.
  • Runtime warning at route registration when exposeValidationErrors=true is set in production — operators see the schema-disclosure misconfiguration on first deploy.
  • Fallback request IDs use randomBytes(8).toString('hex') (exactly 16 hex chars, single allocation) instead of randomUUID().slice(0,16) (which produced 14 hex + 2 hyphens with a misleading JSDoc claim of 16 hex).

Test plan

  • Manual: send a request with an invalid body to a route that declares a schema → canonical 400 with issues (dev) or redacted body (prod)
  • Manual: send x-amzn-trace-id: Root=1-...;Parent=...;Sampled=1 → response requestId matches the AWS X-Ray Root segment
  • Manual: send a request with no correlation header → wide-event log's request_id equals the response body's requestId
  • Manual: enable exposeValidationErrors with NODE_ENV=production → operator-facing warning logs at route registration
  • Unit + type: pnpm exec vitest run --project=appkit --project=appkit-ui (1497 passing) + pnpm -r typecheck (clean)

atilafassina and others added 14 commits April 17, 2026 14:24
Extends RouteConfig with optional `body` (Standard Schema v1) and
`exposeValidationErrors` fields. Plugin.route() wraps handlers with a
validation closure when a schema is present; zero-overhead pass-through
otherwise. On validation failure, emits a canonical 400 body
{ error, code: "VALIDATION_ERROR", requestId, issues? } — `issues` is
included in development or when the route opts in; production omits it
by default. A per-request correlation ID is read from x-request-id or
generated, and full issues are logged server-side.

Migrates genie plugin's POST /:alias/messages as the canary: a Zod
schema replaces the manual truthy check on `content`, and the handler
receives a typed `req.body` with no casts.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Replaces the `as IAnalyticsQueryRequest` cast on
POST /api/analytics/query/:query_key with a Zod schema covering
`parameters?` and `format?` (the JSON default now lives in the schema).
The now-unused IAnalyticsQueryRequest, AnalyticsFormat, and
AnalyticsQueryResponse types are deleted; appkit-ui has its own
AnalyticsFormat definition and no external consumer imports the others.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Adds Zod schemas for POST /:alias/query and POST /:alias/next-page,
including a cross-field refinement on the query route that enforces at
least one of queryText or queryVector is present (issue path
["queryText"] for stable client-side rendering). Replaces the manual
truthy checks on both routes.

SearchRequest stays — the programmatic `plugin.query(alias, request)`
exports still contract on it.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Ports the mkdir-specific path rules (non-empty, max 4096 chars, no
null bytes) into a Zod schema refinement on POST
/api/files/:volumeKey/mkdir. The manual `typeof req.body?.path` check
and the in-handler `_isValidPath` call for mkdir are deleted.

`_isValidPath` itself stays intact — it still guards the GET query-
based routes and the upload route, which migrate to schema validation
in a separate follow-up PR.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Adds a JSDoc block above ServingPlugin explaining why serving does not
use the new RouteConfig.body mechanism — its body shape is resolved
dynamically per-alias via filterRequestBody() against disk-cached
endpoint schemas, not statically declarable as a RouteConfig field.

The comment includes a TODO(serving-validation-follow-up) marker and a
placeholder for the tracking GitHub issue URL (to be filled in once
the issue is opened).

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Addresses four confirmed blockers and three nits surfaced by the
multi-model debate review:

- Wrap schema.validate() in try/catch — thrown refinements and async
  rejections now return a canonical 500 with { error, code:
  "VALIDATION_INTERNAL_ERROR", requestId } instead of escaping to a
  stack-leaking 500. Thrown message is never reflected; only errorName
  is logged server-side.
- Add size caps to every Zod schema landed by this PR (genie content/
  conversationId, analytics parameters key count, vector-search
  queryText/queryVector/columns/numResults/filters/pageToken). Closes
  the DoS surface where a 10MB body would be fully parsed before
  rejection.
- Sanitize x-request-id: enforce /^[A-Za-z0-9_-]{1,100}$/ on the
  incoming header; discard and regenerate if it fails. Kills the CRLF
  log-injection (CWE-117) and reflection vectors.
- Cap validation issues at 20 during response + logging. Adds
  `issuesTruncated: true` flag on responses when more existed (dev
  mode / override only). Server-side logs omit issue.message and only
  record paths + counts.
- nit: remove redundant path:["path"] from files mkdir refinement
  (field-level refine already emits at that path).
- nit: fail-fast schema check at route() registration — bad body
  schemas now throw at plugin startup, not first request.
- nit: switch requestId fallback from randomBytes(8) to
  randomUUID().slice(0,8) (V8-optimized, no Buffer alloc).

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Adds two overloads on Plugin.route():
- When `body: TSchema` is present, TBody is inferred from
  StandardSchemaV1.InferOutput<TSchema>. Plugin authors no longer
  write `<TBody>` explicitly — the compiler derives it from the
  schema.
- When `body` is absent, TBody is `unknown`.

This closes the type hole where a plugin could declare
`this.route<{ foo: string }>({ body: z.object({ bar: z.number() }) })`
and have the declared TBody silently diverge from the schema's actual
output type. Runtime was always safe; now the compile-time types are
too.

Removes the now-redundant explicit `<TBody>` at all four migrated
plugin call sites (genie, analytics, vector-search query + next-page,
files/mkdir).

The RouteConfig<TBody = any> default stays as-is — it's load-bearing
for backward compat with handlers typed as plain express.Request.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…257)

* docs: add plugin best-practices reference guide

Extracts patterns from the 5 core plugins (analytics, genie, files,
lakebase, server) into a structured reference with 9 sections and
three severity tiers (NEVER/MUST/SHOULD).

Signed-off-by: Atila Fassina <atila@fassina.eu>

* docs: integrate best-practices reference into create-core-plugin skill

Adds a "Load Best Practices Reference" step before scaffolding
decisions so guidelines are applied during plugin creation.

Signed-off-by: Atila Fassina <atila@fassina.eu>

* fix: correct inaccuracies in plugin best-practices reference

Fixes 3 issues found during dry-run validation against analytics
and files plugins:
- Plugin extends Plugin (not Plugin<TConfig>)
- @internal goes on toPlugin export, not the class
- isInUserContext() patterns clarified per actual usage

Signed-off-by: Atila Fassina <atila@fassina.eu>

* docs: add review-core-plugin and audit-core-plugin skills

Signed-off-by: Atila Fassina <atila@fassina.eu>

* fix: address dry-run findings in audit and review skills

- Add cacheKey two-stage pattern guidance to prevent false positives
- Add N/A status option for non-applicable scorecard categories
- Clarify connector scope in structural completeness check
- Add deduplication guidance for cross-category findings
- Add recovery hint to plugin-not-found error message

Signed-off-by: Atila Fassina <atila@fassina.eu>

* fix: address review feedback on plugin skills

- Remove trailing slash from route prefix example in best-practices
- Use deterministic directory-existence check instead of name-pattern
  heuristic for plugin vs branch disambiguation in review-core-plugin
- Downgrade defaults.ts from MUST to SHOULD in audit-core-plugin since
  not all plugins (e.g. server, lakebase) require it

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

* chore: address code review

* refactor: extract shared plugin review guidance to reduce duplication

Move category index, severity ordering, deduplication rules, and
cache-key tracing instructions into a shared reference file so audit
and review skills stay in sync from a single source of truth.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

* docs: clarify getResourceRequirements patterns in plugin best practices

Split the rule into config-gated flip (optional→required) and dynamic
discovery (required placeholder + dynamic emission) variants. Reference
the custom-plugins.md caching example for the flip pattern and
FilesPlugin/ServingPlugin for the discovery pattern.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

* docs: clarify scorecard numbering in audit-core-plugin

Add a note that category 0 (Structural Completeness) is a pre-check
with no matching section in plugin-best-practices.md, while 1–9 mirror
the best-practices sections.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

---------

Signed-off-by: Atila Fassina <atila@fassina.eu>
…hared/ (#279)

* fix: add AST extraction to serving type generator and move types to shared/

The CLI generate-types command was falling back to the DATABRICKS_SERVING_ENDPOINT_NAME
env var instead of extracting endpoint aliases from server.ts, producing a single
"default" key instead of the actual aliases (e.g., "first", "second", "third"). This
causes build failures when client code references typed endpoint aliases.

Additionally, generated type declarations are moved from client/src/appkit-types/ to
shared/appkit-types/ so both server and client tsconfigs can see the ServingEndpointRegistry
module augmentation (server tsconfig already includes shared/**/*).

Co-authored-by: Isaac

* fix: update serving vite-plugin test for shared/ output path

The outFile path changed from client/src/ to shared/ but the test
assertion was not updated.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* fix: update gitignore for shared/ types and add tsc to template server build

- Move serving types gitignore from client/src/ to shared/ (dev-playground + template)
- Add old file cleanup in generator to prevent duplicate module augmentation
- Add tsc -b to template build:server for type-checking parity with build:client
- Simplify template typecheck script to use tsc -b consistently

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* fix: add old file cleanup for analytics types too

Mirror the serving types cleanup — remove stale
client/src/appkit-types/analytics.d.ts after generating at the new
shared/ location.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* fix: deduplicate migration cleanup, add error handling, remove dead code

- Extract removeOldGeneratedTypes into migration.ts to avoid duplication
  and circular dependency between index.ts and serving/generator.ts
- Add try-catch to resolveEndpointsFromServerFile so AST parse failures
  fall back gracefully to env var
- Remove unused root variable in analytics vite-plugin
- Clean up template tsconfig.server.json: remove dead emit options
  (outDir, declaration, declarationMap, sourceMap) since noEmit is set

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* feat: make serving alias optional for single-endpoint registries

ServingFactory now adapts based on endpoint count:
- Empty registry: alias optional, untyped
- Single key: alias optional, fully typed (defaults to the only endpoint)
- Multiple keys: alias required for disambiguation

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* fix: register named routes in unnamed serving mode for type-generated clients

When serving() is called without explicit endpoints (unnamed mode), the
type generator still creates a "default" alias. Client hooks then
construct /api/serving/default/invoke, but the server only had /invoke
registered — causing a 404.

Now unnamed mode registers both /invoke and /:alias/invoke (same for
stream) so both URL patterns work.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* feat: auto-migrate project configs for shared/ types output path

When type generation runs, automatically patch existing apps' config
files to match the new shared/appkit-types/ output location:

- tsconfig.client.json: add "shared/appkit-types" to include
- tsconfig.server.json: switch from emit mode to noEmit
- package.json: update build:server and typecheck scripts

Each sub-migration is idempotent (content-based detection, no marker
files). Opt-out via "appkit": { "autoMigrate": false } in package.json.
Vite plugins now pass projectRoot explicitly to generators.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* fix: improve migration robustness, add type tests, fix stale CLI help

- Log warnings on migration sub-function failures instead of silent catch
- Track migrated projects per-root (Set) instead of global boolean flag
- Validate resolveProjectRoot output by checking for package.json
- Add compile-time type tests for IsUnion and ServingFactory conditional
- Update CLI help text examples to reference shared/appkit-types/ paths

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* fix: clean up pre-release type files (appKitTypes.d.ts, appKitServingTypes.d.ts)

Target the actually shipped filenames for old type cleanup instead of
the intermediate appkit-types/ directory that was never released.
Fix outdated JSDoc comments referencing old paths.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

* docs: update stale type file path references to shared/appkit-types/

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

---------

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
Signed-off-by: databricks-appkit[bot] <databricks-appkit[bot]@users.noreply.github.com>
Signed-off-by: databricks-appkit[bot] <databricks-appkit[bot]@users.noreply.github.com>
Six blockers from the round-2 multi-model debate:

- Restore body-less generic route<T>() overload — apps/dev-playground/
  server/reconnect-plugin.ts uses `this.route<ReconnectResponse>(...)`
  with no body schema. Adding a `route<TBody>(router, config & { body?:
  undefined })` overload (ordered first so explicit generics route
  there) lets the legacy call form keep compiling.

- Strict discriminated-union check on validator results. A validator
  returning `{ issues: undefined }` or `{}` no longer falls through to
  the success path with `req.body = undefined`; it routes to the
  canonical 500 (VALIDATION_INTERNAL_ERROR) without calling the handler.

- Bound vector-search filter sub-structures: keys ≤255, string values
  ≤1024, nested arrays ≤100 entries with each element bounded. Closes
  the round-1-deeper-layer DoS surface where 50-key filters could
  smuggle a 99KB string value past the schema cap.

- Tighten analytics `parameters` value type to a constrained union of
  JSON primitives + the SQLTypeMarker shape. The marker schema is
  `.strict()` so callers can't smuggle extra fields; `value` is capped
  at 4096 chars. Both `sql.*(...)` markers and bare primitives are
  accepted; arbitrary nested objects are rejected.

- Strong JSDoc warning on RouteConfig.exposeValidationErrors explaining
  that body validation runs BEFORE plugin-level auth (which lives
  inside the handler via asUser(req)), so enabling the flag in
  production exposes schema structure to anonymous attackers. Plus a
  new "When to opt into exposeValidationErrors" subsection in the
  plugin authoring docs.

- Fix docs examples that used the now-redundant `<SendMessageBody>`
  generic on this.route() — the schema-bearing overload infers TBody
  automatically.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Three small cleanups from the round-2 review's "judgment call" tier,
plus a sanitizer-sharing refactor that fixes a real correlation drift:

- Lift `sanitizeRequestId` into a shared module
  (`packages/appkit/src/logging/request-id.ts`). Both
  `Plugin.route()`'s `resolveRequestId` and the wide-event logger
  now share one allowlist + cap. Previously the two disagreed on
  what counted as a valid `x-request-id`, so a client-visible
  requestId from a 400 response could not be matched against the
  server-side wide-event log.
- Unified regex: `/^[A-Za-z0-9][A-Za-z0-9_.-]{0,127}$/`. Accepts dots
  (W3C traceparent compatibility), requires alphanumeric start
  (preventing `--rf`-style values from being misinterpreted as flags
  if a requestId ever flows into a shell pipeline), max length 128.
- Widen fallback ID entropy from 32 bits (`randomUUID().slice(0, 8)`)
  to 64 bits (`.slice(0, 16)`). Birthday-collision boundary moves
  from ~65k IDs to ~4B.
- Delete the misleading "AppKitError-compatible path so sensitive
  values inside context are redacted" comment in the validator-throw
  catch path. The logger has no redactor; only `errorName` is logged
  today, but the comment was a trap for future maintainers. Replaced
  with "no redactor; do not pass user-supplied values."

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Four findings from the round-3 multi-model debate, plus follow-up
hardening from a round-4 review.

- Strict spec-compliant validator-result branching: any array-valued
  `issues` field is now classified as failure (per Standard Schema v1
  spec — `FailureResult` permits empty `issues`). Round-2 incorrectly
  treated `{ issues: [] }` as success; this is now a canonical 400 with
  an empty `issues` array in dev / override responses. Mixed shapes
  `{ value, issues: [] }` also route to failure (issues present wins).
  Malformed shapes (`{}`, `{ issues: undefined }`, non-objects, AND
  `{ value, issues: <non-array> }`) route to canonical 500 — the
  500-gate is `hasIssuesField || !hasValueField` so a present-but-
  non-array `issues` cannot slip past `Array.isArray` into the
  success path.

- Drop JSON primitives from analytics `parameters` schema. Primitives
  passed schema validation but failed late inside QueryProcessor via
  SSE — clients got an opaque stream error instead of the canonical
  400. Schema now accepts only `SQLTypeMarker | null`. Callers must
  wrap raw values with `sql.string()`, `sql.number()`, etc. Cast on
  the validated body simplified accordingly.

- Runtime warning at route registration when `exposeValidationErrors`
  is true and `NODE_ENV === "production"`. Operators see the
  schema-disclosure misconfiguration on first deploy instead of
  relying on JSDoc alone.

- Shared `resolveRequestId(req)` helper in `logging/request-id.ts`,
  consulting headers in order: `x-request-id`, `x-correlation-id`,
  `x-amzn-trace-id`. Both `Plugin.route()` and the wide-event logger
  now use the same helper. The resolver memoizes per-request via a
  `WeakMap<object, string>` so the headerless fallback path returns
  the same ID across the four call sites (logger.ts plus three
  branches in plugin.ts) — without memoization each call generated
  a fresh fallback and the wide-event `request_id` no longer matched
  the response `requestId`. The `x-amzn-trace-id` header now parses
  the `Root=` segment out of the AWS X-Ray format
  (`Root=1-...;Parent=...;Sampled=1`) before sanitizing, so AWS
  callers actually benefit from the consultation instead of falling
  through to a generated fallback. Generated fallbacks use
  `randomBytes(8).toString("hex")` for exactly 16 hex chars (~64
  bits) — no embedded UUID hyphens, single allocation.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants