Skip to content

fix: unblock SP calls for UC in production#310

Open
atilafassina wants to merge 19 commits intomainfrom
sp-files
Open

fix: unblock SP calls for UC in production#310
atilafassina wants to merge 19 commits intomainfrom
sp-files

Conversation

@atilafassina
Copy link
Copy Markdown
Contributor

@atilafassina atilafassina commented Apr 23, 2026

Add on-behalf-of-user (OBO) support to the files plugin so SDK calls run as the end user, unblocking Unity Catalog volume access for OBO apps deployed on Databricks Apps.

What changed

Per-volume opt-in to OBO via a new auth field on each volume config:

files({
  volumes: {
    user_uploads: { auth: "on-behalf-of-user", policy: usersOnly() },
    shared:       { auth: "service-principal" }, // default; existing behavior
  },
})

When auth: "on-behalf-of-user":

  • HTTP routes require x-forwarded-user + x-forwarded-access-token (401 in
    production if either is missing; dev mode falls back to SP identity with a
    warning).
  • The identity passed to volume policies is the end user (not the SP).
  • SDK calls execute via runInUserContext so UC sees the user, not the SP.
  • files.auth_mode span attribute distinguishes OBO vs SP traffic in
    telemetry.
  • Programmatic appKit.files("vol").asUser(req).read(...) works on any volume
    (forces OBO regardless of the volume's default auth mode).

Note

OBO read tier intentionally disables caching (the cache layer keys by user; a write by user A wouldn't invalidate user B's view of the same path). Documented in the code; future work tracked for per-(volume, path) generation counters.

What's NOT changed

  • Default volume behavior is unchanged — existing apps keep auth: "service-principal" semantics with no migration.
  • The static manifest still grants WRITE_VOLUME to the SP for every volume. OBO volumes need the user to also have UC permission, which has to be communicated out-of-band until the manifest schema gains a per-volume auth scope. (TODO noted in getResourceRequirements.)

Hardening

  • Upload Content-Length parsing (strict regex, rejects negative / partial /
    array values; stream-level byte counter as the real gate).
  • Error-message hardening: _handleApiError no longer reflects raw SDK error.message on 4xx (CWE-209); generic body, raw to server log.
  • /read route is atomic: drains the body into a memory buffer up to maxReadSize; on overflow responds 413 atomically (no partial 200 body leak).
  • List-cache invalidation is symmetric: root-level writes invalidate both "__root__" and connector.resolvePath("/") so ?path=/ listings stay fresh.
  • _extractObiUser is strict in production (both headers required), warning + SP fallback only in dev.
  • _invalidateListCache is awaited before the HTTP success response so a follow-up GET /list in the same tick can't observe stale data.

Tests

  • 113 plugin tests + 28 integration tests + 33 helpers/policy
  • New tests cover: OBO read/write happy path + 401 paths, OBO write cross-user freshness (cache disabled), asUser() programmatic path, files.auth_mode span attribute on every code path, root-level cache invalidation key parity.

Test plan

  • Unit tests in packages/appkit/src/plugins/files/tests/
  • Integration tests against a real Express app (plugin.integration.test.ts)
  • pnpm -r typecheck

atilafassina and others added 16 commits April 23, 2026 15:03
Phase 1 of 2 — remove pre-policy 401 on missing x-forwarded-user; let the
volume policy decide via { id: <sp-id>, isServicePrincipal: true }.
asUser(req) keeps strict throw semantics; single logger.debug on fallback
replaces the dev-mode logger.warn. Startup no-explicit-policy warning
broadened to mention header-less HTTP. Tests rewritten to codify new
contract. Two pre-existing auto-generated files reformatted to match biome.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Phase 2 of 2 — non-behavioral polish. JSDoc on FilePolicyUser.id /
isServicePrincipal broadened to describe header-less HTTP as a valid
SP call origin. VolumeHandle JSDoc notes asUser(req) throws
AuthenticationError.missingToken regardless of NODE_ENV. Files-plugin
docs paragraphs that implied x-forwarded-user was mandatory have
been reworded. Auto-regenerated typedoc for FilePolicyUser included.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Adds optional `auth: "service-principal" | "on-behalf-of-user"` to both
VolumeConfig and IFilesConfig. Resolution order:
volume.auth ?? plugin.auth ?? "service-principal".

Removes the undocumented `bypassPolicy` parameter from createVolumeAPI
(zero callsites in packages/ or apps/, so no consumers to migrate).

Phase 1 of files-per-volume-auth-mode — config surface only, no routing
or SDK identity change yet. _resolveAuth is unused at this point and
will be wired in Phase 2.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Adds `_extractObiUser` to the files plugin and wires identity selection
into `_enforcePolicy` so it branches on `_resolveAuth(volumeKey)`:
- service-principal volumes: existing inline extraction (unchanged)
- on-behalf-of-user volumes: require x-forwarded-access-token + user
  headers; 401 in production when missing, dev-fallback to SP with a
  single warn in development

Only the policy-user identity changes here. UC SDK calls still execute
as the service principal until Phase 3 wires `runInUserContext`.

Phase 2 of files-per-volume-auth-mode.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Wires the seven read handlers (list, read, download, raw, exists,
metadata, preview) through a new `_runWithAuth(req, volumeKey, fn)`
helper that:
- runs `fn` directly on SP volumes — byte-for-byte identical to today
- wraps `fn` in `runInUserContext` on OBO volumes when both x-forwarded
  headers are present, so the SDK call and `getCurrentUserId()` resolve
  to the end user's identity. Cache keys use `getCurrentUserId()`, so
  per-user cache isolation falls out for free.

Policy still gates first; `_enforcePolicy` already 401s in production
when OBO headers are missing, so the dev-fallback path inside
`_runWithAuth` is only reachable under `NODE_ENV === "development"`.

Also rewrites the cache invalidation comment and the `VolumeAPI` /
`VolumeHandle` / `exports()` JSDoc to describe both modes (previously
all three claimed "operations execute as the service principal").

Phase 3 of files-per-volume-auth-mode — first end-to-end demoable
slice. Write routes (Phase 4) and `VolumeHandle.asUser` SDK identity
(Phase 5) are not yet wired.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Wires `_handleUpload`, `_handleMkdir`, and `_handleDelete` through
`_runWithAuth` so write traffic on OBO volumes executes as the end
user. SP volumes are byte-for-byte identical (no-op wrap fall-through).

The upload handler does a hand-rolled `fetch PUT` and calls
`client.config.authenticate(headers)` to populate auth headers — the
whole body now runs inside the user-context wrap, so the user-token
client is what authenticates the outgoing request.

Adds the non-negotiable upload-headers contract test: triggers an OBO
upload and asserts the outgoing fetch carries `Authorization: Bearer
USER-TOKEN-FOO` (not the SP's marker), proving the chain
runInUserContext → getWorkspaceClient → client.config.authenticate →
fetch headers is intact end-to-end. Verified the test fails meaningfully
if the wrap is removed.

Phase 4 of files-per-volume-auth-mode. Phase 5 (`VolumeHandle.asUser`
SDK identity fix) is next.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
`appKit.files("vol").asUser(req).<op>()` previously only swapped the
user identity passed to the volume policy — the underlying SDK call
still ran as the service principal. After this change, every method on
the returned handle is wrapped in `runInUserContext` with the request's
user identity, so the UC SDK call genuinely executes as the end user.
This is a hard override at the SDK level: it applies even on
`auth: "service-principal"` volumes, where it was previously a no-op
for SDK identity.

Implementation: a new `_wrapVolumeAPIInUserContext` helper wraps each
method in the returned `VolumeAPI`. Reuses `_buildUserContextOrNull` so
the dev-mode fallback (`NODE_ENV === "development"` + missing token)
skips the wrap and continues to execute as the SP — matching pre-OBO
behavior locally without a reverse proxy.

Strict `_extractUser` is unchanged — `asUser` still throws
`AuthenticationError.missingToken` in production when the user header
is missing.

Programmatic OBO note: `appKit.files("obo-vol").<op>()` (no req, no
asUser) cannot synthesize a user identity and continues to execute
against whatever client `getWorkspaceClient()` resolves to at the call
site (typically the SP at the top level). The OBO volume default
applies to HTTP route traffic via `_runWithAuth`. For programmatic
per-user execution, `asUser(req)` is the supported path.

Phase 5 of files-per-volume-auth-mode.

BREAKING CHANGE: programmatic callers of
appKit.files("vol").asUser(req).<op>() that expected the SDK call to
execute as the service principal (with only the policy seeing the
user) now see the SDK call execute as the user. Audit programmatic
asUser callers and remove the asUser wrap if SP credentials were the
desired behavior.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…phase 6)

Tags every Files plugin trace span with `files.auth_mode` set to either
`"service-principal"` or `"on-behalf-of-user"`, reflecting what
operationally happened (whether the SDK call ran inside
`runInUserContext`). Two complementary plumbing paths land on the same
attribute key:

- HTTP routes: route handlers compute the effective mode via a new
  `_effectiveAuthMode(req, volumeKey)` helper and thread it into the
  existing `PluginExecutionSettings.telemetryInterceptor.attributes`
  shape — `TelemetryInterceptor` already passes those attributes to
  `tracer.startActiveSpan`, so the attribute lands on the existing
  `plugin.execute` span without new infrastructure.
- Programmatic API (`exports()` SP path + `asUser` OBO path): wraps each
  VolumeAPI method in a new `_withAuthModeSpan(operation, mode, fn)`
  helper that calls `this.telemetry.startActiveSpan("files.<op>", ...)`.
  Programmatic calls previously created NO span, so this introduces new
  `files.<op>` spans on programmatic surface — a small observability
  enrichment beyond pure attribute plumbing. Spans respect the plugin's
  `traces` config; the noop tracer takes over when traces are disabled.

Updates `getResourceRequirements()` JSDoc to document the per-volume
permission split: SP volumes need the SP to hold `WRITE_VOLUME`; OBO
volumes need the end user (not the SP) to hold it. Adds a
`// TODO: extend plugin-manifest.schema.json` marker for the eventual
schema-level fix.

Phase 6 of files-per-volume-auth-mode. Phase 7 (docs, playground,
changelog) is next.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Final polish phase for files-per-volume-auth-mode:

JSDoc:
- Expand `VolumeConfig.auth` and `IFilesConfig.auth` with SP and OBO
  example blocks plus resolution-order notes.
- Extend `FilePolicyUser.isServicePrincipal` JSDoc with the full
  six-row matrix covering SP/OBO x HTTP/programmatic x header
  presence, including the `asUser(req)` rows.

Docs (`docs/docs/plugins/files.md`):
- New "Auth modes" section: two modes, resolution order, per-mode
  permission requirements, prod/dev OBO behavior, manifest-scope
  limitation, side-by-side config examples.
- New `usersOnly` policy example using `isServicePrincipal: false`
  and a "Policy user matrix" subsection.
- Rewrites every "all operations execute as the service principal"
  paragraph to describe both modes.
- Dedicated `asUser(req)` subsection with the honest limitation that
  programmatic OBO defaults don't apply without `asUser`.

Dev-playground:
- Server: new `obo_demo` volume with `auth: "on-behalf-of-user"` and
  a `usersOnly` policy; new smoke route `GET /policy/obo-volume`
  using `asUser(req)`. `app.yaml` gets the matching
  `DATABRICKS_VOLUME_OBO_DEMO` resource binding.
- Client: `policy-matrix.route.tsx` gets a "Per-volume OBO mode"
  section with two probes (HTTP route + programmatic smoke) so the
  end-to-end OBO path is exercisable from the UI.

Changelog:
- New `## Unreleased` section in `CHANGELOG.md` (lives above the
  release-it generated `[0.24.0]` block) covering the per-volume auth
  feature, the `files.auth_mode` span, the `asUser` SDK identity
  behavior change, the `bypassPolicy` removal, and the honest
  limitation about programmatic OBO without `asUser`.

Generated:
- `types.generated.ts` and `plugin-manifest.generated.ts` are
  regenerated formatting (line-collapsing) from `pnpm build`'s
  generator step. Content unchanged.

Backpressure: pnpm build, pnpm docs:build, pnpm check:fix,
pnpm -r typecheck, pnpm test (1666 passing) all clean.

Phase 7 of files-per-volume-auth-mode — feature is shippable.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
…che, spans

Addresses four findings from the multi-model review of the files OBO
feature.

1. asUser privilege confusion in production (CRITICAL, security)
   `_extractUser` now requires BOTH `x-forwarded-user` AND
   `x-forwarded-access-token` in production — throws
   `AuthenticationError.missingToken` when either is missing. Previously
   a request with the user header but no token returned a SP-wrapped
   API where the policy saw the request as a real end user
   (isServicePrincipal: undefined) but the SDK ran with SP credentials,
   so policies like `usersOnly: !user.isServicePrincipal` were
   satisfiable while wielding the SP's broader UC grants. CWE-639.
   Dev fallback now marks the returned policy user as
   `isServicePrincipal: true` and warns (was debug) so policies can't
   be tricked even in dev.

2. Double WorkspaceClient allocation per OBO request (HIGH, perf)
   New `_resolveAuthForRequest(req, volumeKey)` returns
   { mode, userCtx } and builds the UserContext at most once per
   request. `_runWithAuth(userCtx, fn)` now takes the pre-built ctx.
   Removes `_effectiveAuthMode` (no longer needed). Every OBO HTTP
   request previously constructed two WorkspaceClients; cache hits
   paid that cost too. Now: one allocation, even on cache hits.

3. Write-cache invalidation on OBO + wrong path arg (HIGH, correctness)
   Two related bugs at `_invalidateListCache`: (a) cache keys included
   `getCurrentUserId()`, so user A's write only busted user A's list
   cache — user B continued to see stale listings; (b) the `path` arg
   was the file path instead of the parent directory.

   Fix: list-cache is disabled on OBO volumes (no cross-user staleness
   possible). On SP volumes, invalidation now derives the parent
   directory via `parentDirectory()`, falling back to the `"__root__"`
   sentinel for root-level writes — matches `_handleList`'s key shape.
   Cache delete + path resolution are wrapped in best-effort try/catch
   so an invalidation failure cannot convert a successful write into
   HTTP 500.

4. Duplicate `files.<op>` spans on programmatic calls (HIGH, perf)
   `FilesConnector.<op>` already opens a `files.<op>` span via its
   `traced()` decorator. The Phase 6 `_withAuthModeSpan` opened an
   identical span on top, doubling span allocation/export volume on
   every programmatic call.

   Fix: new `runWithFilesSpanAttributes(attrs, fn)` exported from
   `connectors/files/client.ts` uses AsyncLocalStorage to make ambient
   attributes available to the connector's `traced()` decorator.
   `_withAuthModeSpan` renamed to `_withAuthModeAttributes` — no longer
   creates a span; just sets `files.auth_mode` on the connector's
   existing span via the ALS channel. Programmatic calls now produce
   exactly one `files.<op>` span. HTTP route attribute path
   (`PluginExecutionSettings.telemetryInterceptor.attributes`) is
   unchanged.

Tests: 8 new regression guards covering each fix's failure mode (1674
total, was 1666). The pre-existing `DownloadResponse` unused-import
warning is also resolved as cleanup fallout.

User-facing notes (for changelog follow-up):
- asUser(req) throws on missing token in production (was silent SP
  fallback — the previous behavior was a privilege-confusion bug).
- OBO volume reads are no longer cached. Trade-off: every OBO read
  hits the SDK; in exchange, no cross-user staleness. SP volume reads
  still cache.
- Programmatic appKit.files(vol).<op>() emits one `files.<op>` span
  instead of two. The `files.auth_mode` attribute now lands on the
  connector's existing span.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Two follow-up review fixes on top of 3f98628.

A — _invalidateListCache not awaited (medium, correctness)
   Write handlers (_handleUpload, _handleMkdir, _handleDelete) called
   _invalidateListCache without `await`, so res.json() shipped before
   cache invalidation completed. A client that issued a follow-up
   GET /list in the same tick could hit the stale cache.

   Fix: _invalidateListCache is now genuinely `async` and all three
   call sites `await` it before sending the response. Best-effort
   try/catch around cache.delete and connector.resolvePath remains, so
   an invalidation failure cannot convert a successful write into 500.

   Regression test installs a deferred-promise cache.delete and
   asserts res.json is NOT called until the delete settles. Verified
   the test fails when the `await` is removed and passes when it is
   added back.

B — hardcoded ports in integration tests (medium, CI hygiene)
   plugin.integration.test.ts bound to fixed offsets of TEST_PORT
   (9880, 9881, 9882). Concurrent CI runs or stale test processes
   risked EADDRINUSE flakiness.

   Fix: bind `port: 0` so the OS assigns ephemeral ports. New
   getListeningPort() helper waits for the server's `listening` event
   and returns the assigned port for building localBase URLs. Chose
   port-0 over supertest because the tests build their own AppKit
   instance, hold the Server for afterAll cleanup, and use real fetch
   calls — moving to supertest would have required restructuring the
   fixture lifecycle.

Tests: 1675 (was 1674 + 1 new). typecheck + biome clean.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Signed-off-by: Atila Fassina <atila@fassina.eu>

# Conflicts:
#	CHANGELOG.md
#	apps/dev-playground/server/index.ts
- _readPathQuery helper rejects array/object query params with 400 across
  all 7 path-bearing handlers (was: req.query.path raw-cast)
- /read streams via connector.download + size-enforcing TransformStream
  capped at maxReadSize (new VolumeConfig field, default 10MB); /read no
  longer participates in the read-tier cache
- Upload TransformStream enforces bytesReceived <= contentLength when
  declared, closing a per-user policy bypass where small Content-Length
  + larger body would exceed an approved upload size
- _enforcePolicy 401 and _handleApiError 4xx now return generic public
  bodies (Unauthorized / standard HTTP STATUS_CODES); raw error.message
  goes to server-side logs only (CWE-209)

Co-authored-by: Isaac

Co-authored-by: Orca <help@stably.ai>
Signed-off-by: Atila Fassina <atila@fassina.eu>
Two HIGH correctness findings from multi-model review of sp-files vs main:

1. /read could leak a partial 200 body before triggering 413. Once
   nodeStream.pipe(res) began, headers were sent and the size-cap
   fallback became res.destroy() mid-stream — breaking the all-or-nothing
   contract. Replaced the pipe with a bounded getReader() drain into
   chunks; on overflow we respond 413 atomically (and reader.cancel()
   the upstream), otherwise res.send(Buffer.concat(chunks, bytesRead))
   ships the body in one shot. Memory still bounded by maxReadSize.

2. _invalidateListCache deleted only "__root__" for root-level writes,
   but _handleList keys ?path=/ listings under connector.resolvePath("/").
   A client listing via ?path=/ saw stale data after a sibling write.
   Invalidator now deletes both keys (with try/catch around resolvePath
   and a no-op when it equals the sentinel).

113 plugin tests + 28 integration tests + workspace typecheck green.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
@atilafassina atilafassina changed the title feat: unblock SP calls for UC in production fix: unblock SP calls for UC in production May 4, 2026
@atilafassina atilafassina marked this pull request as ready for review May 4, 2026 21:41
@atilafassina atilafassina requested a review from a team as a code owner May 4, 2026 21:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds per-volume authentication mode support to the Files plugin so Unity Catalog SDK calls can execute either as the service principal (default) or on-behalf-of-user (OBO) using forwarded user headers, unblocking UC volume access for OBO apps in production and improving telemetry + request hardening.

Changes:

  • Introduces auth configuration at both plugin (IFilesConfig.auth) and volume (VolumeConfig.auth) levels, wiring HTTP routes and VolumeHandle.asUser(req) through runInUserContext for OBO execution.
  • Adds files.auth_mode telemetry tagging across HTTP + programmatic code paths (implemented via AsyncLocalStorage attribute propagation into connector spans).
  • Hardens HTTP behavior: stricter header/token requirements in production, safer error responses, upload Content-Length enforcement, and /read response-size limiting via maxReadSize.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
packages/appkit/src/plugins/files/types.ts Adds new config fields (auth, maxReadSize) and updates public API docs to reflect SP vs OBO execution semantics.
packages/appkit/src/plugins/files/policy.ts Expands FilePolicyUser docs to clarify id/isServicePrincipal behavior across auth modes and call paths.
packages/appkit/src/plugins/files/plugin.ts Implements auth resolution, OBO context wiring (runInUserContext), telemetry attributes, query parsing hardening, cache behavior changes, and updated error handling.
packages/appkit/src/connectors/files/client.ts Adds ALS-based span attribute propagation (runWithFilesSpanAttributes) merged into connector spans.
packages/appkit/src/plugins/files/tests/plugin.test.ts Adds extensive unit coverage for auth inheritance, OBO identity selection, cache behavior, invalidation timing, and telemetry attributes.
packages/appkit/src/plugins/files/tests/plugin.integration.test.ts Improves integration test reliability by using ephemeral ports and adds coverage for header-less SP-mode behavior.
docs/docs/plugins/files.md Documents per-volume auth modes, policy-user matrix, and programmatic asUser(req) semantics.
docs/docs/api/appkit/Interface.FilePolicyUser.md Updates generated API docs for FilePolicyUser fields and behavior matrix.
CHANGELOG.md Adds an Unreleased entry describing new auth modes, telemetry attribute, behavior changes, and limitations.
apps/dev-playground/server/index.ts Adds an OBO demo volume + probe route to validate end-to-end OBO behavior.
apps/dev-playground/client/src/routes/policy-matrix.route.tsx Adds UI smoke tests/panels for the per-volume OBO demo.
apps/dev-playground/app.yaml Adds env wiring for the OBO demo volume.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/appkit/src/plugins/files/plugin.ts
Comment thread packages/appkit/src/plugins/files/plugin.ts Outdated
Comment thread docs/docs/plugins/files.md Outdated
Comment thread packages/appkit/src/plugins/files/plugin.ts Outdated
Comment thread docs/docs/plugins/files.md Outdated
… msg, docs

Four findings from the GitHub Copilot review on PR #310:

1. _invalidateListCache missed absolute UC root paths. A write to
   /Volumes/<c>/<s>/<v>/foo.txt has parentDirectory("/Volumes/<c>/<s>/<v>"),
   which the prior code didn't recognize as root-level — the rootless
   "__root__" listing stayed stale. Now the invalidator computes the
   volume root via connector.resolvePath(""), recognizes it as
   root-level, and invalidates all three keys "__root__", <volumeRoot>,
   <volumeRoot>/. Replaces the previous resolvePath("/") attempt that
   was a no-op (resolvePath rejects "/" since it's not /Volumes/...).

2. _extractObiUser threw "Missing x-forwarded-access-token" in every
   non-(token && userId) production case, so a request with only the
   token (but no x-forwarded-user) got a misleading server-side error.
   Now distinguishes: !token throws "missing token", token-but-no-userId
   throws "missing user header". HTTP body remains generic "Unauthorized"
   either way.

3. docs/plugins/files.md said OBO volumes get per-user cache isolation
   automatically. Code disables read/list cache entirely on OBO
   (cross-user invalidation isn't possible with the current key scheme).
   Doc updated to match.

4. docs/plugins/files.md said asUser(req) throws when x-forwarded-user
   is absent. Code requires BOTH x-forwarded-user AND
   x-forwarded-access-token in production. Doc updated at both
   occurrences.

Tests: existing root-level invalidation test rewritten as a
parameterized matrix covering both relative ("newdir") and absolute
("/Volumes/.../newdir") root-level writes; both must invalidate
__root__ + <volumeRoot> + <volumeRoot>/. 188 plugin tests + workspace
typecheck green.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Comment thread packages/appkit/src/plugins/files/plugin.ts Outdated
@MarioCadenas
Copy link
Copy Markdown
Collaborator

Agentic review — performance & placement notes (Xavier panel)

This comment summarizes three medium-severity performance / cost-of-operations findings from an automated review pass (Xavier remoras on origin/sp-files vs origin/main). They are not correctness or security blockers; they describe tradeoffs and tuning opportunities.

Where to look: PR “Files changed”packages/appkit/src/plugins/files/plugin.ts (search _readSettings, _handleRead).


1) HTTP /read no longer uses read-tier cache; uses download path

Summary: /read loads bytes via connector.download + _downloadSettings. Inline comments note /read no longer participates in the read-tier cache. For service-principal volumes, hot repeated /read of the same small object may hit the SDK/network every time instead of benefiting from the read cache → higher latency, API usage, and load.

Anchor: _handleRead — block calling connector.download(getWorkspaceClient(), path) with _downloadSettings(mode) (~L821–L833 on sp-files).


2) /read accumulates stream chunks then Buffer.concat

Summary: The handler reads via getReader(), chunks.push(value) per read, then Buffer.concat(chunks, bytesRead). Many tiny chunks → extra allocations, a large chunks array, and GC churn, while staying within maxReadSize. Peak memory can spike briefly.

Anchor: _handleRead reader loop + Buffer.concat (~L843–L873 on sp-files).


3) OBO volumes: read-tier cache disabled by design

Summary: _readSettings sets cache.enabled: false when authMode === "on-behalf-of-user". Correctness motivation: cache keys tie to getCurrentUserId(), so cross-user staleness would be wrong for OBO without richer keying. Tradeoff: higher latency and backend cost on OBO-heavy workloads until e.g. per-(volume, path) generation counters (as noted in comments).

Anchor: _readSettings (isObo branch) (~L542–L555 on sp-files).


Bonus (docs — flagged by correctness + security remoras)

Stale NOTE on _enforcePolicy: The doc block still implies SDK execution stays on the service principal and OBO is a “later phase,” but handlers now use _runWithAuth / runInUserContext for OBO. Updating that NOTE avoids maintainer confusion (search for “still uses the service principal until a later phase”).


Panel verdict (context): correctness approve, security approve, performance approve — no request changes / rethink from the automated panel; items above are follow-ups / polish.

Typo fix per review (OBO = on-behalf-of-user, not "Obi").
Touches the declaration, the single call site in `_enforcePolicy`, and
one test-comment reference.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
The NOTE claimed SDK calls "still use the service principal until a later
phase wires the actual OBO runInUserContext plumbing" — but that phase
shipped: every handler now wraps in `_runWithAuth` / `runInUserContext`
when the volume resolves to OBO. Replace the NOTE with the current truth:
the policy and SDK identities are selected separately and converge per
the policy-user matrix.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants