Skip to content

fix: batch of 8 onboarding / MCP / harness / chat / auth bugs#39

Merged
DIodide merged 2 commits intostagingfrom
fix/ui-chat-bug-batch
Apr 21, 2026
Merged

fix: batch of 8 onboarding / MCP / harness / chat / auth bugs#39
DIodide merged 2 commits intostagingfrom
fix/ui-chat-bug-batch

Conversation

@DIodide
Copy link
Copy Markdown
Owner

@DIodide DIodide commented Apr 21, 2026

Summary

Batched fixes for 8 reported bugs spanning onboarding, MCP status display, the harness editor, chat streaming UX, and the sign-in/sign-up flow. Plus a small dev-only CORS convenience. All eight fixes were investigated against the original reports, implemented, and type-checked; the new staging test suite (79 Convex + 166 web + 74 FastAPI = 319 tests) still passes cleanly on top of this branch.


Per-bug breakdown

Bug #6Save Draft created harnesses with no model

Cause: handleSaveDraft silently fell back to "gpt-4o" when no model was selected and didn't validate name.
Fix: apps/web/src/routes/onboarding.tsx — validates name.trim() + model before mutating; Save Draft is disabled with a tooltip until both are set. Silent fallback removed.

Bug #7 — Onboarding Start chatting silently produced a draft

Cause: onSuccess always navigated to the same place regardless of status passed to the mutation.
Fix: same file — onSuccess branches on variables.status: "draft"/harnesses with a toast, otherwise → /chat.

Bug #8 — tiger_junction MCP servers showed connected before netid was linked

Cause: /mcp_health/check returned ok if the MCP endpoint was reachable, ignoring Princeton netid requirement. UI had no variant for "reachable but needs user action".
Fix (backend): packages/fastapi/app/routes/mcp_health.py — for tiger_junction servers, early-return auth_required whenever the resolved UserContext has no princeton_netid. Applies to both /check and /generate-prompts.
Fix (frontend): apps/web/src/components/mcp-server-status.tsx — new needs_verification status variant with a GraduationCap badge, plus distinct badges per authType (Public / Key / OAuth / Princeton).
Known caveat: the needs_verification badge only renders a tooltip — no click-through to the Princeton link flow (PrincetonConnectRow lives in harness settings). Filed as follow-up.

Bug #9 — Reconnecting an OAuth MCP server left stale status badges in the chat header

Cause: health-check effect in chat/index.tsx only ran on mount; reconnect had no way to re-trigger it.
Fix: apps/web/src/routes/chat/index.tsx — factored health check into runHealthCheck(servers) + refreshHealth() callbacks with a cancellation ref. Threaded onRefreshHealth through ChatHeaderMcpServerStatus; wired onReconnected={() => { onRefreshHealth(); onRefreshCommands(); }}.
Known caveat: only the status-popover reconnect path refreshes header state. Reconnecting from an inline OAuthReconnectPrompt inside a failed tool-call result still won't refresh badges. Left for a follow-up.

Bug #14 — Harness status was read-only in the editor

Cause: the UI had only a static status badge even though the Convex schema supports draft | started | stopped.
Fix: apps/web/src/routes/harnesses/\$harnessId.tsx — replaced the label with a <Select> (colored dot per state). Included in hasChanges and handleSave. Adds a hint "Saving will promote this harness out of draft" when transitioning from draft.
Known caveat: no client-side guard against promoting a malformed draft; server-side validation in harnesses:update should reject invalid promotions (unverified here).

Bug #17 — Interrupted assistant messages vanished on some failure paths

Cause: only the browser-side saveInterruptedMessage mutation existed. Backend-side failures (HTTP errors, max-tool-iterations) produced nothing in Convex, so users saw the bubble disappear.
Fix (schema): packages/convex-backend/convex/schema.ts + convex/messages.ts — added interruptionReason: v.optional(v.string()) to the messages table; saveAssistantMessage internalMutation accepts interrupted / interruptionReason args.
Fix (backend): packages/fastapi/app/routes/chat.py — new _save_interrupted() helper called from all three exception handlers (HTTPStatusError, HTTPError, generic) and the post-loop max-iterations path. It also records budget usage if captured. packages/fastapi/app/services/convex.pysave_assistant_message() gained interrupted + interruption_reason kwargs.
Known caveat: the frontend-path messages:saveInterruptedMessage does not yet accept interruptionReason. The two paths will diverge if a UI feature displays the reason later.

Bug #22 — First-response spinner felt broken; nothing happened until tokens arrived

Cause: streaming bubble mounted only once streamingContent had text, and the fallback was a static spinner.
Fix (new component): apps/web/src/components/pending-response-indicator.tsxRoseCurveSpinner + rotating flavor text (16 messages, 2.6s cadence, random starting index, honors prefers-reduced-motion).
Fix (callers): apps/web/src/routes/chat/index.tsx and apps/web/src/routes/workspaces/index.tsxshowStreamingBubble gate widened to (isStreaming || …) so the bubble mounts immediately; inline RoseCurveSpinner size={14} replaced with <PendingResponseIndicator />.

Bug #23 — Landing-page Get Started CTAs routed to sign-in

Cause: no /sign-up route existed; all CTAs linked to /sign-in; Clerk's footer "Sign up" link fell back to its hosted page.
Fix: new file apps/web/src/routes/sign-up.tsx mirroring sign-in.tsx with Clerk's <SignUp> and signInUrl="/sign-in". apps/web/src/routes/index.tsx — 4 Link to="/sign-in"/sign-up. apps/web/src/routes/sign-in.tsx — added signUpUrl="/sign-up" on <SignIn> so the widget's footer link routes locally. apps/web/src/routeTree.gen.ts regenerated by the TanStack router plugin.

Dev-only change — CORS allowlist

packages/fastapi/app/main.py — expanded local dev origins to cover localhost and 127.0.0.1 on ports 3000–3020, so running the web dev server on an alternate port (e.g. 3004) during bug verification no longer hits CORS preflight failures. Production and hosted origins are unchanged.


Test plan

  • / landing → click any Get Started CTA → lands on /sign-up with Clerk Create-account form.
  • /sign-up → "Sign in" footer link → lands on /sign-in. Round-trip works.
  • /onboardingSave Draft disabled with tooltip until name + model are set. After saving, land on /harnesses with a confirmation toast; harness is persisted as draft with the chosen model (no silent gpt-4o).
  • /onboardingStart chatting with valid inputs → lands on /chat, harness is started.
  • /harnesses/:id → status dropdown visible with draft/started/stopped options; changing draft → started shows the promotion hint; save persists.
  • /chat with tiger_junction MCP server and no Princeton netid linked → MCP status popover shows the graduation-cap "Needs verification" badge, not Connected.
  • /chat with OAuth MCP server → open MCP status popover → click Reconnect → after Clerk OAuth flow closes, header badges re-run the health check without a full reload.
  • /chat and /workspaces → send a prompt → rose-curve spinner + rotating italic flavor text appears immediately; first tokens replace it. With macOS Reduce Motion on, text stays on one phrase.
  • Interrupt a streaming response (close tab mid-stream or kill backend briefly) → reload the conversation → partial assistant message persists with interrupted: true and a populated interruptionReason in Convex.
  • All three new test suites still pass (apps/web, packages/convex-backend, packages/fastapi). Verified locally: 166 + 79 + 74 = 319 green.

Known follow-ups (intentionally out of scope)

  1. needs_verification badge should link into the Princeton connect flow (Bug Full application build: landing page, auth, chat with MCP tool use, harness management, and OAuth #8).
  2. Inline OAuthReconnectPrompt in tool-call results should also call onRefreshHealth (Bug Pin Clerk issuer to prevent JWKS endpoint spoofing #9).
  3. messages:saveInterruptedMessage (frontend mutation) should accept interruptionReason for parity with the backend path (Bug Feat/filesystems sandboxes #17).
  4. Server-side validation in harnesses:update should reject draft → started promotions that don't satisfy full-harness invariants (Bug Add Claude Code GitHub Workflow #14).
  5. Regression tests for the new code paths are not included in this PR — the staging test suite exercises the helpers I changed (save_assistant_message, parseAuthRequiredError, etc.) but does not yet assert the new interrupted/interruptionReason branches or the needs_verification UI state. Lines 78/80 of app/services/convex.py show up as uncovered in the FastAPI coverage report.

- #6, #7: onboarding Save Draft validates name + model, status-branched navigation
- #8: MCP tiger_junction servers show Needs Verification before netid link
- #9: chat header MCP badges refresh after OAuth reconnect
- #14: harness detail page exposes a draft/started/stopped status Select
- #17: interrupted assistant messages persist with interruptionReason across all error paths
- #22: first-response spinner uses RoseCurveSpinner with rotating flavor text
- #23: new /sign-up route, landing CTAs target it, Clerk cross-links work
- dev-only: CORS allowlist covers localhost and 127.0.0.1 ports 3000-3020
@DIodide
Copy link
Copy Markdown
Owner Author

DIodide commented Apr 21, 2026

@claude review this

@claude
Copy link
Copy Markdown

claude Bot commented Apr 21, 2026

Claude encountered an error —— View job


I'll analyze this and get back to you.

Ports the refreshHealth callback plumbing from chat/index.tsx so the
McpServerStatus OAuth reconnect button re-runs the health check after a
reconnect — previously the status would stay stale until a harness swap.

Also adds pytest coverage for the interrupted / interruptionReason
branches in save_assistant_message that were added in this PR but
previously untested.
@DIodide DIodide merged commit d71b9a6 into staging Apr 21, 2026
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant