Skip to content

DRAFT: Wire non-interactive registry auth for serve mode#4111

Open
ChrisJBurns wants to merge 9 commits intomainfrom
feat/registry-auth-serve-mode
Open

DRAFT: Wire non-interactive registry auth for serve mode#4111
ChrisJBurns wants to merge 9 commits intomainfrom
feat/registry-auth-serve-mode

Conversation

@ChrisJBurns
Copy link
Collaborator

Summary

  • Why: When thv serve runs with OAuth-authenticated registries, it must never open a browser for OAuth flows — it's a headless API server. Desktop clients (ToolHive Studio) also need machine-readable auth status in API responses to show the correct UI state, and structured error responses when auth is missing so they can prompt the user to run thv registry login.

  • What: This is Phase 2A of RFC-0043 (Registry Authentication). It adds:

    • Non-interactive registry provider for serve mode (WithInteractive(false))
    • auth_status and auth_type fields in registry API responses
    • Structured JSON 503 errors with code: "registry_auth_required" when tokens are missing
    • Typed RegistryHTTPError with Unwrap() for 401/403 detection from registry servers
    • Actionable error messages when unauthenticated registries return 401

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)

Changes

File Change
pkg/registry/api/client.go Add ErrRegistryUnauthorized, RegistryHTTPError type with Unwrap() for 401/403. Replace generic fmt.Errorf with typed errors in GetServer, fetchServersPage, SearchServers.
pkg/registry/factory.go Add GetNonInteractiveProviderWithConfig() for serve mode — creates provider with WithInteractive(false).
pkg/registry/provider_api.go Wrap 401/403 from API as ErrRegistryAuthRequired in GetRegistry(). Produce actionable 401 message in validation probe.
pkg/api/v1/registry.go Add auth_status/auth_type fields to response structs. Add resolveAuthStatus(), writeRegistryAuthRequiredError(), isRegistryAuthError(). Add serve-mode support to RegistryRoutes. Catch auth errors in listRegistries, getRegistry, listServers.
pkg/api/server.go Pass serveMode: true to RegistryRouter().
pkg/registry/api/client_test.go Tests for RegistryHTTPError formatting, Unwrap(), and errors.Is behavior.
pkg/api/v1/registry_auth_test.go Tests for resolveAuthStatus, isRegistryAuthError, and writeRegistryAuthRequiredError.

Does this introduce a user-facing change?

Yes. Registry API responses now include auth_status and auth_type fields when auth is configured. When thv serve cannot authenticate to a registry (no cached tokens), it returns HTTP 503 with a structured JSON error instead of a generic 500.

Special notes for reviewers

  • The auth_status field cannot detect "expired" from config alone (would require calling the token source). We return "authenticated" when a token ref exists and rely on Studio handling subsequent 401s as a signal to re-login.
  • HTTP 503 (not 401) is intentional per RFC-0043: Studio is authenticated to the thv serve API — the problem is that thv serve itself lacks a registry credential. 503 correctly signals a server-side dependency issue.
  • Phase 2B (next PR) will add thv registry login/logout CLI commands and a POST /api/v1beta/registry/auth/login endpoint so Studio can trigger the browser OAuth flow without the user needing a terminal.

Generated with Claude Code

@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Mar 11, 2026
@ChrisJBurns ChrisJBurns changed the title Wire non-interactive registry auth for serve mode DRAFT: Wire non-interactive registry auth for serve mode Mar 11, 2026
@codecov
Copy link

codecov bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 13.63636% with 133 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.70%. Comparing base (ba38a12) to head (cf7770e).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/api/v1/registry.go 10.83% 103 Missing and 4 partials ⚠️
pkg/registry/api/client.go 0.00% 13 Missing ⚠️
pkg/registry/provider_api.go 0.00% 11 Missing ⚠️
pkg/api/server.go 0.00% 1 Missing ⚠️
pkg/registry/factory.go 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4111      +/-   ##
==========================================
- Coverage   68.88%   68.70%   -0.18%     
==========================================
  Files         461      464       +3     
  Lines       46562    46873     +311     
==========================================
+ Hits        32075    32205     +130     
- Misses      11987    12130     +143     
- Partials     2500     2538      +38     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed size/S Small PR: 100-299 lines changed and removed size/M Medium PR: 300-599 lines changed size/S Small PR: 100-299 lines changed labels Mar 11, 2026
@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed size/L Large PR: 600-999 lines changed and removed size/S Small PR: 100-299 lines changed size/M Medium PR: 300-599 lines changed labels Mar 11, 2026
Copy link
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review from Claude Code. 14 findings total (3 HIGH, 6 MEDIUM, 5 LOW/INFO). 4 HIGH/MEDIUM findings are pre-existing from PR #3908 and should be tracked separately. 6 inline comments below on code introduced in this PR.

Pre-existing issues to track separately (from PR #3908):

  • HIGH: Refresh token rotation not persisted on cache restore path (oauth_token_source.go:94)
  • HIGH: Data race between ResetDefaultProvider and GetDefaultProviderWithConfig (factory.go)
  • MEDIUM: updateConfigTokenRef bypasses injected config provider (oauth_token_source.go:213)
  • MEDIUM: Panic on empty baseURL in api.NewClient (client.go:111)

MEDIUM (not inline — outside diff hunk): resolveTokenSource in factory.go:144 passes cfg.RegistryApiUrl directly, but auth.Login uses registryURLFromConfig(cfg) which falls back to cfg.RegistryUrl. Since the URL feeds DeriveSecretKey, a mismatch means Login could store the token under a different key than factory looks up.

🤖 Generated with Claude Code

}
hash := sha256.Sum256([]byte(registryURL))
cacheFile, err := xdg.CacheFile(fmt.Sprintf("toolhive/cache/registry-%x.json", hash[:4]))
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ARCHITECTURE] Duplicated cache path derivation (HIGH)

This sha256 → hash[:4] → "toolhive/cache/registry-%x.json" formula is duplicated from pkg/registry/provider_cached.go:74-75, where it uses the persistentCacheSubdir constant. Here it's hardcoded as "cache".

If the cache path formula ever changes in provider_cached.go (different hash length, different subdir), this logout code will silently fail to clear the correct cache file.

Suggestion: Extract a shared RegistryCachePath(registryURL string) (string, error) function in pkg/registry/ and call it from both places. Or add a ClearCache(url string) error function to the registry package.

// - RegistryAuth.Type == "" → ("none", "") — no auth configured
// - Type == "oauth", no CachedRefreshTokenRef → ("configured", "oauth")
// - Type == "oauth", CachedRefreshTokenRef present → ("authenticated", "oauth")
func resolveAuthStatus(cfg *config.Config) (authStatus, authType string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ARCHITECTURE] resolveAuthStatus overlaps with AuthManager.GetAuthInfo() (MEDIUM)

This function reimplements the same config inspection as AuthManager.GetAuthInfo() in pkg/registry/auth_manager.go:81-91 (checks CachedRefreshTokenRef), but returns three-state strings vs boolean. GetAuthInfo() currently has zero production callers.

Per ToolHive's thin-wrapper principle, consider extending AuthManager with a GetAuthStatus() (status, authType string) method so the API layer just calls through.

if cfg.RegistryAuth.OAuth == nil ||
cfg.RegistryAuth.OAuth.CachedRefreshTokenRef == "" {
return authStatusConfigured, "oauth"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[DESIGN QUESTION] Browser-open from server process (MEDIUM)

This endpoint calls auth.Login(...) with interactive=true, triggering performOAuthFlow which opens the user's default browser and starts a local callback listener.

Is this intentional for desktop-only serve mode (Studio running thv serve locally)? In headless/remote/container environments this will fail or hang.

Would it be better to return a redirect URL in the JSON response that the client (Studio) opens itself, rather than having the server process open the browser? That would also work behind reverse proxies where localhost:PORT/callback isn't reachable from the browser.

@@ -0,0 +1,144 @@
# Testing Registry Auth Serve Mode (PR 2A)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONVENTIONS] Test plan at repository root (LOW)

This manual test plan references a different branch (feat/registry-auth-serve-mode), mentions "PR 2B, not yet implemented" for login (which is now implemented), and includes an internal registry URL.

Consider removing from committed code and including in the PR description instead, or moving to docs/testing/ if it should persist.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was just me asking Claude to output an .md file so i can run through things manually locally. This was only committed so I didn't lose it across branches. This won't stay for long after I've proved everything locally.

http.Error(w, "Failed to create secrets provider", http.StatusInternalServerError)
return
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CONVENTIONS] Missing omitempty on auth_status/auth_type (INFO)

The AuthStatus/AuthType fields on registryInfo and getRegistryResponse (lines 768-771, 798-801) serialize auth_type as "" when no auth is configured. This is inconsistent with other optional fields like Audience in UpdateRegistryAuthRequest which use omitempty.

Is this intentional for easier client parsing (always present)? If so, a brief comment would clarify.

// Default to openid + offline_access if no scopes provided.
// offline_access is required to receive a refresh token from the provider.
if len(scopes) == 0 {
scopes = []string{"openid", "offline_access"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[OAUTH] Default scopes defined in three locations (LOW)

The default ["openid", "offline_access"] scopes are defined here, in cmd/thv/app/config_registryauth.go (flag default), and a different set ["openid", "profile", "email"] in pkg/auth/oauth/oidc.go:227.

Consider extracting a single DefaultRegistryScopes constant to avoid divergence.

ChrisJBurns and others added 4 commits March 13, 2026 22:49
Add serve-mode support for registry authentication (RFC-0043 Phase 2A):

- Add non-interactive registry provider for thv serve so browser-based
  OAuth flows are never triggered from API server context
- Add auth_status and auth_type fields to registry API responses so
  desktop clients can display authentication state
- Return structured JSON 503 errors when registry auth is required but
  no cached tokens are available (instead of generic 500)
- Add typed RegistryHTTPError with Unwrap support for 401/403 detection
- Produce actionable error messages when unauthenticated registries
  return 401 during validation probe

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Chris Burns <29541485+ChrisJBurns@users.noreply.github.com>
Fix 6 issues identified during code review:
- Remove omitempty from auth_status/auth_type fields for consistent API
- Add debug logging for config load errors instead of silently discarding
- Merge GetNonInteractiveProviderWithConfig into GetDefaultProviderWithConfig
  with variadic ProviderOption to prevent singleton race
- Split error handling for ErrRegistryAuthRequired vs ErrRegistryUnauthorized
  to preserve diagnostic context
- Limit RegistryHTTPError body capture to 1024 bytes via io.LimitReader
- Extract RegistryAuthRequiredCode constant for magic string

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The io.LimitReader fix was applied to GetServer and SearchServers but
missed in fetchServersPage. Apply the same maxErrorBodySize limit for
consistency and to prevent excessive memory usage from large error
responses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ChrisJBurns and others added 4 commits March 13, 2026 22:49
Tests for resolveAuthStatus, isRegistryAuthError,
writeRegistryAuthRequiredError, and RegistryHTTPError removed per
request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The validation probe in NewAPIRegistryProvider detected 401 responses
but did not wrap the error with ErrRegistryAuthRequired. This caused
the API layer to return a generic 500 instead of the structured 503
with registry_auth_required code that Studio needs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add thv registry login and thv registry logout CLI commands plus
POST /api/v1beta/registry/auth/login and /logout API endpoints for
Studio to trigger OAuth flows in serve mode.

Business logic lives in pkg/registry/auth/login.go with Login() and
Logout() functions that reuse the existing oauthTokenSource
infrastructure. CLI commands are thin wrappers. API endpoints are
gated to serve mode only and reset the singleton provider after
auth state changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add optional auth config (issuer, client_id, audience, scopes) to
  PUT /api/v1beta/registry/default so Studio can configure registry
  auth in one call
- Default scopes to openid + offline_access when none provided
- Always include offline_access in OAuth flow scopes to ensure the
  provider returns a refresh token for persistence
- Add test instructions file for manual verification

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ChrisJBurns ChrisJBurns force-pushed the feat/registry-auth-serve-mode branch from f8aebd1 to 4713fe7 Compare March 13, 2026 22:56
@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/L Large PR: 600-999 lines changed labels Mar 13, 2026
Move resolveAuthStatus logic from the API layer into
AuthManager.GetAuthStatus() to eliminate duplication with GetAuthInfo().
Add comments explaining desktop-only serve mode for the login endpoint
and the intentional omission of omitempty on auth_status/auth_type fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants