DRAFT: Wire non-interactive registry auth for serve mode#4111
DRAFT: Wire non-interactive registry auth for serve mode#4111ChrisJBurns wants to merge 9 commits intomainfrom
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4111 +/- ##
==========================================
- Coverage 68.88% 68.70% -0.18%
==========================================
Files 461 464 +3
Lines 46562 46873 +311
==========================================
+ Hits 32075 32205 +130
- Misses 11987 12130 +143
- Partials 2500 2538 +38 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
jhrozek
left a comment
There was a problem hiding this comment.
Automated review from Claude Code. 14 findings total (3 HIGH, 6 MEDIUM, 5 LOW/INFO). 4 HIGH/MEDIUM findings are pre-existing from PR #3908 and should be tracked separately. 6 inline comments below on code introduced in this PR.
Pre-existing issues to track separately (from PR #3908):
- HIGH: Refresh token rotation not persisted on cache restore path (
oauth_token_source.go:94) - HIGH: Data race between
ResetDefaultProviderandGetDefaultProviderWithConfig(factory.go) - MEDIUM:
updateConfigTokenRefbypasses injected config provider (oauth_token_source.go:213) - MEDIUM: Panic on empty
baseURLinapi.NewClient(client.go:111)
MEDIUM (not inline — outside diff hunk): resolveTokenSource in factory.go:144 passes cfg.RegistryApiUrl directly, but auth.Login uses registryURLFromConfig(cfg) which falls back to cfg.RegistryUrl. Since the URL feeds DeriveSecretKey, a mismatch means Login could store the token under a different key than factory looks up.
🤖 Generated with Claude Code
pkg/registry/auth/login.go
Outdated
| } | ||
| hash := sha256.Sum256([]byte(registryURL)) | ||
| cacheFile, err := xdg.CacheFile(fmt.Sprintf("toolhive/cache/registry-%x.json", hash[:4])) | ||
| if err != nil { |
There was a problem hiding this comment.
[ARCHITECTURE] Duplicated cache path derivation (HIGH)
This sha256 → hash[:4] → "toolhive/cache/registry-%x.json" formula is duplicated from pkg/registry/provider_cached.go:74-75, where it uses the persistentCacheSubdir constant. Here it's hardcoded as "cache".
If the cache path formula ever changes in provider_cached.go (different hash length, different subdir), this logout code will silently fail to clear the correct cache file.
Suggestion: Extract a shared RegistryCachePath(registryURL string) (string, error) function in pkg/registry/ and call it from both places. Or add a ClearCache(url string) error function to the registry package.
pkg/api/v1/registry.go
Outdated
| // - RegistryAuth.Type == "" → ("none", "") — no auth configured | ||
| // - Type == "oauth", no CachedRefreshTokenRef → ("configured", "oauth") | ||
| // - Type == "oauth", CachedRefreshTokenRef present → ("authenticated", "oauth") | ||
| func resolveAuthStatus(cfg *config.Config) (authStatus, authType string) { |
There was a problem hiding this comment.
[ARCHITECTURE] resolveAuthStatus overlaps with AuthManager.GetAuthInfo() (MEDIUM)
This function reimplements the same config inspection as AuthManager.GetAuthInfo() in pkg/registry/auth_manager.go:81-91 (checks CachedRefreshTokenRef), but returns three-state strings vs boolean. GetAuthInfo() currently has zero production callers.
Per ToolHive's thin-wrapper principle, consider extending AuthManager with a GetAuthStatus() (status, authType string) method so the API layer just calls through.
pkg/api/v1/registry.go
Outdated
| if cfg.RegistryAuth.OAuth == nil || | ||
| cfg.RegistryAuth.OAuth.CachedRefreshTokenRef == "" { | ||
| return authStatusConfigured, "oauth" | ||
| } |
There was a problem hiding this comment.
[DESIGN QUESTION] Browser-open from server process (MEDIUM)
This endpoint calls auth.Login(...) with interactive=true, triggering performOAuthFlow which opens the user's default browser and starts a local callback listener.
Is this intentional for desktop-only serve mode (Studio running thv serve locally)? In headless/remote/container environments this will fail or hang.
Would it be better to return a redirect URL in the JSON response that the client (Studio) opens itself, rather than having the server process open the browser? That would also work behind reverse proxies where localhost:PORT/callback isn't reachable from the browser.
| @@ -0,0 +1,144 @@ | |||
| # Testing Registry Auth Serve Mode (PR 2A) | |||
There was a problem hiding this comment.
[CONVENTIONS] Test plan at repository root (LOW)
This manual test plan references a different branch (feat/registry-auth-serve-mode), mentions "PR 2B, not yet implemented" for login (which is now implemented), and includes an internal registry URL.
Consider removing from committed code and including in the PR description instead, or moving to docs/testing/ if it should persist.
There was a problem hiding this comment.
Yeah this was just me asking Claude to output an .md file so i can run through things manually locally. This was only committed so I didn't lose it across branches. This won't stay for long after I've proved everything locally.
| http.Error(w, "Failed to create secrets provider", http.StatusInternalServerError) | ||
| return | ||
| } | ||
|
|
There was a problem hiding this comment.
[CONVENTIONS] Missing omitempty on auth_status/auth_type (INFO)
The AuthStatus/AuthType fields on registryInfo and getRegistryResponse (lines 768-771, 798-801) serialize auth_type as "" when no auth is configured. This is inconsistent with other optional fields like Audience in UpdateRegistryAuthRequest which use omitempty.
Is this intentional for easier client parsing (always present)? If so, a brief comment would clarify.
pkg/registry/auth_manager.go
Outdated
| // Default to openid + offline_access if no scopes provided. | ||
| // offline_access is required to receive a refresh token from the provider. | ||
| if len(scopes) == 0 { | ||
| scopes = []string{"openid", "offline_access"} |
There was a problem hiding this comment.
[OAUTH] Default scopes defined in three locations (LOW)
The default ["openid", "offline_access"] scopes are defined here, in cmd/thv/app/config_registryauth.go (flag default), and a different set ["openid", "profile", "email"] in pkg/auth/oauth/oidc.go:227.
Consider extracting a single DefaultRegistryScopes constant to avoid divergence.
Add serve-mode support for registry authentication (RFC-0043 Phase 2A): - Add non-interactive registry provider for thv serve so browser-based OAuth flows are never triggered from API server context - Add auth_status and auth_type fields to registry API responses so desktop clients can display authentication state - Return structured JSON 503 errors when registry auth is required but no cached tokens are available (instead of generic 500) - Add typed RegistryHTTPError with Unwrap support for 401/403 detection - Produce actionable error messages when unauthenticated registries return 401 during validation probe Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Chris Burns <29541485+ChrisJBurns@users.noreply.github.com>
Fix 6 issues identified during code review: - Remove omitempty from auth_status/auth_type fields for consistent API - Add debug logging for config load errors instead of silently discarding - Merge GetNonInteractiveProviderWithConfig into GetDefaultProviderWithConfig with variadic ProviderOption to prevent singleton race - Split error handling for ErrRegistryAuthRequired vs ErrRegistryUnauthorized to preserve diagnostic context - Limit RegistryHTTPError body capture to 1024 bytes via io.LimitReader - Extract RegistryAuthRequiredCode constant for magic string Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The io.LimitReader fix was applied to GetServer and SearchServers but missed in fetchServersPage. Apply the same maxErrorBodySize limit for consistency and to prevent excessive memory usage from large error responses. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests for resolveAuthStatus, isRegistryAuthError, writeRegistryAuthRequiredError, and RegistryHTTPError removed per request. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The validation probe in NewAPIRegistryProvider detected 401 responses but did not wrap the error with ErrRegistryAuthRequired. This caused the API layer to return a generic 500 instead of the structured 503 with registry_auth_required code that Studio needs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add thv registry login and thv registry logout CLI commands plus POST /api/v1beta/registry/auth/login and /logout API endpoints for Studio to trigger OAuth flows in serve mode. Business logic lives in pkg/registry/auth/login.go with Login() and Logout() functions that reuse the existing oauthTokenSource infrastructure. CLI commands are thin wrappers. API endpoints are gated to serve mode only and reset the singleton provider after auth state changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add optional auth config (issuer, client_id, audience, scopes) to PUT /api/v1beta/registry/default so Studio can configure registry auth in one call - Default scopes to openid + offline_access when none provided - Always include offline_access in OAuth flow scopes to ensure the provider returns a refresh token for persistence - Add test instructions file for manual verification Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f8aebd1 to
4713fe7
Compare
Move resolveAuthStatus logic from the API layer into AuthManager.GetAuthStatus() to eliminate duplication with GetAuthInfo(). Add comments explaining desktop-only serve mode for the login endpoint and the intentional omission of omitempty on auth_status/auth_type fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Why: When
thv serveruns with OAuth-authenticated registries, it must never open a browser for OAuth flows — it's a headless API server. Desktop clients (ToolHive Studio) also need machine-readable auth status in API responses to show the correct UI state, and structured error responses when auth is missing so they can prompt the user to runthv registry login.What: This is Phase 2A of RFC-0043 (Registry Authentication). It adds:
WithInteractive(false))auth_statusandauth_typefields in registry API responsescode: "registry_auth_required"when tokens are missingRegistryHTTPErrorwithUnwrap()for 401/403 detection from registry serversType of change
Test plan
task test)task lint-fix)Changes
pkg/registry/api/client.goErrRegistryUnauthorized,RegistryHTTPErrortype withUnwrap()for 401/403. Replace genericfmt.Errorfwith typed errors inGetServer,fetchServersPage,SearchServers.pkg/registry/factory.goGetNonInteractiveProviderWithConfig()for serve mode — creates provider withWithInteractive(false).pkg/registry/provider_api.goErrRegistryAuthRequiredinGetRegistry(). Produce actionable 401 message in validation probe.pkg/api/v1/registry.goauth_status/auth_typefields to response structs. AddresolveAuthStatus(),writeRegistryAuthRequiredError(),isRegistryAuthError(). Add serve-mode support toRegistryRoutes. Catch auth errors inlistRegistries,getRegistry,listServers.pkg/api/server.goserveMode: truetoRegistryRouter().pkg/registry/api/client_test.goRegistryHTTPErrorformatting,Unwrap(), anderrors.Isbehavior.pkg/api/v1/registry_auth_test.goresolveAuthStatus,isRegistryAuthError, andwriteRegistryAuthRequiredError.Does this introduce a user-facing change?
Yes. Registry API responses now include
auth_statusandauth_typefields when auth is configured. Whenthv servecannot authenticate to a registry (no cached tokens), it returns HTTP 503 with a structured JSON error instead of a generic 500.Special notes for reviewers
auth_statusfield cannot detect "expired" from config alone (would require calling the token source). We return "authenticated" when a token ref exists and rely on Studio handling subsequent 401s as a signal to re-login.thv serveAPI — the problem is thatthv serveitself lacks a registry credential. 503 correctly signals a server-side dependency issue.thv registry login/logoutCLI commands and aPOST /api/v1beta/registry/auth/loginendpoint so Studio can trigger the browser OAuth flow without the user needing a terminal.Generated with Claude Code