Skip to content

Stabilize native responses history and add cleaned Exposure Studio#67

Merged
skullcrushercmd merged 4 commits intomainfrom
feat/native-history-and-exposure-cleanup
Apr 5, 2026
Merged

Stabilize native responses history and add cleaned Exposure Studio#67
skullcrushercmd merged 4 commits intomainfrom
feat/native-history-and-exposure-cleanup

Conversation

@skullcrushercmd
Copy link
Copy Markdown
Contributor

@skullcrushercmd skullcrushercmd commented Apr 5, 2026

  • Analyze all issues to fix
  • Fix CodeQL CWE-916: replace createHash('sha256') with HMAC for owner scope derivation in nativeProviders.ts
  • Fix XSS: add escapeHtml/safeHref helpers and apply throughout app.js in exposure-studio
  • Fix Rust extract_html_title: use byte-level ASCII lowercasing instead of to_lowercase() to avoid UTF-8 index mismatches
  • Fix Rust scan loop: continue to next candidate URL on non-2xx responses, only break on 2xx success

Copilot AI review requested due to automatic review settings April 5, 2026 05:31
finally:
try:
lock_path.unlink(missing_ok=True)
except Exception:

import argparse
import json
import os

import argparse
import json
import os

import argparse
import json
import re
Comment on lines +23 to +37
from lane_shared import (
acquire_lane_lock,
SIGNAL_DEFINITIONS,
SignalDefinition,
env_or_default,
extract_signals,
iso_now,
load_last_good_lane,
read_json_file,
resolve_suggested_paths,
save_last_good_lane,
sanitize_text,
split_sentences,
write_json_file,
)
try {
return `key:${crypto
.createHash('sha256')
.update(normalizedApiKey)

Check failure

Code scanning / CodeQL

Use of password hash with insufficient computational effort High

Password from
an access to apiKey
is hashed insecurely.
Password from
an access to normalizedApiKey
is hashed insecurely.

Copilot Autofix

AI about 1 month ago

General approach: Ensure that when a sensitive secret (like an API key) is transformed into a stored/long‑lived identifier, we use a key derivation / password hashing function that is intentionally slow and parameterizable (e.g., crypto.scryptSync or crypto.pbkdf2Sync) rather than a fast hash or HMAC. This both aligns with CodeQL’s expectations and increases resistance to brute‑force attacks if these identifiers and the derivation parameters are ever exposed.

Best concrete fix here:

  • Replace the use of crypto.createHmac('sha256', 'anygpt-native-owner-scope') with a call to a password‑hashing / KDF function such as crypto.scryptSync, which is built into Node and requires no new dependencies.
  • Keep behavior as close as possible: we still want a deterministic, pseudonymous, truncated hex string derived from the API key, using the same static string as the salt so that existing scopes remain stable.
  • Implementation details:
    • Use crypto.scryptSync(normalizedApiKey, 'anygpt-native-owner-scope', 32) to derive 32 bytes.
    • Convert to hex and slice to 24 chars as before.
    • Wrap this in the existing try/catch block; on errors, retain the current fallback of normalizedApiKey.slice(0, 8).

Required code changes (all in apps/api/routes/nativeProviders.ts):

  • In buildNativeResponsesOwnerScope, lines 300–304, replace the HMAC-based derivation with crypto.scryptSync as described above.
  • No new imports are needed because we already import crypto from 'node:crypto'.

Suggested changeset 1
apps/api/routes/nativeProviders.ts

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/apps/api/routes/nativeProviders.ts b/apps/api/routes/nativeProviders.ts
--- a/apps/api/routes/nativeProviders.ts
+++ b/apps/api/routes/nativeProviders.ts
@@ -297,11 +297,12 @@
 			: '';
 	if (!normalizedApiKey) return undefined;
 	try {
-		return `key:${crypto
-			.createHmac('sha256', 'anygpt-native-owner-scope')
-			.update(normalizedApiKey)
-			.digest('hex')
-			.slice(0, 24)}`;
+		const derivedKey = crypto.scryptSync(
+			normalizedApiKey,
+			'anygpt-native-owner-scope',
+			32
+		);
+		return `key:${derivedKey.toString('hex').slice(0, 24)}`;
 	} catch {
 		return `key:${normalizedApiKey.slice(0, 8)}`;
 	}
EOF
@@ -297,11 +297,12 @@
: '';
if (!normalizedApiKey) return undefined;
try {
return `key:${crypto
.createHmac('sha256', 'anygpt-native-owner-scope')
.update(normalizedApiKey)
.digest('hex')
.slice(0, 24)}`;
const derivedKey = crypto.scryptSync(
normalizedApiKey,
'anygpt-native-owner-scope',
32
);
return `key:${derivedKey.toString('hex').slice(0, 24)}`;
} catch {
return `key:${normalizedApiKey.slice(0, 8)}`;
}
Copilot is powered by AI and may make mistakes. Always verify output.
Comment on lines +1046 to +1050
error: buildSafeUpstreamErrorMessage(statusCode, {
label: `${family} native request`,
rateLimitMessage:
'Rate limit or quota exceeded at the upstream provider. Please retry later.'
}),
request.method === 'GET' &&
(subpath === '/models' || subpath === '/v1/models')
) {
const payload = await buildModelsPayload(request);
(subpath === '/models' || subpath === '/v1/models')
) {
const payload = await buildModelsPayload(request);
applyResponseHeaders(response, payload.headers);
completionTokens: usage.completionTokens
});
}
let responsePayloadForClient = parsedResponse;
}
}

function estimateTokens(content: unknown): number {
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0f83366301

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1666 to +1669
responsePayloadForClient = rewriteNativeResponsesResponseObject(
parsedResponse,
nativeResponsesHistoryContext.proxyResponseId
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Map proxy IDs before forwarding /responses/{id} requests

This code rewrites successful native /responses objects to a proxy-owned id, but only previous_response_id in request bodies is translated later; path-based operations never get remapped. In practice, a client that calls responses.retrieve(created.id) (or cancel/input-items endpoints using that returned id) will send the proxy id upstream and receive a not-found error because the upstream only knows its original id. Resolve stored upstream_response_id for path-based response-id routes before forwarding.

Useful? React with 👍 / 👎.

Comment thread apps/api/server.ts
} else {
console.log(' 𐄂 OpenRouter compatible routes disabled.');
}
app.use('/native', nativeProviderRouter);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Respect ENABLE_ route flags for native passthrough*

The native router is mounted unconditionally, so disabling a family with ENABLE_OPENAI_ROUTES=false (or similar flags) no longer prevents traffic to that family because /native/<family>/... remains available. This bypasses the documented per-router enable/disable controls and can unexpectedly expose providers operators intended to turn off. Gate native family handling behind the same route flags (or an equivalent native-specific toggle).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new “Exposure Studio” (SurfaceScope) demo app and introduces a native provider passthrough router with stabilized /responses history handling, while isolating API test harness data under .tmp/test-data.

Changes:

  • Added Exposure Studio (Rust + static frontend) plus Python research lane runners and shared lane utilities.
  • Introduced /native/:family/* passthrough router with proxy-owned Responses API ids, deterministic previous_response_id handling, and provider provenance persisted in history entries.
  • Updated API test harness/scripts to write isolated fixture files under apps/api/.tmp/test-data and added SDK/native responses regression coverage.

Reviewed changes

Copilot reviewed 32 out of 35 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
apps/exposure-studio/src/models.rs Defines JSON-serializable SurfaceScope domain and lane output models.
apps/exposure-studio/src/main.rs Implements Axum backend, JSON loading, enrichment logic, and approved-resource scanning endpoints.
apps/exposure-studio/run.sh Adds a unified runner for Rust app + Python lane modes with venv bootstrapping.
apps/exposure-studio/README.md Documents SurfaceScope purpose, safe-use guardrails, and lane workflows.
apps/exposure-studio/PRODUCT_SCOPE.md Captures product positioning, guardrails, and bounded improvement roadmap.
apps/exposure-studio/frontend/styles.css Adds frontend styling for the SurfaceScope dashboard UI.
apps/exposure-studio/frontend/index.html Adds the SurfaceScope dashboard layout and panels.
apps/exposure-studio/frontend/app.js Implements frontend data fetching and rendering for dashboard panels.
apps/exposure-studio/data/assets.json Adds demo asset inventory fixtures (authorized assets + approved resources).
apps/exposure-studio/data/findings.json Adds demo finding fixtures with workflow metadata (visibility, report status, grace).
apps/exposure-studio/data/compliance_sources.json Adds demo compliance/reference fixtures.
apps/exposure-studio/data/testing_methodologies.json Adds demo testing methodology fixtures.
apps/exposure-studio/data/vulnerability_intelligence.json Adds demo vulnerability intelligence fixtures.
apps/exposure-studio/data/research_web_targets.json Adds website lane seed targets for bounded crawling.
apps/exposure-studio/Cargo.toml Introduces the new surfacescope Rust crate and dependencies.
apps/exposure-studio/backend/youtube_research_runner.py Adds transcript-first YouTube lane runner with retries, caching, and “doctor” mode.
apps/exposure-studio/backend/website_research_runner.py Adds bounded same-origin website crawl lane runner with “doctor” mode.
apps/exposure-studio/backend/implementation_lane_runner.py Adds unified lane merger + health status computation + “doctor” aggregation.
apps/exposure-studio/backend/lane_shared.py Adds shared signal definitions, extraction helpers, and lane lock/snapshot utilities.
apps/exposure-studio/backend/requirements-youtube-research.txt Pins Python deps for YouTube transcription lane.
apps/exposure-studio/.lane-secrets.env.example Provides example env vars for cookies/proxy/retry tuning.
apps/exposure-studio/.gitignore Ignores venv, generated lane outputs, locks/snapshots, and cookie artifacts.
apps/api/server.ts Mounts the new native passthrough router under /native.
apps/api/routes/nativeProviders.ts Implements /native passthrough routing, failover/cooldowns, and responses-history rewriting/persistence.
apps/api/modules/responsesHistory.ts Extends stored history entries with owner scope and provider provenance fields.
apps/api/modules/providerIdentity.ts Adds provider family resolution utilities used by native routing.
apps/api/package.json Moves test harness providers/keys/models files under .tmp/test-data.
apps/api/dev/testWs.ts Reuses a shared default test API key constant.
apps/api/dev/testSetup.ts Writes isolated test fixtures under apps/api/.tmp/test-data and centralizes default test API key.
apps/api/dev/testOpenAISdk.ts Adds native /responses regression coverage (proxy ids + previous_response_id behavior).
apps/api/dev/testApi.ts Uses shared default test API key for API tests.
apps/api/dev/runTestServer.sh Boots test server with .tmp/test-data fixture paths and allows mock providers during test boot.
.gitignore Adds ignores for additional apps/api/.test-data/* fixture files (legacy path).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +306 to +314
fn extract_html_title(body: &str) -> String {
let lower = body.to_lowercase();
let open = lower.find("<title>");
let close = lower.find("</title>");
match (open, close) {
(Some(start), Some(end)) if end > start + 7 => body[start + 7..end].trim().to_string(),
_ => String::new(),
}
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract_html_title lowercases the entire body and then uses the byte indices from the lowercased string to slice the original body. For non-ASCII input, to_lowercase() can change UTF-8 byte length, which can make the slice indices invalid and panic at runtime. Consider searching in a byte-wise case-insensitive way (ASCII only) or locating the <title> bounds on the original string without reusing indices from a transformed copy.

Copilot uses AI. Check for mistakes.
Comment on lines +390 to +417
for url in candidate_urls_for_resource(resource) {
match client.get(url.clone()).send().await {
Ok(response) => {
final_url = response.url().to_string();
http_status = Some(response.status().as_u16());
server_header = response
.headers()
.get(header::SERVER)
.and_then(|value| value.to_str().ok())
.unwrap_or_default()
.to_string();
let body = response.text().await.unwrap_or_default();
page_title = extract_html_title(&body);
reachability = if http_status == Some(200) {
"reachable".to_string()
} else {
"reachable-with-error".to_string()
};
if page_title.is_empty() {
notes.push("HTTP response did not include an HTML title tag.".to_string());
}
break;
}
Err(error) => {
notes.push(format!("HTTP probe failed for {url}: {error}"));
}
}
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scan loop stops on the first HTTP response even when it is a non-200 (e.g. HTTPS returns 404), so protocol fallbacks from candidate_urls_for_resource() (https→http) may never be attempted. If the intent is to find a reachable origin, consider continuing to the next candidate when the status is not successful (or at least not 2xx), while still recording the final attempt details when all candidates fail.

Copilot uses AI. Check for mistakes.
Comment on lines +29 to +46
function renderAssets(assets) {
return `
<ul class="list">
${assets.map((asset) => `
<li class="card">
<div class="card-row">
<strong>${asset.name}</strong>
<span class="pill pill-${asset.authorization_state}">${asset.authorization_state}</span>
</div>
<p>${asset.kind} · ${asset.owner}</p>
${asset.resource_scope ? `<p class="detail"><strong>Scope:</strong> ${asset.resource_scope}</p>` : ''}
${asset.authorization_basis ? `<p class="detail"><strong>Authorization basis:</strong> ${asset.authorization_basis}</p>` : ''}
${asset.authorization_reference ? `<p class="detail"><strong>Authorization proof:</strong> ${asset.authorization_reference}</p>` : ''}
<p class="muted">${asset.notes}</p>
</li>
`).join('')}
</ul>
`;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The frontend renders API-provided strings via template literals into innerHTML without escaping (e.g. asset.name, asset.notes, etc.). If any of this data can be influenced by users/operators, this is an XSS risk. Prefer creating DOM nodes and setting textContent, or escape interpolated values before inserting HTML.

Copilot uses AI. Check for mistakes.
@skullcrushercmd
Copy link
Copy Markdown
Contributor Author

@copilot automatically fix issues codeql found with pr

- nativeProviders.ts: replace createHash('sha256') with createHmac for
  owner scope ID derivation (fixes CodeQL CWE-916 insufficient hash effort)
- app.js: add escapeHtml()/safeHref() helpers and apply throughout all
  render functions to prevent XSS via innerHTML template interpolation
- main.rs: fix extract_html_title to use byte-level ASCII lowercasing
  (avoids UTF-8 index mismatches from to_lowercase() on non-ASCII input)
- main.rs: fix scan_resource loop to continue on non-2xx responses,
  only break on successful 2xx (allows http fallback after https 404)

Agent-Logs-Url: https://github.com/AnyVM-Tech/AnyGPT/sessions/212a346a-4054-4a08-9865-7cb1c6b439f2

Co-authored-by: skullcrushercmd <93234024+skullcrushercmd@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 5, 2026

@copilot automatically fix issues codeql found with pr

Fixed all CodeQL and reviewer-flagged issues in commit 248f840:

  1. CodeQL CWE-916 (nativeProviders.ts:302): Replaced crypto.createHash('sha256') with crypto.createHmac('sha256', 'anygpt-native-owner-scope') — HMAC is a keyed MAC for deterministic scope ID derivation, not a bare password hash, which resolves the "insufficient computational effort" alert.

  2. XSS (app.js): Added escapeHtml() (escapes &, <, >, ", ') and safeHref() (validates http:/https: protocol) helpers, applied throughout all render functions that interpolate API-provided strings into innerHTML.

  3. Rust UTF-8 index safety (main.rs:306): Fixed extract_html_title to use byte-level ASCII lowercasing (bytes().map(|b| b.to_ascii_lowercase())) instead of to_lowercase(), so byte indices stay valid for non-ASCII input (avoids the panic-at-runtime risk).

  4. Rust scan loop (main.rs:390): The loop now only breaks on a successful (2xx) HTTP response; non-2xx responses update the recorded state and continue to the next candidate URL, allowing protocol fallbacks (e.g. http: after an https: 404) to be attempted.

@skullcrushercmd skullcrushercmd merged commit 3a6d4c3 into main Apr 5, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants