Proposes replacing vMCP's declarative composite tools system (DAG + Go templates) with a Starlark scripting engine for multi-step tool workflows. Starlark provides iteration, conditional branching, dynamic dispatch, and data transformation while maintaining sandboxed execution with no arbitrary I/O. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove V2 session incompatibility bullet (temporary concern)
- Remove V1/V2 session goal (not relevant to RFC scope)
- Add parallel execution to Goals section
- Remove migration tooling phase (unnecessary)
- Remove fuzz tests from testing strategy (unnecessary)
- Simplify migration path (3 phases instead of 4)
- Add docs-website to documentation requirements
- Resolve naming question: scripted and composite are interchangeable
- Remove json.encode/decode open question

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Starlark's sandbox makes it feasible for agents to dynamically compose and submit scripts to vMCP at runtime — something a declarative YAML DSL could never support safely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Six practical examples covering structured data manipulation, returning structured data, JSON-as-string parsing from legacy servers, fan-out with parallel, error handling patterns, and elicitation. Includes a callout explaining when to use dict indexing (tool results) vs attribute access (builtin return values). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This RFC is incredible! Seems like I'm the first human reviewer and considering that, honestly I'm impressed by how thorough it already is! I do have some thoughts but honestly, none of them are blockers, I just wanted to give more than a "Looks good to me". I did have to scrape the barrel to think of some things that might be actionable:
Thanks for the thoughtful review @kantord 😄
This is an important thing to call out. I think it can be solved by building an implementation of Iterable. That would look like something within vMCP recognizes "hey, this is a huge array / dict / paginated response. Let's turn it into an
Can you say more about what you're imagining? I'm not that familiar with MCP apps.
Yes, that makes perfect sense. If you want an HTTP tool, users could add a fetch tool.
Yeah, I could see it getting hairy too. If we get to the point where people (or agents) are writing so much code that we think reuse is important, then that's a good problem to have. I like the ideas you have though. Let's wait until this problem needs more attention.
Yes, it could be frustrating, especially with the schemas of the underlying MCP servers potentially changing. Some thoughts here:
I think this is another thing where we have to wait for the problem to arise to know what solution is justified.
aponcedeleonch left a comment
Great RFC — the Starlark choice is well-justified, the builtin API is clean, and the phased rollout is the right approach. Left a few inline comments on typed elicitation, migration tooling, and memory limits. Overall this looks solid.
| **`parallel(fns)`** executes a list of zero-argument callables concurrently on the Go side using `errgroup`: | ||
| ```python |
Suggestion: Typed elicitation via schema validation
Today the elicitation handler in ToolHive (pkg/vmcp/composer/elicitation_handler.go) validates response size and depth but does not validate content against the provided JSON Schema — the schema is sent to the client purely for UI rendering.
Since scripts already provide a schema to elicit(), the Go-side builtin could validate decision.content against that schema before returning it to the script. This means every script that uses elicitation can trust .content without defensive type checks.
Concretely, add a validate parameter (default True):
decision = elicit(
    "Approve?",
    schema={
        "type": "object",
        "properties": {
            "reason": {"type": "string"},
            "severity": {"type": "string", "enum": ["low", "medium", "high"]},
        },
        "required": ["reason"],
    },
    validate=True,  # default
)
# decision.content is guaranteed to match the schema if action == "accept"

The Go side would use a JSON Schema validator (e.g., santhosh-tekuri/jsonschema) to enforce this. On validation failure, the builtin could either re-prompt the client or return a structured error.
This is a small addition to the builtin but makes every script that uses elicitation simpler and safer.
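To illustrate what the `validate=True` guarantee would mean for a script, here is a minimal Python sketch of the check for the schema above. It covers only the subset that schema uses (`required`, string-typed properties, `enum`). The helper name `validate_content` is hypothetical, and a real Go-side builtin would presumably delegate to a full JSON Schema library rather than hand-roll this.

```python
def validate_content(content, schema):
    """Return a list of violations for a tiny JSON Schema subset (hypothetical helper)."""
    if not isinstance(content, dict):
        return ["content must be an object"]
    errors = []
    # Every name listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in content:
            errors.append("missing required property: " + name)
    # Present properties must match their declared type and enum.
    for name, prop in schema.get("properties", {}).items():
        if name not in content:
            continue
        value = content[name]
        if prop.get("type") == "string" and not isinstance(value, str):
            errors.append(name + " must be a string")
        elif "enum" in prop and value not in prop["enum"]:
            errors.append(name + " must be one of " + ", ".join(prop["enum"]))
    return errors


SCHEMA = {
    "type": "object",
    "properties": {
        "reason": {"type": "string"},
        "severity": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["reason"],
}

accepted = validate_content({"reason": "looks fine", "severity": "low"}, SCHEMA)
rejected = validate_content({"severity": "urgent"}, SCHEMA)
```

With a check like this in the builtin, a script would only ever see `decision.content` when the list of violations is empty; a non-empty list is what would trigger the re-prompt or structured error.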
| The Starlark engine is designed to be extensible: | ||
| - New builtins can be added without breaking existing scripts |
Suggestion: Migration tooling (automated transpiler)
The 3-phase migration plan is good, but Phase 2 (deprecation) would be much smoother with concrete migration tooling. The current composite tool model is a strict subset of what Starlark can express — every construct has a direct translation:
| Composite YAML | Starlark Equivalent |
|---|---|
| Sequential steps (`dependsOn` chain) | Sequential `call_tool()` calls |
| Parallel steps (same DAG level) | `parallel([...])` |
| `condition` template | `if` statement |
| `onError: continue` | `try_call_tool()` |
| `onError: retry` | `retry(lambda: call_tool(...))` |
| Elicitation step | `elicit()` |
| `onDecline`/`onCancel` actions | `if decision.action == "decline"` |
| Go template `{{.steps.X.output.Y}}` | Variable assignment: `x = call_tool(...); x["Y"]` |
| `defaultResults` | Default in `try_call_tool` fallback |
| `OutputConfig` properties | Return dict construction |
A transpiler built into vMCP could:
- Parse `CompositeToolConfig` (already done at config load time)
- Topologically sort the steps (reuse `dag_executor.go`'s `buildExecutionLevels`)
- Emit Starlark source: `parallel()` for steps within a level, sequential calls across levels
- Convert Go template expressions to Python string formatting
- Output the `.star` file or inline script

This could be exposed as:
- A `thv vmcp migrate-composite <tool-name>` CLI command that prints the equivalent Starlark
- A deprecation warning at config load: "Composite tool 'X' can be migrated to Starlark. Run `thv vmcp migrate-composite X` to see the equivalent script."
- Optionally, an automatic in-memory "compilation" where composite tools are internally transpiled to Starlark and executed through the new engine (proving equivalence before asking users to migrate)

This gives users concrete migration commands rather than "rewrite your YAML."
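The core of such a transpiler (level building plus source emission) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the proposed Go implementation: the step dictionaries, the tool names, and the functions `build_levels` and `emit_starlark` are all hypothetical, step ids are assumed to be valid Starlark identifiers, and `parallel()` is assumed to return results in input order. Conditions, `onError` handling, and template conversion are left out.

```python
def build_levels(steps):
    """Group step ids into DAG levels; a step is ready once all its dependencies ran."""
    remaining = {s["id"]: set(s.get("depends_on", [])) for s in steps}
    levels, done = [], set()
    while remaining:
        ready = sorted(sid for sid, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown step id")
        levels.append(ready)
        done.update(ready)
        for sid in ready:
            del remaining[sid]
    return levels


def emit_starlark(steps):
    """Emit one statement per level, wrapping multi-step levels in parallel()."""
    by_id = {s["id"]: s for s in steps}
    lines = []
    for level in build_levels(steps):
        # repr() yields literals that are also valid Starlark for simple values.
        calls = [
            "call_tool(%r, %r)" % (by_id[sid]["tool"], by_id[sid].get("args", {}))
            for sid in level
        ]
        if len(level) == 1:
            lines.append("%s = %s" % (level[0], calls[0]))
        else:
            lambdas = ", ".join("lambda: " + call for call in calls)
            lines.append("%s = parallel([%s])" % (", ".join(level), lambdas))
    return "\n".join(lines)


STEPS = [
    {"id": "fetch_issue", "tool": "github.get_issue", "args": {"number": 42}},
    {"id": "search_docs", "tool": "docs.search", "args": {"query": "timeout"}},
    {"id": "summarize", "tool": "llm.summarize",
     "depends_on": ["fetch_issue", "search_docs"]},
]

source = emit_starlark(STEPS)
```

Here the two independent steps land in the first level and are emitted as a single `parallel([...])` assignment, while `summarize` follows as a plain `call_tool(...)` statement on the next line.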
| | Supply chain (shared libs) | Libraries are loaded from admin-controlled paths only (no user-supplied paths). Same trust model as the main script — the admin controls both. | | ||
| | Script injection | Scripts are defined by administrators in YAML/CRDs, not by end users. Input parameters are passed as structured data, not string-interpolated into script source. | | ||
| ## Alternatives Considered |
Concern: Memory limit mitigation is insufficient
This is listed as High severity but the mitigation (step count as indirect memory bound + Go-level monitoring) has a gap. A script can allocate massive data structures in very few steps:
big = "x" * 1000000000  # ~1 GB from a single multiplication

Step counting won't catch this because the allocation happens in a single operation.
Consider adding a more direct mitigation:
- A periodic memory check via `starlark.Thread`'s cancel function: the cancel function is called at each step and could check `runtime.MemStats.Alloc` against a threshold (e.g., 256MB per script execution). This piggybacks on the existing step-counting mechanism.
- Or a per-script memory limit row in the Resource Limits table (e.g., `Memory per execution | 256 MB | No | Cancel function checks runtime.MemStats`).
Note: runtime.MemStats is process-wide, so with concurrent script executions you'd need to track per-goroutine deltas or use a simpler heuristic (abort if total process memory exceeds a threshold). Not perfect, but better than no limit.
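The caveat about process-wide counters can be made concrete with a small simulation. This is a language-neutral sketch written in Python, not Go code: `ProcessMemory` stands in for a process-wide counter such as `runtime.MemStats.Alloc`, and `MemoryGuard` is a hypothetical per-execution check that compares the counter's growth since the script started against a limit.

```python
MB = 1024 * 1024


class ProcessMemory:
    """Stand-in for a process-wide allocation counter (hypothetical)."""

    def __init__(self):
        self.alloc = 0


class MemoryGuard:
    """Per-execution check based on the growth of the shared counter."""

    def __init__(self, memory, limit):
        self.memory = memory
        self.baseline = memory.alloc  # snapshot taken when the script starts
        self.limit = limit

    def exceeded(self):
        return self.memory.alloc - self.baseline > self.limit


memory = ProcessMemory()
guard_a = MemoryGuard(memory, 256 * MB)  # script A starts
guard_b = MemoryGuard(memory, 256 * MB)  # script B starts concurrently

memory.alloc += 10 * MB    # script A allocates a little
memory.alloc += 300 * MB   # script B allocates a lot

a_tripped = guard_a.exceeded()
b_tripped = guard_b.exceeded()
```

Both guards trip even though script A only allocated 10 MB, which is the misattribution the note describes: a delta on a shared counter bounds total growth during the execution, not the script's own usage.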
Summary
Proposes replacing vMCP's declarative composite tools system (DAG + Go templates) with a Starlark scripting engine for multi-step tool workflows.
The current composite tools system hits hard limits: no iteration over results, no dynamic branching, and awkward Go template data flow. Starlark provides iteration, conditional branching, dynamic tool dispatch, and data transformation while maintaining sandboxed execution with no arbitrary I/O.
Key design decisions:
- `call_tool()` halts on error (common case), `try_call_tool()` returns error info (opt-in handling); this works around Starlark's lack of try/except
- Builtins: `call_tool`, `try_call_tool`, `retry`, `elicit`, `parallel`, `log`
- `load()`: Shared helper libraries in `.star` files

Why Starlark
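The split between `call_tool()` and `try_call_tool()` listed under the key design decisions can be emulated in plain Python to show the intended control flow. Everything below is an assumption for illustration: the `TOOLS` registry, the `ToolError` class, and the `ok`/`result`/`error` field names on the returned value are hypothetical, and the RFC's actual return shape may differ.

```python
from collections import namedtuple

# Hypothetical outcome shape, using attribute access as for builtin return values.
Outcome = namedtuple("Outcome", ["ok", "result", "error"])


class ToolError(Exception):
    """Stands in for a script abort; a Starlark script could not catch this."""


def _flaky(args):
    raise RuntimeError("backend unavailable")


# Hypothetical backend registry: tool name -> handler.
TOOLS = {"echo": lambda args: {"text": args["text"]}, "flaky": _flaky}


def call_tool(name, args):
    """Common case: any failure halts the whole script."""
    try:
        return TOOLS[name](args)
    except Exception as exc:
        raise ToolError("%s failed: %s" % (name, exc))


def try_call_tool(name, args):
    """Opt-in handling: the failure comes back as data the script can branch on."""
    try:
        return Outcome(True, TOOLS[name](args), None)
    except Exception as exc:
        return Outcome(False, None, str(exc))


outcome = try_call_tool("flaky", {})
greeting = outcome.result if outcome.ok else {"text": "default"}
```

A script that can tolerate a failure branches on the returned value, as in the last two lines; a script that cannot simply uses `call_tool()` and lets the engine stop it.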
🤖 Generated with Claude Code