Composable middleware for structured LLM calls with TypeScript
@nmnmcc/toolbox is a TypeScript library for building structured, type-safe LLM
applications with composable middleware. It provides a functional approach to
defining language model interactions with strong type inference, schema
validation via Zod, and a middleware pattern inspired by web frameworks.
Key Features:
- Type-safe - Full TypeScript support with automatic type inference
- Composable - Middleware-based architecture for building complex behaviors
- Schema-driven - Uses Zod for runtime validation and structured outputs
- Extensible - Easy to create custom middlewares and initializers
- Installation
- TypeScript Compatibility
- The `describe` Function
- Extending the Library
- Built-in Components
- License
```sh
npm install @nmnmcc/toolbox zod openai
```

The library has peer dependencies on `zod` (^4) and `openai` (^6).
This library requires TypeScript 5.0+ with strict mode enabled.
| Package version | TypeScript | Node.js |
|---|---|---|
| 0.4.x+ | 5.0+ | 18+ |
This release aligns the documentation with the current runtime and type definitions in the codebase. If you're upgrading from earlier versions, note the following changes:
- `context.messages` -> `context.history`: The middleware context now exposes a `history` property, which is an array of tuples `[message, completions]`, where `completions` is an array of OpenAI chat completions associated with the message.
- Finalizers now receive the full `LanguageModelMiddlewareContext` and return the parsed output value (not the context object).
- `memory` middleware options: `max_messages` was renamed to `max_history` (see the sketch below).
- For logging or middleware examples that need access to tokens/usage, use `result.history` to extract the latest completion usage:

```ts
const usage = result.history.at(-1)?.[1].at(-1)?.usage;
```
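For example, a `memory` middleware configured with the renamed option might look like the following sketch (the import path mirrors the other middleware imports in this README; the options-object shape is an assumption):

```ts
import { memory } from "@nmnmcc/toolbox/middlewares/memory";

// Previously configured as { max_messages: 20 }.
// Keep at most 20 entries of conversation history between calls.
const remember = memory({ max_history: 20 });
```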
The `describe` function is the core of the library. It has two forms.

The first form wraps any function with metadata for use as a tool in ReAct agents:

```ts
import { z } from "zod";
import { describe } from "@nmnmcc/toolbox";
const add_numbers = describe(
{
name: "add_numbers",
description: "Add two numbers together",
input: z.object({
a: z.number().describe("First addend"),
b: z.number().describe("Second addend"),
}),
output: z.object({ sum: z.number().describe("Sum of both numbers") }),
},
async ({ a, b }) => {
return { sum: a + b };
},
);
const result = await add_numbers({ a: 1, b: 2 });
console.log(result.sum); // 3
```

The second form creates functions backed by language models with a middleware chain:

```ts
import { z } from "zod";
import { describe } from "@nmnmcc/toolbox";
import { initializer } from "@nmnmcc/toolbox/initializers/initializer";
import { finalizer } from "@nmnmcc/toolbox/finalizers/finalizer";
import { retry } from "@nmnmcc/toolbox/middlewares/retry";
import { logging } from "@nmnmcc/toolbox/middlewares/logging";
const summarize = describe(
{
name: "summarize",
description: "Summarize the provided text",
input: z.object({ text: z.string().describe("Text to summarize") }),
output: z.object({ summary: z.string().describe("A concise summary") }),
model: "gpt-4o",
temperature: 0.7,
},
[
initializer(
"You are a helpful assistant that summarizes text concisely.",
),
logging(),
retry(2),
finalizer(),
],
);
const result = await summarize({ text: "Long article text goes here..." });
console.log(result.summary);
```

Type Signature:

```ts
type Description<Input, Output> = {
name: string;
description: string;
input: Input; // Zod schema
output: Output; // Zod schema
};
type LanguageModelDescription<Input, Output> = Description<Input, Output> & {
model: string; // OpenAI model name
temperature?: number;
max_tokens?: number;
// ... any OpenAI ChatCompletionCreateParams
client?: OpenAI; // Optional custom OpenAI client
};
// For regular functions
function describe<Input, Output>(
description: Description<Input, Output>,
implementation: (input: z.input<Input>) => Promise<z.output<Output>>,
): Described<Input, Output>;
// For LLM-powered functions
function describe<Input, Output>(
description: LanguageModelDescription<Input, Output>,
imports: [initializer, ...middlewares, finalizer],
): Described<Input, Output>;
```
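The optional `client` field in `LanguageModelDescription` lets you route calls through your own preconfigured OpenAI client, for example to target a proxy or an alternative base URL. A sketch that relies only on the `client?: OpenAI` field shown above (the endpoint is hypothetical):

```ts
import OpenAI from "openai";
import { z } from "zod";
import { describe } from "@nmnmcc/toolbox";
import { initializer } from "@nmnmcc/toolbox/initializers/initializer";
import { finalizer } from "@nmnmcc/toolbox/finalizers/finalizer";

// Any OpenAI-compatible endpoint can be targeted this way.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://llm-proxy.example.com/v1", // hypothetical proxy URL
});

const summarize = describe(
  {
    name: "summarize",
    description: "Summarize the provided text",
    input: z.object({ text: z.string().describe("Text to summarize") }),
    output: z.object({ summary: z.string().describe("A concise summary") }),
    model: "gpt-4o",
    client, // use the custom client instead of the library default
  },
  [
    initializer("You are a helpful assistant that summarizes text concisely."),
    finalizer(),
  ],
);
```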
The library is designed to be extended with custom middleware. This section explains how to create your own middleware to add custom behavior to your LLM calls.

Middleware in this library follows a pattern similar to Express.js or Koa. Each
middleware is a function that receives a context and a next function, allowing
you to:
- Inspect or modify the request context before calling the LLM
- Call `next(context)` to continue the chain
- Inspect or modify the completion result after the LLM call
- Short-circuit the chain by returning early (e.g., for caching)
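Putting these together, a minimal pass-through middleware looks like the sketch below (it does nothing except forward the call; the exact context and middleware types follow):

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";

const passthrough = <Input, Output>(): LanguageModelMiddleware<Input, Output> => {
  return async (context, next) => {
    // Inspect or modify `context` here, before the LLM call...
    const result = await next(context);
    // ...and inspect or modify `result` here, after the call.
    // Returning without calling `next` would short-circuit the chain.
    return result;
  };
};
```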
```ts
import type {
LanguageModelMiddleware,
LanguageModelMiddlewareContext,
LanguageModelMiddlewareNext,
LanguageModelOutputContext,
} from "@nmnmcc/toolbox";
type LanguageModelMiddleware<Input, Output> = (
context: LanguageModelMiddlewareContext<Input, Output>,
next: LanguageModelMiddlewareNext<Input, Output>,
) => Promise<LanguageModelMiddlewareContext<Input, Output>>;
```

Context Structure:

```ts
type LanguageModelMiddlewareContext<Input, Output> = {
description: LanguageModelDescription<Input, Output>;
initializer: LanguageModelInitializer<Input, Output>;
middlewares: LanguageModelMiddleware<Input, Output>[];
finalizer: LanguageModelFinalizer<Input, Output>;
usage: OpenAI.CompletionUsage;
input: z.output<Input>;
tools?: OpenAI.Chat.Completions.ChatCompletionFunctionTool[];
history: [
OpenAI.Chat.ChatCompletionMessageParam,
OpenAI.Chat.Completions.ChatCompletion[],
][];
};
type LanguageModelOutputContext<Input, Output> =
& LanguageModelMiddlewareContext<Input, Output>
& { output: z.output<Output> };
```

A logging middleware that measures how long each call takes:

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";
const simple_logger = <Input, Output>(): LanguageModelMiddleware<
Input,
Output
> => {
return async (context, next) => {
console.log(`[${context.description.name}] Starting call`);
const start = Date.now();
const result = await next(context);
const elapsed = Date.now() - start;
console.log(`[${context.description.name}] Completed in ${elapsed}ms`);
return result;
};
};
```

A middleware that rewrites the model's response by prefixing the assistant message content:

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";
const add_prefix = <Input, Output>(
prefix: string,
): LanguageModelMiddleware<Input, Output> => {
return async (context, next) => {
const result = await next(context);
// Modify the response content
const message = result.history.at(-1)?.[1].at(-1)?.choices[0]?.message;
if (message?.content) {
message.content = prefix + message.content;
}
return result;
};
};
```

A caching middleware that short-circuits the chain when a result for the same input is already stored:

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";
const simple_cache = <Input, Output>(
store: Map<string, any>,
): LanguageModelMiddleware<Input, Output> => {
return async (context, next) => {
const cache_key = `${context.description.name}:${
JSON.stringify(context.input)
}`;
// Check cache
const cached = store.get(cache_key);
if (cached) {
console.log("Cache hit!");
return cached;
}
// Call LLM and cache result
const result = await next(context);
store.set(cache_key, result);
return result;
};
};
```

An error-handling middleware that reports failures and rethrows them:

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";
const error_handler = <Input, Output>(
on_error: (error: Error) => void,
): LanguageModelMiddleware<Input, Output> => {
return async (context, next) => {
try {
return await next(context);
} catch (error) {
on_error(error as Error);
throw error;
}
};
};
```

A middleware that injects an additional system message into the history before the call:

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";
const add_context = <Input, Output>(
additional_context: string,
): LanguageModelMiddleware<Input, Output> => {
return async (context, next) => {
// Add additional context to history
const modified_context = {
...context,
history: [
...context.history,
[{ role: "system" as const, content: additional_context }, []],
],
};
return await next(modified_context);
};
};
```

A middleware that composes several middlewares into a single one:

```ts
import type { LanguageModelMiddleware } from "@nmnmcc/toolbox";
const aggregator = <Input, Output>(
...middlewares: LanguageModelMiddleware<Input, Output>[]
): LanguageModelMiddleware<Input, Output> => {
return async (context, next) => {
const chain = middlewares.reduceRight(
(prev, curr) => (ctx) => curr(ctx, prev),
next,
);
return chain(context);
};
};
// Usage
const standard_middlewares = aggregator(logging(), retry(3), timeout(30000));
```

This library comes with several built-in components to help you get started quickly. For detailed documentation, see:
- Initializers - Convert input into initial message arrays
  - `initializer` - Standard initializer with system prompt
- Middlewares - Composable middleware for common patterns (see the sketch after this list)
  - `cache` - Cache LLM responses
  - `memory` - Maintain conversation history
  - `retry` - Retry failed calls
  - `logging` - Log execution metrics
  - `timeout` - Add timeouts
  - `react` - ReAct pattern for tool calling
  - `otel` - OpenTelemetry tracing
  - `aggregator` - Compose multiple middlewares
- Finalizers - Extract output from completions
  - `finalizer` - Standard JSON parser
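For example, several of the built-in middlewares can be combined into a single chain. The sketch below reuses the pieces shown earlier in this README; the `timeout` import path is an assumption that mirrors the `retry` and `logging` imports above:

```ts
import { z } from "zod";
import { describe } from "@nmnmcc/toolbox";
import { initializer } from "@nmnmcc/toolbox/initializers/initializer";
import { finalizer } from "@nmnmcc/toolbox/finalizers/finalizer";
import { logging } from "@nmnmcc/toolbox/middlewares/logging";
import { retry } from "@nmnmcc/toolbox/middlewares/retry";
import { timeout } from "@nmnmcc/toolbox/middlewares/timeout";

const answer = describe(
  {
    name: "answer",
    description: "Answer a question in a single sentence",
    input: z.object({ question: z.string().describe("The question to answer") }),
    output: z.object({ answer: z.string().describe("A one-sentence answer") }),
    model: "gpt-4o",
  },
  [
    initializer("You are a concise assistant."),
    logging(),      // log execution metrics
    timeout(30000), // give up after 30 seconds
    retry(2),       // retry failed calls
    finalizer(),    // parse the structured JSON output
  ],
);

const result = await answer({ question: "What is middleware?" });
console.log(result.answer);
```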
This repo is wired up as a turborepo that runs tsdown builds and tsc checks
per package.
- `pnpm run build` runs `turbo run tsdown`, which sequentially executes each package's `tsdown` script via `--filter packages/<name>`. The shared `tsdown.config.ts` ensures a consistent Node 18 target, declaration generation, and `dist` output for every package.
- `pnpm run typecheck` runs `turbo run check`, which invokes `tsc --noEmit` inside every package.
- Each package publishes the contents of `dist`. After building, you can publish individual packages with standard `pnpm --filter <name> publish` commands (or whatever release flow you prefer).
LGPL-2.1-or-later