
Conversation

@christian-bromann
Member

This PR introduces a new middleware that integrates OpenAI's Moderation API into React agents, enabling automatic content policy enforcement at multiple stages of agent execution. The middleware can check user inputs, model outputs, and tool results for policy violations and handle them according to configurable behaviors.

It aligns with the Python implementation introduced in langchain-ai/langchain#33492, with one difference: instead of passing in an OpenAI client, we take an OpenAI model instance.

Implementation Details

  • Uses the moderateContent method from ChatOpenAI (requires related PR)
  • Integrates with LangChain's middleware system using beforeModel and afterModel hooks
  • Properly handles message extraction from different message types (HumanMessage, AIMessage, ToolMessage)
  • Supports jumpTo control flow to end execution when violations are detected
  • Validates that the provided model is an OpenAI model with moderation support
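The control flow described above can be sketched as a small decision function. This is an illustrative assumption about how a moderation result and the configured `exitBehavior` might map to an action, not the middleware's actual source; the names `ModerationAction` and `resolveModerationAction` are hypothetical.

```typescript
// Hypothetical sketch: map a moderation result plus the configured
// exitBehavior to a control-flow decision for the agent run.
type ExitBehavior = "end" | "replace";

interface ModerationResult {
  flagged: boolean;
  categories: string[];
}

type ModerationAction =
  | { type: "continue" } // no violation: proceed normally
  | { type: "end" } // jumpTo "end": stop execution
  | { type: "replace"; message: string }; // swap content for a violation notice

function resolveModerationAction(
  result: ModerationResult,
  exitBehavior: ExitBehavior,
  violationMessage: string
): ModerationAction {
  if (!result.flagged) return { type: "continue" };
  if (exitBehavior === "end") return { type: "end" };
  return { type: "replace", message: violationMessage };
}
```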

Basic Usage

import { createAgent, openAIModerationMiddleware } from "langchain";
import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

const model = new ChatOpenAI({ model: "gpt-4o-mini" });

const middleware = openAIModerationMiddleware({
  model,
  checkInput: true,
  checkOutput: true,
  checkToolResults: false,
  exitBehavior: "end",
});

const agent = createAgent({
  model,
  tools: [...],
  middleware: [middleware],
});

const result = await agent.invoke({
  messages: [new HumanMessage("User input here")],
});

Custom Violation Message

const middleware = openAIModerationMiddleware({
  model,
  violationMessage:
    "Content flagged for {categories}. Detailed scores: {category_scores}",
  exitBehavior: "replace",
});
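
The `{categories}` and `{category_scores}` placeholders suggest simple template substitution from the moderation result. Here is a minimal sketch of how that might work, assuming the shape of the OpenAI Moderation API response (`categories` as booleans, `category_scores` as numbers); `formatViolationMessage` is a hypothetical helper, not part of the middleware's API.

```typescript
// Illustrative sketch: fill the {categories} and {category_scores}
// placeholders from a moderation result. Only flagged categories are listed.
interface ModerationCategories {
  [category: string]: boolean;
}
interface ModerationScores {
  [category: string]: number;
}

function formatViolationMessage(
  template: string,
  categories: ModerationCategories,
  scores: ModerationScores
): string {
  const flagged = Object.keys(categories).filter((c) => categories[c]);
  const flaggedScores = flagged
    .map((c) => `${c}=${scores[c].toFixed(2)}`)
    .join(", ");
  return template
    .replace("{categories}", flagged.join(", "))
    .replace("{category_scores}", flaggedScores);
}
```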

@changeset-bot

changeset-bot bot commented Nov 14, 2025

⚠️ No Changeset found

Latest commit: 7704ffa

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@christian-bromann christian-bromann force-pushed the cb/openai-content-moderation-middleware branch from b3cee8c to 5dc0660 Compare November 19, 2025 22:55
@christian-bromann christian-bromann changed the base branch from cb/openai-content-moderation to main November 19, 2025 22:55
@christian-bromann christian-bromann force-pushed the cb/openai-content-moderation-middleware branch from 473f187 to 1015dc1 Compare November 21, 2025 00:25
Member

@hntrl hntrl left a comment


  • how does streaming work with this middleware?
  • I don't expect that pulling the client off of ChatOpenAI will work like you're hoping. BaseChatOpenAI only initializes a client when it's invoked, which in the case of ChatOpenAI is never, since it defers to the inner chat classes
  • because we're leaning on the openai client, I think this makes another great case for us to figure this out so we don't have to do weird stuff with type inference

@christian-bromann
Member Author

  • how does streaming work with this middleware?

Streaming of the model response is not impacted by this middleware. The event stream "just" starts a bit delayed as the messages are being verified.

  • I don't expect that pulling the client off of ChatOpenAI will work like you're hoping.

I am calling model._getClientOptions() to initialize the actual client. I've also added an integration test that verifies the middleware behavior end to end.

  • because we're leaning on openai client, I think this makes another great case for us figuring out so we don't have to do weird stuff with type inferencing

💯

@hntrl
Member

hntrl commented Nov 21, 2025

Gotcha. When I mentioned streaming I was thinking more of token-by-token streaming, since in that case I can technically render/paint content that should be moderated before the moderation step has kicked in. I think it's fine to earmark that as a known feature gap.

@christian-bromann christian-bromann merged commit 57f27d1 into main Nov 21, 2025
35 checks passed
@christian-bromann christian-bromann deleted the cb/openai-content-moderation-middleware branch November 21, 2025 06:22