Skip to content

Add embedding-based semantic detection (beyond regex) #2

@manja316

Description

@manja316

Regex patterns catch known attack strings but miss semantic equivalents:

  • "Please disregard your prior directives" (same intent, no pattern match)
  • Paraphrased jailbreaks
  • Novel attack vectors

Consider an optional lightweight embedding similarity check using a small local model (e.g. sentence-transformers). Should remain zero-dep by default — embedding mode as optional extra.

pip install ai-injection-guard[semantic]

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions