@saichandrapandraju

Adds custom detectors (RegexDetector, FunctionDetector) that allow users to define their own detection logic.

This is useful when users want to implement custom detection behaviors, especially with user-provided goals or prompts (#1482).

Two new configurable detector types are introduced:


1. RegexDetector – Pattern Matching

garak --detectors custom.RegexDetector \
      --detector_options '{"custom": {"RegexDetector": {"patterns": ["api.?key", "sk-[A-Za-z0-9]{32,}"], "match_type": "any", "case_sensitive": false}}}'

Users must provide patterns, while match_type and case_sensitive are optional.

  • match_type defaults to "any". Allowed values: "any", "all".
  • case_sensitive defaults to false.
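
The option semantics above can be sketched in a few lines of standalone Python. This is only an illustration, not the PR's implementation: the helper name `regex_hit` and the 1.0/0.0 scoring are assumptions made for the example.

```python
import re


def regex_hit(text, patterns, match_type="any", case_sensitive=False):
    """Return 1.0 when `text` matches under the chosen mode, else 0.0.

    Hypothetical helper: the name and the 1.0/0.0 scores are assumptions.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    found = [bool(re.search(p, text, flags)) for p in patterns]
    if match_type == "any":
        return 1.0 if any(found) else 0.0
    if match_type == "all":
        # bool(found) guards against an empty pattern list counting as a hit
        return 1.0 if found and all(found) else 0.0
    raise ValueError(f"match_type must be 'any' or 'all', got {match_type!r}")


patterns = ["api.?key", "sk-[A-Za-z0-9]{32,}"]
print(regex_hit("My API key is secret", patterns))                       # any, case-insensitive
print(regex_hit("My API key is secret", patterns, match_type="all"))     # second pattern absent
print(regex_hit("My API key is secret", patterns, case_sensitive=True))  # case now matters
```

With "any", a single matching pattern is enough; with "all", every pattern must match the same output.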

2. FunctionDetector – Custom Python Logic

(inspired by the function.py generator)

garak --detectors custom.FunctionDetector \
      --detector_options '{"custom": {"FunctionDetector": {"function_name": "mymodule#check_harmful"}}}'
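
For illustration, a module referenced as `mymodule#check_harmful` might look like the sketch below. The description does not state the expected signature, so the single string argument and the float return value are assumptions, as are the marker phrases.

```python
# mymodule.py (hypothetical user module; signature and return type are assumptions)

HARMFUL_MARKERS = ("here is how to bypass", "step-by-step instructions for")


def check_harmful(output_text: str) -> float:
    """Score one model output: 1.0 if it looks harmful, 0.0 otherwise."""
    lowered = output_text.lower()
    return 1.0 if any(marker in lowered for marker in HARMFUL_MARKERS) else 0.0


# The option value names the module and the function, separated by "#".
module_name, func_name = "mymodule#check_harmful".split("#", 1)
print(module_name, func_name)
```

The module has to be importable from the environment garak runs in, for example by placing it on `PYTHONPATH`.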

Verification

List the steps needed to make sure this thing works

  • Supporting configuration such as generator configuration file
{"openai": {"OpenAICompatible": {"uri": "https:<placeholder>/v1", "model": "qwen2", "api_key": "DUMMY"}}}
  • garak --detectors custom.RegexDetector --detector_options '{"custom": {"RegexDetector": {"patterns": ["api.?key", "sk-[A-Za-z0-9]{32,}"]}}}' -t openai.OpenAICompatible -n qwen2 --generator_options '{"openai": {"OpenAICompatible": {"uri": "http://localhost:8080/v1", "model": "qwen2", "api_key": "dummy"}}}'
  • garak --detectors custom.FunctionDetector --detector_options '{"custom": {"FunctionDetector": {"function_name": "mymodule#check_harmful"}}}' -t openai.OpenAICompatible -n qwen2 --generator_options '{"openai": {"OpenAICompatible": {"uri": "http://localhost:8080/v1", "model": "qwen2", "api_key": "dummy"}}}'
  • Run the tests and ensure they pass: python -m pytest tests/
  • Verify the thing does what it should
  • Verify the thing does not do what it should not
  • Document the thing and how it works (Example)

@leondz
leondz commented Nov 17, 2025

This could be really cool, thanks. We'll take a look!

@leondz leondz self-requested a review November 19, 2025 08:05

@leondz leondz left a comment

first draft - needs some validation from our side. do you have some example cases for testing?

Comment on lines +9 to +22
CLI Examples:
# Regex detector
garak --detectors custom.RegexDetector \\
--detector_options '{"custom": {"RegexDetector": {"patterns": ["api.?key","sk-[A-Za-z0-9]{32,}"]}}}'

# Function detector
garak --detectors custom.FunctionDetector \\
--detector_options '{"custom": {"FunctionDetector": {"function_name": "mymodule#check_harmful"}}}' \\
--probes dan.Dan_11_0

# Or use config file
garak --detectors custom.RegexDetector \\
--detector_option_file detector_config.json \\
--probes dan.Dan_11_0

please manage through Configurable mechanism and not CLI

A CLI command based on a detector config file is viable here; however, a reference to using --config for a consolidated example may add clarity.

Oh sorry, yes, missed that these were viable existing options
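
As a sketch of the file-based route discussed in this thread, the snippet below writes a `detector_config.json`. It assumes the option file uses the same JSON structure that is passed inline to `--detector_options`; the helper name `write_detector_config` is made up for the example.

```python
import json
from pathlib import Path

# Same structure as the inline --detector_options value (assumed to carry over).
detector_options = {
    "custom": {
        "RegexDetector": {
            "patterns": ["api.?key", "sk-[A-Za-z0-9]{32,}"],
            "match_type": "any",
            "case_sensitive": False,
        }
    }
}

config_text = json.dumps(detector_options, indent=2)


def write_detector_config(path="detector_config.json"):
    """Write the options to disk and return the resulting path (hypothetical helper)."""
    target = Path(path)
    target.write_text(config_text, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_detector_config())
```

Keeping the options in a file avoids shell-quoting problems with the regex patterns.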

Does the name "custom" add much? I can see it makes sense to group them as done here. Though looking at the probe names, we have probes.function - detectors.function seems a fine analogue to that. And then detectors.regex could work for the other. I'll think on this some more.

DEFAULT_PARAMS = Detector.DEFAULT_PARAMS | {
    "patterns": [],  # users must provide patterns
    "match_type": "any",  # "any" or "all"
    "case_sensitive": False,

does it make sense to expose multiline / re.M here?

@jmartin-tech jmartin-tech Nov 19, 2025

It might make sense to offer this as a set of strings corresponding to allowed enum values from re.RegexFlag to | together. Named something like re_flags?

This could even be initialized to the re.NOFLAG value:

Suggested change
"case_sensitive": False,
"re_flags": [ "NOFLAG" ],

If accepting this idea, be sure to add validation in __init__.
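
One way the `re_flags` idea could be validated is sketched below. The helper name `combine_re_flags` is hypothetical, and unknown names are logged and skipped rather than raised, in line with the other review comments. `NOFLAG` is handled explicitly because `re.NOFLAG` only exists from Python 3.11.

```python
import logging
import re

logger = logging.getLogger(__name__)


def combine_re_flags(names):
    """OR together re.RegexFlag members named in `names` (hypothetical helper)."""
    combined = re.RegexFlag(0)
    for name in names:
        key = str(name).upper()
        if key == "NOFLAG":
            continue  # contributes nothing; also keeps this working before 3.11
        flag = re.RegexFlag.__members__.get(key)
        if flag is None:
            logger.warning("Ignoring unknown regex flag %r", name)
            continue
        combined |= flag
    return combined


flags = combine_re_flags(["IGNORECASE", "MULTILINE"])
print(bool(re.search(r"^secret", "first line\nSECRET line", flags)))
```

Looking names up in `re.RegexFlag.__members__` also accepts the short aliases such as "I" and "M".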

# Validate match_type
self.match_type = str(self.match_type).lower()
if self.match_type not in ("any", "all"):
    raise ValueError(f"match_type must be 'any' or 'all', got '{self.match_type}'")

please log and skip, rather than risking aborting the run

        re.compile(pattern, flags) for pattern in self.patterns
    ]
except re.error as e:
    raise ValueError(f"Invalid regex pattern: {e}") from e

please log and skip, rather than risking aborting the run
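
A minimal sketch of "log and skip" for pattern compilation, written as a standalone function rather than as the detector's actual code (the name `compile_patterns` is an assumption):

```python
import logging
import re

logger = logging.getLogger(__name__)


def compile_patterns(patterns, flags=0):
    """Compile each pattern, logging and skipping any that are invalid."""
    compiled = []
    for pattern in patterns:
        try:
            compiled.append(re.compile(pattern, flags))
        except re.error as exc:
            logger.warning("Skipping invalid regex %r: %s", pattern, exc)
    return compiled


usable = compile_patterns(["api.?key", "("], re.IGNORECASE)
print(len(usable))
```

Compiling patterns one at a time means a single bad pattern no longer discards the valid ones.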

        pattern.search(output_text) for pattern in self.compiled_patterns
    )
else:
    raise ValueError(f"Invalid match_type: {self.match_type}")

Please log and skip, rather than risking aborting the run. Failing detector can safely return [None] * input_length. Would be best if we accepted detectors returning flat None on overall detector failure but I'm not sure that's possible yet.
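
The fallback described here could look roughly like the following. It is a sketch over plain lists of strings, not garak's Detector interface, and the function name `detect` and the 1.0/0.0 scores are assumptions.

```python
import logging
import re

logger = logging.getLogger(__name__)


def detect(outputs, compiled_patterns, match_type="any"):
    """Score each output, or return [None] * len(outputs) if the detector is unusable."""
    if match_type not in ("any", "all") or not compiled_patterns:
        logger.warning("Regex detector misconfigured (match_type=%r); skipping", match_type)
        return [None] * len(outputs)
    combine = any if match_type == "any" else all
    results = []
    for text in outputs:
        if text is None:
            results.append(None)  # nothing to score for this output
            continue
        hit = combine(p.search(text) for p in compiled_patterns)
        results.append(1.0 if hit else 0.0)
    return results


patterns = [re.compile("api.?key", re.IGNORECASE)]
print(detect(["my API key", "hello", None], patterns))
print(detect(["a", "b"], patterns, match_type="bogus"))
```

Returning one `None` per output keeps the result the same length as the input, so other configured detectors can still be processed.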

Comment on lines +231 to +233
raise ValueError(
    f"function_name must be in format 'module#function', got '{self.function_name}'"
)

I wonder what the best way of handling this is post-inference. There may be other detectors configured, for example. Please log and skip, rather than risking aborting the run

Comment on lines +238 to +243
try:
    module = importlib.import_module(module_name)
except ImportError as e:
    raise ImportError(
        f"Could not import module '{module_name}' for FunctionDetector"
    ) from e

is module discarded once the constructor returns?
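
On the general Python behaviour behind this question: a local `module` variable goes out of scope when the function returns, but the imported module stays cached in `sys.modules`, and a function taken from it keeps a reference to the module's globals. The sketch below shows this with a standard-library module; `load` is a hypothetical stand-in for the constructor logic.

```python
import importlib
import sys


def load(module_name, func_name):
    """Import a module locally and return one attribute from it."""
    module = importlib.import_module(module_name)  # local name only
    return getattr(module, func_name)


func = load("json", "dumps")

# The local variable is gone, but the module is still cached by the interpreter.
print("json" in sys.modules)
# The function still points at its defining module's namespace.
print(func.__module__, func.__globals__ is vars(sys.modules["json"]))
```

So the detector would only need to keep the module itself if it wants to reload or inspect it later.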

try:
    self.detection_function = getattr(module, func_name)
except AttributeError as e:
    raise AttributeError(

please log and skip, rather than risking aborting the run


# Validate function is callable
if not callable(self.detection_function):
    raise ValueError(

please log and skip, rather than risking aborting the run
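
Taken together, the "log and skip" requests for FunctionDetector could be met by a loader that returns `None` instead of raising. This is a sketch under assumptions: the name `load_detection_function` is hypothetical, and the format check (exactly one "#") is a guess at what the PR validates.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_detection_function(function_name):
    """Resolve 'module#function' to a callable, or log and return None."""
    if not isinstance(function_name, str) or function_name.count("#") != 1:
        logger.warning("function_name must look like 'module#function', got %r", function_name)
        return None
    module_name, func_name = function_name.split("#")
    if not module_name or not func_name:
        logger.warning("function_name has an empty module or function part: %r", function_name)
        return None
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        logger.warning("Could not import module %r: %s", module_name, exc)
        return None
    func = getattr(module, func_name, None)
    if not callable(func):
        logger.warning("%r in %r is missing or not callable", func_name, module_name)
        return None
    return func


basename = load_detection_function("os.path#basename")
print(basename("/tmp/report.txt"))
print(load_detection_function("no_such_module_xyz#check"))
```

A caller that receives `None` can mark the detector as unavailable and return `None` scores instead of stopping the run.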
