LCORE-1831: Implement Redaction Safety Capability in Pydantic AI#1915
LCORE-1831: Implement Redaction Safety Capability in Pydantic AI#1915arin-deloatch wants to merge 9 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a ChangesPII Redaction Capability
Sequence Diagram(s)sequenceDiagram
participant PydanticAI
participant PiiRedactionCapability
participant _redact_messages
participant _redact_response
participant redact_text
PydanticAI->>PiiRedactionCapability: before_model_request(ctx, request_context)
PiiRedactionCapability->>_redact_messages: messages, compiled_patterns
_redact_messages->>redact_text: user content strings
redact_text-->>_redact_messages: RedactionResult
_redact_messages-->>PiiRedactionCapability: redacted or original messages
PiiRedactionCapability-->>PydanticAI: updated request_context
PydanticAI->>PiiRedactionCapability: after_model_request(ctx, request_context, response)
PiiRedactionCapability->>_redact_response: response, compiled_patterns
_redact_response->>redact_text: TextPart.content strings
redact_text-->>_redact_response: RedactionResult
_redact_response-->>PiiRedactionCapability: redacted or original ModelResponse
PiiRedactionCapability-->>PydanticAI: ModelResponse
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
anik120
left a comment
There was a problem hiding this comment.
I know this is not part of the scope of this PR, but is src/pydantic_ai... leaking implementation detail again @jrobertboos?
|
|
||
|
|
||
| def _redact_message_parts( | ||
| parts: Sequence[Any], compiled_patterns: CompiledPatterns |
There was a problem hiding this comment.
Try to avoid Any in the whole module where possible.
|
|
||
| def _redact_string_content( | ||
| text: str, compiled_patterns: CompiledPatterns | ||
| ) -> str | None: |
There was a problem hiding this comment.
Prefer using Optional in the whole module.
f5586a7 to
d5a97ab
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py`:
- Line 1: Add module-level logging support by importing get_logger from log.py
and creating a logger instance at the top of the file after the module docstring
using logger = get_logger(__name__). This logger should then be used throughout
the capability module to audit PII redaction events, such as logging at debug
level when redaction rules match specific patterns and at info level for
redaction statistics and summaries. This approach aligns with coding guidelines
and provides valuable audit trails for the security-sensitive PII redaction
functionality.
- Line 5: Update all type annotations in the module to use modern pipe syntax
instead of Optional. Replace Optional[Type] with Type | None for all return type
annotations in the functions including _redact_text_content,
_redact_content_list, _redact_message_parts, and _redact_model_request. After
updating all function return types throughout the module, remove Optional from
the import statement at the top of the file since it will no longer be needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7e09e4f4-9b9b-4dcf-a8b9-cf61a5e8e1e4
📒 Files selected for processing (10)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.pysrc/pydantic_ai_lightspeed/capabilities/redaction/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/pydantic_ai_lightspeed/capabilities/redaction/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.py
🔇 Additional comments (6)
src/pydantic_ai_lightspeed/capabilities/redaction/config.py (1)
14-107: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.py (1)
1-157: LGTM!src/pydantic_ai_lightspeed/capabilities/redaction/capability.py (2)
31-258: LGTM!
261-325: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.py (1)
1-316: LGTM!src/pydantic_ai_lightspeed/capabilities/redaction/__init__.py (1)
1-21: LGTM!
|
/ok-to-test |
asimurka
left a comment
There was a problem hiding this comment.
I personally like this decomposition to capability, core and config modules.
|
/ok-to-test |
| @@ -0,0 +1,107 @@ | |||
| """Configuration models for PII redaction rules.""" | |||
There was a problem hiding this comment.
IMHO this config (or rather config classes/models) should be added into src/models/config.py. At least to avoid circular dependencies, to allow us to generate documentation etc.
There was a problem hiding this comment.
Understood, it was my initial understanding that we were to keep everything pydantic_ai related separate to avoid any sorts of conflicts with the core source code.
There was a problem hiding this comment.
However, I am more than happy to make the changes necessary! Just to be thorough, it is expected that RedactionRule,RedactionConfig and RedactionResult are being moved to src/models/config.py, correct?
There was a problem hiding this comment.
Correct. This will be later part of LCORE config so it will be technically implementation agnostic.
d5a97ab to
271c1f2
Compare
| @@ -0,0 +1,12 @@ | |||
| """Configuration models for PII redaction rules. | |||
There was a problem hiding this comment.
Remove this module entirely
| return list(self._compiled_patterns) | ||
|
|
||
|
|
||
| class RedactionResult(BaseModel): |
There was a problem hiding this comment.
This is not a configuration model so return it back to core.py
271c1f2 to
64a9445
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/pydantic_ai_lightspeed/capabilities/redaction/capability.py (1)
5-5: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winNormalize typing to repo conventions (
| None) and dependencyless capability generics.Line 5 and related return annotations still use
Optional[...], and Lines 260/279 keepAnyinAbstractCapability/RunContextfor a capability with no dependencies. Please align these annotations to keep type contracts consistent.As per coding guidelines, all functions must have complete type annotations for parameters and return types and use modern syntax (
str | int).Suggested refactor
-from typing import Any, Optional +from typing import Any @@ def _redact_string_content( text: str, compiled_patterns: CompiledPatterns -) -> Optional[str]: +) -> str | None: @@ def _redact_text_content( item: TextContent, compiled_patterns: CompiledPatterns -) -> Optional[TextContent]: +) -> TextContent | None: @@ def _redact_content_list( content: Sequence[UserContent], compiled_patterns: CompiledPatterns -) -> Optional[list[UserContent]]: +) -> list[UserContent] | None: @@ def _redact_message_parts( parts: Sequence[ModelRequestPart], compiled_patterns: CompiledPatterns -) -> Optional[list[ModelRequestPart]]: +) -> list[ModelRequestPart] | None: @@ def _redact_model_request( message: ModelRequest, compiled_patterns: CompiledPatterns -) -> Optional[ModelRequest]: +) -> ModelRequest | None: @@ -class PiiRedactionCapability(AbstractCapability[Any]): +class PiiRedactionCapability(AbstractCapability[None]): @@ - ctx: RunContext[Any], + ctx: RunContext[None], @@ - ctx: RunContext[Any], + ctx: RunContext[None],#!/bin/bash # Verify local typing consistency for capability generics and Optional usage. rg -nP 'AbstractCapability\[[^\]]+\]' src/pydantic_ai_lightspeed/capabilities -C2 rg -nP 'RunContext\[[^\]]+\]' src/pydantic_ai_lightspeed/capabilities -C2 rg -nP '\bOptional\[' src/pydantic_ai_lightspeed/capabilities/redaction/capability.py -C1Also applies to: 29-31, 47-50, 92-95, 145-148, 173-176, 259-261, 277-303
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py` at line 5, Update all type annotations in the redaction capability module to use modern Python typing syntax and remove unnecessary Any generics. Replace all instances of Optional[...] with the pipe union syntax (X | None) throughout the file on lines 5, 29-31, 47-50, 92-95, 145-148, 173-176. Additionally, update the generic type parameters for AbstractCapability and RunContext on lines 259-261 and 277-303 to remove Any and use empty brackets or appropriate type parameters based on whether the capability has dependencies, ensuring all function signatures and class definitions maintain consistent type contracts without relying on Any for capabilities without dependencies.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/models/config.py`:
- Around line 2222-2235: The RedactionRule model allows runtime mutation of the
mutable rules list and case_sensitive field, but the compiled regex patterns in
_compiled_patterns are only generated once and never invalidated when these
source fields change. To fix this, make the rules field immutable by changing
its type from list[RedactionRule] to tuple[RedactionRule, ...] or adding
frozen=True to prevent accidental mutations, and add a model validator that
recompiles _compiled_patterns whenever either rules or case_sensitive are
modified, ensuring the compiled patterns remain synchronized with the current
configuration state.
---
Duplicate comments:
In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py`:
- Line 5: Update all type annotations in the redaction capability module to use
modern Python typing syntax and remove unnecessary Any generics. Replace all
instances of Optional[...] with the pipe union syntax (X | None) throughout the
file on lines 5, 29-31, 47-50, 92-95, 145-148, 173-176. Additionally, update the
generic type parameters for AbstractCapability and RunContext on lines 259-261
and 277-303 to remove Any and use empty brackets or appropriate type parameters
based on whether the capability has dependencies, ensuring all function
signatures and class definitions maintain consistent type contracts without
relying on Any for capabilities without dependencies.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6de4a328-dee3-40af-b542-60de7aa7e82d
📒 Files selected for processing (10)
src/models/config.pysrc/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.py
📜 Review details
⏰ Context from checks skipped due to timeout. (7)
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/models/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/__init__.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🔇 Additional comments (9)
src/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-2: LGTM!src/pydantic_ai_lightspeed/capabilities/redaction/__init__.py (2)
15-21: LGTM!
3-13: No action needed. The import paths are correct and consistent with the codebase conventions.modelsis a sibling package at thesrc/level, sofrom models.config importis the correct absolute form per the coding guidelines. The difference in style (short absolute form for external sibling packages vs. fully qualified form for same-package imports) is consistent throughout the codebase and does not indicate an error.> Likely an incorrect or invalid review comment.tests/unit/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-2: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.py (1)
1-2: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.py (1)
1-158: LGTM!src/pydantic_ai_lightspeed/capabilities/redaction/core.py (1)
1-56: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.py (1)
1-154: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.py (1)
1-317: LGTM!
|
/ok-to-test |
|
|
||
| from pydantic import BaseModel, ConfigDict | ||
|
|
||
| CompiledPatterns = list[tuple[Pattern[str], str]] |
There was a problem hiding this comment.
IMHO if you move this line into utils/types.py you might be able to reuse it. The exactly same type (anonymous) is used in models/config.py in this PR (at three lines at least)
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/models/config.py (1)
2223-2264: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winKeep
_compiled_patternssynchronized with post-init config changes.
rulesandcase_sensitiveare still mutable after construction, but_compiled_patternsis only populated once incompile_patterns(). Any later mutation silently leaves redaction running with stale regexes. Freeze the config inputs or recompile on assignment so runtime behavior matches the current model state.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/models/config.py` around lines 2223 - 2264, Keep the redaction config cache in sync with mutable fields: in the config model’s compile_patterns logic, _compiled_patterns is only built during validation, so later changes to rules or case_sensitive can leave stale regexes. Either make those inputs immutable after construction or add assignment-time revalidation/recompilation so _compiled_patterns is refreshed whenever RedactionRule entries or the global flag change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/models/config.py`:
- Around line 2223-2264: Keep the redaction config cache in sync with mutable
fields: in the config model’s compile_patterns logic, _compiled_patterns is only
built during validation, so later changes to rules or case_sensitive can leave
stale regexes. Either make those inputs immutable after construction or add
assignment-time revalidation/recompilation so _compiled_patterns is refreshed
whenever RedactionRule entries or the global flag change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f6cdb24b-7d0b-4055-a774-d5be0c1ba0f4
📒 Files selected for processing (3)
src/models/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/utils/types.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/utils/types.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/models/config.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
🧠 Learnings (3)
📚 Learning: 2026-06-24T13:45:37.249Z
Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.
Applied to files:
src/utils/types.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/models/config.py
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🔇 Additional comments (1)
src/utils/types.py (1)
8-10: 🎯 Functional CorrectnessNo action required: Syntax is compatible with declared toolchain.
The repository explicitly requires
requires-python = ">=3.12,<3.14"inpyproject.tomland CI workflows target Python 3.12/3.13. The use of the Python 3.12 nativetypealias syntax is fully compliant.
|
@arin-deloatch please rebase |
…capability. Use Optional for nullable returns and substitute Any with UserContent, ModelRequestPart, and ModelResponsePart for type-safe message handling.
2a59cef to
62a26e0
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/models/config.py`:
- Around line 2389-2415: The config validator in compile_patterns only compiles
rule.pattern and misses invalid replacement templates, so bad config is deferred
until substitution time. Extend compile_patterns to validate each
rule.replacement during validation by compiling and checking the replacement
template against the same regex context, and raise a ValueError with clear
rule-specific details when it is invalid. Keep the existing regex compilation
path intact and store only validated pattern/replacement pairs in
_compiled_patterns.
In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py`:
- Around line 29-40: Update the docstrings in _redact_string_content and the
other redaction helpers/hooks to use the repository’s required parameter header
format by replacing each Args: section with Parameters:. Keep the existing
descriptions intact and make the same docstring convention change consistently
across the referenced helper methods in capability.py so the documented
functions match the project standard.
- Around line 23-26: Update the redaction capability to import CompiledPatterns
from utils.types instead of relying on the incidental export from
redaction.core; keep redact_text imported from redaction.core, and mirror the
same import source change in the redaction tests so they no longer depend on
core.py’s internal annotations.
In `@src/pydantic_ai_lightspeed/capabilities/redaction/core.py`:
- Around line 28-41: The docstring for the redaction function currently uses the
wrong argument-section header. Update the docstring on the redact/apply function
in redaction/core.py to use the repository’s required `Parameters:` header
instead of `Args:`, while keeping the existing parameter descriptions and return
text unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: dd1e6d53-c03a-41a5-97a0-aeda374af0e6
📒 Files selected for processing (9)
src/models/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/utils/types.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.py
📜 Review details
⏰ Context from checks skipped due to timeout. (4)
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/utils/types.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/models/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/redaction/__init__.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
🧠 Learnings (3)
📚 Learning: 2026-06-24T13:45:37.249Z
Learnt from: Jdubrick
Repo: lightspeed-core/lightspeed-stack PR: 1971
File: src/utils/markdown_repair.py:31-36
Timestamp: 2026-06-24T13:45:37.249Z
Learning: In the lightspeed-stack repository, docstrings must use the section header name "Parameters:" (not "Args:") for function arguments, even if the project references Google Python docstring conventions. Ensure docstrings follow the project’s established "Parameters:" header format for any documented function parameters.
Applied to files:
tests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/utils/types.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pysrc/models/config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.py
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🪛 ast-grep (0.44.0)
src/models/config.py
[warning] 2410-2410: Regex pattern passed to re is built from a non-literal (variable, call, concatenation, or f-string) value. If that value is attacker-controlled it can introduce a malicious pattern with catastrophic backtracking (ReDoS). Use a hardcoded literal pattern, or validate/escape untrusted input with re.escape() and bound the regex complexity before compiling.
Context: re.compile(rule.pattern, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity.
(redos-non-literal-regex-python)
🔇 Additional comments (2)
src/pydantic_ai_lightspeed/capabilities/redaction/__init__.py (1)
1-21: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.py (1)
1-1: LGTM!
| @model_validator(mode="after") | ||
| def compile_patterns(self) -> Self: | ||
| """Compile regex patterns and reject invalid ones. | ||
|
|
||
| Per-rule ``case_sensitive`` overrides the global flag when set. | ||
|
|
||
| Raises: | ||
| ValueError: If any rule contains an invalid regex pattern. | ||
|
|
||
| Returns: | ||
| The validated configuration instance. | ||
| """ | ||
| global_case_sensitive = self.case_sensitive | ||
| compiled: CompiledPatterns = [] | ||
| for rule in self.rules: | ||
| effective = ( | ||
| rule.case_sensitive | ||
| if rule.case_sensitive is not None | ||
| else global_case_sensitive | ||
| ) | ||
| flags = 0 if effective else re.IGNORECASE | ||
| try: | ||
| pattern = re.compile(rule.pattern, flags) | ||
| except re.error as e: | ||
| raise ValueError(f"Invalid regex pattern: {rule.pattern}: {e}") from e | ||
| compiled.append((pattern, rule.replacement)) | ||
| self._compiled_patterns = compiled |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Fail fast on invalid replacement templates.
This validator only rejects bad regex patterns. Invalid replacement strings still get stored and will surface later when the redaction engine calls regex substitution, turning a bad config into a request-time exception instead of a startup-time validation failure.
Suggested fix
`@model_validator`(mode="after")
def compile_patterns(self) -> Self:
"""Compile regex patterns and reject invalid ones.
@@
flags = 0 if effective else re.IGNORECASE
try:
pattern = re.compile(rule.pattern, flags)
+ # Validate the replacement template too, so misconfigured
+ # backreferences/escapes fail during config load.
+ pattern.sub(rule.replacement, "")
except re.error as e:
- raise ValueError(f"Invalid regex pattern: {rule.pattern}: {e}") from e
+ raise ValueError(
+ f"Invalid redaction rule for pattern {rule.pattern!r}: {e}"
+ ) from e
compiled.append((pattern, rule.replacement))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @model_validator(mode="after") | |
| def compile_patterns(self) -> Self: | |
| """Compile regex patterns and reject invalid ones. | |
| Per-rule ``case_sensitive`` overrides the global flag when set. | |
| Raises: | |
| ValueError: If any rule contains an invalid regex pattern. | |
| Returns: | |
| The validated configuration instance. | |
| """ | |
| global_case_sensitive = self.case_sensitive | |
| compiled: CompiledPatterns = [] | |
| for rule in self.rules: | |
| effective = ( | |
| rule.case_sensitive | |
| if rule.case_sensitive is not None | |
| else global_case_sensitive | |
| ) | |
| flags = 0 if effective else re.IGNORECASE | |
| try: | |
| pattern = re.compile(rule.pattern, flags) | |
| except re.error as e: | |
| raise ValueError(f"Invalid regex pattern: {rule.pattern}: {e}") from e | |
| compiled.append((pattern, rule.replacement)) | |
| self._compiled_patterns = compiled | |
| `@model_validator`(mode="after") | |
| def compile_patterns(self) -> Self: | |
| """Compile regex patterns and reject invalid ones. | |
| Per-rule ``case_sensitive`` overrides the global flag when set. | |
| Raises: | |
| ValueError: If any rule contains an invalid regex pattern. | |
| Returns: | |
| The validated configuration instance. | |
| """ | |
| global_case_sensitive = self.case_sensitive | |
| compiled: CompiledPatterns = [] | |
| for rule in self.rules: | |
| effective = ( | |
| rule.case_sensitive | |
| if rule.case_sensitive is not None | |
| else global_case_sensitive | |
| ) | |
| flags = 0 if effective else re.IGNORECASE | |
| try: | |
| pattern = re.compile(rule.pattern, flags) | |
| # Validate the replacement template too, so misconfigured | |
| # backreferences/escapes fail during config load. | |
| pattern.sub(rule.replacement, "") | |
| except re.error as e: | |
| raise ValueError( | |
| f"Invalid redaction rule for pattern {rule.pattern!r}: {e}" | |
| ) from e | |
| compiled.append((pattern, rule.replacement)) |
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 2410-2410: Regex pattern passed to re is built from a non-literal (variable, call, concatenation, or f-string) value. If that value is attacker-controlled it can introduce a malicious pattern with catastrophic backtracking (ReDoS). Use a hardcoded literal pattern, or validate/escape untrusted input with re.escape() and bound the regex complexity before compiling.
Context: re.compile(rule.pattern, flags)
Note: [CWE-1333] Inefficient Regular Expression Complexity.
(redos-non-literal-regex-python)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/models/config.py` around lines 2389 - 2415, The config validator in
compile_patterns only compiles rule.pattern and misses invalid replacement
templates, so bad config is deferred until substitution time. Extend
compile_patterns to validate each rule.replacement during validation by
compiling and checking the replacement template against the same regex context,
and raise a ValueError with clear rule-specific details when it is invalid. Keep
the existing regex compilation path intact and store only validated
pattern/replacement pairs in _compiled_patterns.
| from pydantic_ai_lightspeed.capabilities.redaction.core import ( | ||
| CompiledPatterns, | ||
| redact_text, | ||
| ) |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Import CompiledPatterns from utils.types directly.
core.py only exposes this name incidentally because it imports the shared alias for its own annotations. Pulling the type from there makes this module depend on an accidental re-export, so a future cleanup in core.py would break this module and the related redaction tests. Import the shared alias from utils.types here and mirror that change in the tests.
Suggested change
-from pydantic_ai_lightspeed.capabilities.redaction.core import (
- CompiledPatterns,
- redact_text,
-)
+from pydantic_ai_lightspeed.capabilities.redaction.core import redact_text
+from utils.types import CompiledPatterns📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from pydantic_ai_lightspeed.capabilities.redaction.core import ( | |
| CompiledPatterns, | |
| redact_text, | |
| ) | |
| from pydantic_ai_lightspeed.capabilities.redaction.core import redact_text | |
| from utils.types import CompiledPatterns |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py` around lines
23 - 26, Update the redaction capability to import CompiledPatterns from
utils.types instead of relying on the incidental export from redaction.core;
keep redact_text imported from redaction.core, and mirror the same import source
change in the redaction tests so they no longer depend on core.py’s internal
annotations.
| def _redact_string_content( | ||
| text: str, compiled_patterns: CompiledPatterns | ||
| ) -> Optional[str]: | ||
| """Redact PII from a string and return the redacted version if changed. | ||
|
|
||
| Args: | ||
| text: The string to redact. | ||
| compiled_patterns: Pre-compiled (pattern, replacement) pairs. | ||
|
|
||
| Returns: | ||
| The redacted string if redaction occurred, None otherwise. | ||
| """ |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Switch these Args: sections to Parameters:.
The new helper and hook docstrings are consistent otherwise, but the argument header does not match the repo’s required format. As per coding guidelines, documented functions should follow the project docstring convention. Based on learnings, this repository uses Parameters: instead of Args: for function arguments.
Also applies to: 47-58, 65-76, 92-103, 117-132, 145-156, 173-184, 191-205, 225-239, 277-291, 300-317
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py` around lines
29 - 40, Update the docstrings in _redact_string_content and the other redaction
helpers/hooks to use the repository’s required parameter header format by
replacing each Args: section with Parameters:. Keep the existing descriptions
intact and make the same docstring convention change consistently across the
referenced helper methods in capability.py so the documented functions match the
project standard.
Sources: Coding guidelines, Learnings
| """Apply PII redaction rules to the given text. | ||
|
|
||
| Rules are applied sequentially in the order provided. Earlier rules | ||
| may affect later rule matches. | ||
|
|
||
| Args: | ||
| content: The text to redact. Not mutated. | ||
| compiled_patterns: Pre-compiled (pattern, replacement) pairs. | ||
|
|
||
| Returns: | ||
| A RedactionResult with the redacted content, a flag indicating | ||
| whether any substitution occurred, and the total substitution | ||
| count. | ||
| """ |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Use Parameters: instead of Args: in this docstring.
The function docstring is otherwise fine, but the argument section header does not match the repository’s required format. As per coding guidelines, documented functions should follow the project docstring convention. Based on learnings, this repository uses Parameters: rather than Args: for function arguments.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pydantic_ai_lightspeed/capabilities/redaction/core.py` around lines 28 -
41, The docstring for the redaction function currently uses the wrong
argument-section header. Update the docstring on the redact/apply function in
redaction/core.py to use the repository’s required `Parameters:` header instead
of `Args:`, while keeping the existing parameter descriptions and return text
unchanged.
Sources: Coding guidelines, Learnings
|
/ok-to-test |
Description
Add a regex-based PII redaction capability for pydantic-ai agents. This introduces:
core.py):redact_text()function and immutableRedactionResultmodel forsequential regex-based text substitution
config.py):RedactionRuleandRedactionConfigPydantic models withcompile-time pattern validation and per-rule/global case sensitivity controls
capability.py):PiiRedactionCapabilityintegrating with pydantic-ai'sAbstractCapabilityto redact user prompts before model requests and model response text beforereturning to the caller
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
uv run make format— passes, no reformatsuv run make verify— all linters pass (black, pylint 10/10, pyright 0 errors, ruff, pydocstyle,mypy, lint-openapi)
uv run pytest tests/unit/pydantic_ai_lightspeed/capabilities/ -v— 51/51 tests passcapability.py)Summary by CodeRabbit
New Features
Bug Fixes