-
-
Notifications
You must be signed in to change notification settings - Fork 268
Update ASI06_Memory_and_Context_Poisoning .md #718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Josh-Beck
wants to merge
6
commits into
OWASP:main
Choose a base branch
from
Josh-Beck:patch-1
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 1 commit
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
fc1a238
Update ASI06_Memory_and_Context_Poisoning .md
Josh-Beck 41c9683
Update ASI06_Memory_and_Context_Poisoning .md
Josh-Beck bce1ce7
added T&M
Josh-Beck f46d6c6
Update ASI06_Memory_and_Context_Poisoning .md
Josh-Beck 6103889
Update ASI06_Memory_and_Context_Poisoning .md
Josh-Beck 3c67e6a
Update ASI06_Memory_and_Context_Poisoning .md
Josh-Beck File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
53 changes: 37 additions & 16 deletions
53
...-10/Sprint 1-first-public-draft-expanded/ASI06_Memory_and_Context_Poisoning .md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,28 +1,49 @@ | ||
| ## ASI06 – Memory & Context Poisoning | ||
| ## ASI06 - Memory & Context Poisoning | ||
|
|
||
| **Description:** | ||
|
|
||
| A brief description of the vulnerability that includes its potential effects such as system compromises, data breaches, or other security concerns. | ||
| **Description:** | ||
| LLM systems can be augmented with memory systems to improve performance. These systems, which are used to improve an LLM's performance, can include external memory stores like Retrieval-Augmented Generation (RAG) vector databases or extended context windows that hold user session data. These systems can be manipulated by adversaries, who can add malicious or misleading data to the agent's memory stores, resulting in latent security vulnerabilities. Since memory entries may influence agent behavior in subsequent runs or when accessed by other agents, memory poisoning introduces systemic risk. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| **Common Examples of Vulnerability:** | ||
|
|
||
| 1. Example 1: Specific instance or type of this vulnerability. | ||
| 2. Example 2: Another instance or type of this vulnerability. | ||
| 3. Example 3: Yet another instance or type of this vulnerability. | ||
| 1. RAG Poisoning: RAG Poisoning occurs when data which shouldn’t be used for desired actions or queries is inserted into the vector database. This could happen via: | ||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| - Poisoning input data streams, such as creating false/misleading information in an online wiki. If that information is scraped and collected into the RAG system, the poisoned data would be included. | ||
| - Maliciously uploading files into the vector database directly. This could occur if attackers gain undesired access to the database. | ||
| - Providing excessive trust to input data pipelines. This could look like user documents being uploaded as part of normal usage. If user documents are not properly scanned or sanitized, attackers could choose to upload malicious documents which wouldn’t be caught and removed. | ||
|
|
||
| RAG poisoning has wide reaching impacts, from delivering false information with a higher degree of authority due to having resources to back up claims, to attacking individual users’ LLM interactions with specifically crafted payloads to target their unique exchanges. | ||
|
|
||
| 2. Shared User Context Poisoning: Any LLM system which saves contexts between runs, or uses a shared context window for multiple user interactions, can fall victim to Shared User Context Poisoning. This attack is very straightforward, letting attackers influence the behavior of an LLM during their session, and that influence leaking into subsequent sessions. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| This attack type has far-reaching consequences depending on how the LLM system is connected, such as poor output performance or spreading misinformation if subsequent users are directly chatting with an LLM, to code execution attacks if the LLM is being used inside of a tool calling or code writing system. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| 3. Systemic Misalignment and Backdoors: Memory poisoning can have more subtle and severe consequences than simply producing wrong results. A poisoned LLM can take on a new, malicious persona, deviating from its intended purpose. Attackers can also use this technique to install a backdoor, such as a secret instruction that remains inactive until a certain trigger phrase is entered. When the LLM encounters this sentence, it carries out the disguised malicious instructions, such as producing destructive code or transmitting sensitive data. | ||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| 4. Cascading failures and data exfiltration: A single poisoned memory entry in a sophisticated, multi-agent system (MAS) might have a domino effect, resulting in cascading failure. One agent may retrieve damaged data and then share it with others, leading the system to become unstable. Malicious instructions can also be placed in the memory as persistence instructions, allowing the LLM to access and communicate sensitive user or enterprise data to an attacker. This data exfiltration poses a significant risk since the model might be allowed valid access to data repositories but then altered to use that access maliciously. | ||
|
|
||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| **How to Prevent:** | ||
|
|
||
| 1. Prevention Step 1: A step or strategy that can be used to prevent the vulnerability or mitigate its effects. | ||
| 2. Prevention Step 2: Another prevention step or strategy. | ||
| 3. Prevention Step 3: Yet another prevention step or strategy. | ||
| **how to Prevent:** | ||
|
|
||
| **Example Attack Scenarios:** | ||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| Preventing ASI06 requires a multi-layered approach to secure and validate an LLM's memory. Key strategies include: | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| Scenario #1: A detailed scenario illustrating how an attacker could potentially exploit this vulnerability, including the attacker's actions and the potential outcomes. | ||
| **Content Validation:** Scan all new memory insertions for anomalies or malicious content before they are committed. | ||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| * Use AI-based scanners like Microsoft Presidio for PII detection and input sanitization. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| * Leverage adversarial testing frameworks such as PyRIT and Garak. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| Scenario #2: Another example of an attack scenario showing a different way the vulnerability could be exploited. | ||
| **Memory Segmentation:** Isolate memory access using session isolation to prevent "knowledge leakage" across different users. | ||
| **Access Control & Retention Policies:** | ||
| * Limit access to trusted sources only. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| * Apply context-aware policies so an agent only accesses memory relevant to its current task. | ||
| * Limit retention durations based on data sensitivity to reduce long-term risk. | ||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| * Implement provenance tracking (e.g., using TruLens or LangSmith traces) | ||
| **Knowledge Provenance & Anomaly Detection:** | ||
| * Require source attribution for all memory updates to trace where knowledge originated. | ||
| * Track AI knowledge lineage to understand how the memory evolved. | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| * Deploy anomaly detection to identify suspicious memory updates or abnormal update frequencies. | ||
| **Resilience & Verification:** | ||
| * Use rollback and snapshot mechanisms to revert to a previous state after an anomaly is detected. | ||
| * Implement probabilistic truth-checking to verify new knowledge against trusted, verified source (e.g., using tools as Google Fact Check API) | ||
Josh-Beck marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| * Use version control for memory updates to support auditing, rollback, and tamper detection. | ||
|
|
||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| **Reference Links:** | ||
Josh-Beck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| 1. [Link Title](URL): Brief description of the reference link. | ||
| 2. [Link Title](URL): Brief description of the reference link. | ||
| References | ||
| [PoisonedRAG](https://arxiv.org/pdf/2402.07867) | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.