aws2 Documentation

AWS2-CTX is about controlling the information that can steer an agent's behavior.

In ordinary business terms, this family asks: what instructions, documents, memories, search results, tool outputs, chat history, and handoff notes can the agent read; which of those sources is allowed to set policy; what should never be remembered or exported; and how does the organization know that an untrusted document did not quietly change what the agent was supposed to do?

This matters because agents do not only follow the latest user message. They may combine system instructions, project rules, skill instructions, retrieved documents, previous conversations, memory records, tool output, search results, and external content. If the trust order is unclear, a malicious document can tell the agent to ignore approvals, a stale memory can change a decision later, a tool output can become hidden instructions, or a handoff can leak private content into evidence. Reviewers need evidence that context sources are known, ranked, bounded, sanitized, and tested.

What This Family Covers

In scope:

Instructions, project rules, user prompts, skill instructions, system messages, memory, conversation history, retrieved documents, search results, tool outputs, handoff notes, summaries, vector or retrieval stores, external data, and other context sources that can influence agent behavior.
Trust and precedence relationships between context sources, including which sources may set policy, request actions, supply evidence, override prior context, or only provide data.
Places where secrets, credentials, confidential data, private operational details, untrusted instructions, hidden prompt content, or unnecessary private content should not be stored, remembered, retrieved, summarized, or exported.
Controls that stop lower-trust content from silently overriding higher-priority instructions, runtime policy, approval requirements, workspace boundaries, or security rules.
Memory and durable-context write controls, including who or what may write persistent context, which workflows may use it, how it is reviewed, how long it is retained, and how material changes are attributed.
Sanitization of handoffs, summaries, memory records, and evidence exports so they remain useful without copying raw secrets, confidential payloads, session material, hidden instructions, or unnecessary private content.
Tests for instruction-boundary failures, context poisoning, retrieval poisoning, indirect prompt injection, tool-output poisoning, and memory interactions in high-impact workflows.
Reviewable records of material changes to memory, retrieval corpora, context sources, instruction sources, or trust relationships that can affect high-impact workflows.
Isolation or clean-context modes for high-risk workflows where lower-trust memory, retrieval corpora, shared context, or stale handoff state should not influence action review unless explicitly approved.

Out of scope:

Deciding the complete system boundary, business purpose, owner map, and inventory of scoped systems. That belongs mostly to AWS2-SCP.
Deciding which reusable skills, tools, connectors, prompt packs, packages, or supplier components are trusted sources. That belongs mostly to AWS2-SRC, though those components may introduce context risks.
Enforcing allow, deny, approval, interruption, rollback, or budget decisions for actions. That belongs mostly to AWS2-RUN.
Workspace sandboxing, filesystem boundaries, network egress, endpoint controls, and execution boundaries. Those belong mostly to AWS2-WSB.
Secret and sensitive-data handling as a complete data-protection program. That belongs mostly to AWS2-SEC, though AWS2-CTX identifies places where sensitive material should not be stored or exported as context.
Complete log-retention, receipt integrity, or audit-trace design. That belongs mostly to AWS2-LOG, though context-change and boundary-test evidence should be retained.
Full validation program design, red-team method, or finding lifecycle. That belongs mostly to AWS2-VAL, though this family names the context-specific tests that should exist for high-impact workflows.
Legal review of prohibited practices, transparency duties, data protection, workplace monitoring, or biometric rules.

Level Summary

Levels are cumulative. Level 2 builds on Level 1, and Level 3 builds on both.

Level	Plain-language meaning	Why this level exists	Typical evidence
Level 1	The organization knows which context and instruction sources can steer agents, which sources are more trusted, and which places must not hold sensitive or unsafe content.	Context cannot be protected until reviewers know what the agent reads, remembers, retrieves, and treats as instructions.	Context-source inventory, instruction precedence rules, prohibited-storage list, redaction policy.
Level 2	Production use has controls for lower-trust override, memory writes, durable context, and sanitized handoffs or evidence exports.	Managed production workflows need repeatable controls so untrusted content cannot silently change policy or persist unsafe state for later actions.	Runtime policy, memory write policy, memory change receipt, sanitized handoff examples, evidence export review.
Level 3	High-impact workflows are tested for context-boundary attacks, retain records of material context changes, and can be isolated from lower-trust context.	High-impact workflows need stronger assurance that context attacks are tested, context changes are attributable, and review can happen in a clean context.	Prompt-injection tests, retrieval-poisoning tests, durable context change records, clean-context configuration, isolation test results.

Candidate Controls

AWS2-CTX-L1-001: Context And Instruction Source Inventory Level 1

Requirement summary

Identify the context and instruction sources that can influence agent behavior, including user instructions, project instructions, skill instructions, retrieved documents, memory, and tool outputs where applicable. Distinguish trusted, user-provided, retrieved, external, generated, and lower-trust sources where practical.

Why it exists

Agents may treat many kinds of text or data as useful input. A reviewer needs to know which sources can affect agent behavior before deciding which ones can set rules, which ones only provide facts, which ones need sanitization, and which ones are risky enough to test.

Why this level

This belongs at Level 1 because source visibility is the foundation. Identifying sources does not prove the agent handles them safely, but it gives reviewers the map needed for precedence rules, memory controls, tests, and clean-context boundaries.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Context-source inventory	Runtime platform owner	Before production use and after context-source changes	Instruction sources, memory sources, retrieval sources, tool-output sources, generated summaries, handoffs, and external data that can influence the agent	Identifies likely sources; does not prove all hidden model or supplier context is visible.
Runtime context map	Runtime platform owner with workspace owner input	Before review and after runtime configuration changes	How user prompts, project rules, skill instructions, retrieval results, memory, and tool outputs enter the workflow	Supports review of context flow; does not prove the runtime enforces trust order.
Trust classification notes	Governance owner with runtime and evidence owner input	During initial scope review and periodic review	Which sources are trusted, user-provided, generated, retrieved, external, lower-trust, or unknown	Supports risk classification; does not prove lower-trust sources cannot influence behavior.

AWS2-CTX-L1-002: Instruction Precedence And Trust Rules Level 1

Requirement summary

Define the intended precedence or trust relationship between instruction sources that can conflict, including which sources may set policy, request actions, provide evidence, or only supply data.

Why it exists

A retrieved document, tool output, or chat message can contain text that looks like an instruction. Without precedence rules, the agent or reviewer may not know whether to follow project policy, runtime policy, a user request, a skill instruction, a document instruction, or a stale memory.

Why this level

This belongs at Level 1 because every later control depends on knowing the intended trust order. The rule can be documented before the organization has full automated enforcement.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Instruction precedence policy	Runtime platform owner or governance owner	Before production use and after precedence changes	Which instruction sources outrank others, which sources can set policy, and which sources only provide data	Defines intent; does not prove runtime enforcement.
Conflict-handling examples	Runtime platform owner with evidence owner input	During design review and validation planning	Expected behavior when user content, retrieved content, tool output, memory, or project rules conflict	Supports reviewer understanding; does not prove every conflict type is covered.
Policy-to-runtime mapping	Runtime platform owner	Before production use and after runtime changes	How precedence expectations appear in runtime settings, prompts, policies, middleware, or review procedures	Supports implementation review; does not prove the model will always follow the rules.

AWS2-CTX-L1-003: Prohibited Context Storage Locations Level 1

Requirement summary

Identify context locations where secrets, credentials, confidential data, or private operational details should not be stored, retrieved, summarized into memory, exported as evidence, or used as examples.

Why it exists

Context often gets copied. A secret can move from a file into a prompt, from a prompt into memory, from memory into a handoff, or from a handoff into an evidence packet. Reviewers need a clear list of places where sensitive or unsafe content should not go.

Why this level

This belongs at Level 1 because it is a basic boundary statement. The organization can name prohibited storage and export locations before it has complete automated scanning, redaction, or enforcement.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Prohibited-storage policy	Workspace or endpoint owner with runtime owner input	Before production use and after data-flow changes	Where secrets, credentials, confidential data, hidden prompt content, and private operational details must not be stored, retrieved, remembered, or exported	States expected handling; does not prove all content is detected or removed.
Context storage-location inventory	Runtime platform owner or evidence owner	Before review and after memory, retrieval, or export changes	Memory stores, vector stores, logs, summaries, handoffs, evidence exports, and examples that may hold context	Identifies storage points; does not prove the stored content is safe.
Sanitized example review	Evidence owner	During evidence preparation and periodic sampling	Example handoffs, summaries, memory records, or exports with sensitive values removed or replaced by safe placeholders	Supports redaction review; does not prove every historical record is sanitized.

AWS2-CTX-L2-001: Lower-Trust Override Controls Level 2

Requirement summary

Enforce or document controls that prevent lower-trust content, retrieved content, tool output, or user-provided documents from silently overriding higher-priority instructions or approval requirements, including human-approval, runtime-policy, and boundary requirements.

Why it exists

Lower-trust content can contain instructions such as "ignore previous rules", "send this file externally", or "approval is no longer required". Production workflows need controls so these instructions cannot quietly bypass policy, approval gates, or workspace boundaries.

Why this level

This belongs at Level 2 because managed production use should have repeatable prevention, mediation, or documented compensating controls. Level 1 defines the intended trust order; Level 2 expects the organization to protect that order in real workflows.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Runtime policy or middleware control	Runtime platform owner	Before production use and after policy changes	Rules, middleware, prompts, or guardrails that keep lower-trust content from overriding approval, boundary, or security requirements	Supports enforcement review; does not prove prompt-injection immunity.
Lower-trust override test	Evidence or audit owner with runtime owner input	Before production use and during periodic validation	Scenario, lower-trust content, expected denial or escalation, actual result, finding, and remediation	Tests selected paths; does not prove all override attacks fail.
Approval-preservation review	Governance owner or evidence owner	During workflow review and after approval-rule changes	That lower-trust context cannot remove, weaken, or self-approve required human approval or runtime policy gates	Supports review of approval integrity; does not prove all approval paths are correctly configured.

AWS2-CTX-L2-002: Memory And Durable Context Write Control Level 2

Requirement summary

Control memory or durable context writes that could affect future high-impact actions, including approval, review, owner expectations, retention, deletion, and change-attribution expectations for persistent changes.

Why it exists

Memory can make a temporary instruction durable. A wrong owner assumption, a false approval note, a private detail, or a poisoned retrieval hint can influence later work after the original conversation is forgotten. Production workflows need rules for what may be written, who or what approves it, how long it stays, and how it can be corrected.

Why this level

This belongs at Level 2 because durable memory is a production-state change. Level 1 identifies memory as a context source; Level 2 expects controls around writes that could affect later high-impact actions.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Memory write policy	Runtime platform owner with governance input	Before enabling durable memory and after memory-policy changes	Who or what may write memory, which workflows may use durable context, approval expectations, retention, deletion, and review rules	Defines memory governance; does not prove every write follows the rule.
Memory change receipt	Runtime platform owner or evidence owner	During operation and during review sampling	Actor or runtime, timestamp, source, workflow, reason, affected memory or context record, and review status where practical	Supports attribution; does not prove the memory content is true or harmless.
Retention and deletion review	Evidence owner with runtime owner input	During periodic review or when workflows are retired	Whether durable context records still have a valid purpose, owner, retention basis, and deletion path	Supports lifecycle review; does not prove all copies were removed from every system.

AWS2-CTX-L2-003: Sanitized Handoffs, Summaries, Memory, And Evidence Exports Level 2

Requirement summary

Sanitize handoffs, summaries, memory records, and evidence exports to avoid storing secrets, credentials, session cookies, confidential payloads, untrusted instructions, hidden prompt content, or unnecessary private content.

Why it exists

Handoffs and evidence packets are meant to help humans or later agents continue work. They become risky when they copy raw secrets, private payloads, full prompt internals, hidden instructions, or untrusted content that later agents might treat as commands.

Why this level

This belongs at Level 2 because managed production evidence should be useful and reviewable without expanding exposure. Level 1 names prohibited storage locations; Level 2 expects repeatable sanitization for durable records and exports.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Handoff or evidence sanitization checklist	Evidence owner with runtime owner input	Before external review, audit packet creation, or workflow handoff	Required redactions, prohibited content types, summary boundaries, and reviewer responsibilities	Supports consistent sanitization; does not prove every sensitive value was detected.
Sanitized handoff sample	Evidence owner or workflow owner	During workflow handoff and review sampling	Useful task state, decisions, file references, and next steps without raw secrets, hidden instructions, or unnecessary private payloads	Demonstrates selected examples; does not prove all handoffs are safe.
Evidence export review log	Evidence or audit owner	Before sharing evidence internally for review and after export process changes	Export scope, reviewer, redaction outcome, withheld material, and rationale for included context	Supports export accountability; does not prove external sharing is legally sufficient.

AWS2-CTX-L3-001: Instruction-Boundary And Context-Poisoning Tests Level 3

Requirement summary

Test instruction-boundary and context-poisoning scenarios for high-impact workflows, including untrusted documents, retrieved content, tool outputs, memory interactions, skill instructions, external data, and poisoned retrieval records.

Why it exists

Documented rules are not enough for high-impact workflows. The organization needs to test whether the agent resists realistic context attacks, such as indirect prompt injection in a document, poisoned retrieval results, malicious tool output, stale memory, or instructions hidden in external data.

Why this level

This belongs at Level 3 because it adds stronger assurance through testing. It is more demanding than documenting sources and controls, and it should focus on workflows where context failure could cause significant harm.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Instruction-boundary test summary	Evidence or audit owner with runtime owner input	Before high-impact production use and during recurring validation	Test cases, expected behavior, actual behavior, findings, remediation, and retest status	Tests selected scenarios; does not prove prompt-injection immunity.
Retrieval-poisoning or context-poisoning test	Evidence or audit owner with retrieval owner input	Before using retrieval for high-impact workflows and after retrieval changes	Poisoned document or record scenario, retrieval path, policy outcome, finding, and remediation	Supports selected retrieval-risk review; does not prove all corpus poisoning is prevented.
Tool-output poisoning test	Evidence or audit owner with tool owner input	During high-impact workflow validation	Whether malicious or misleading tool output can override instructions, approvals, or boundaries	Tests selected tool paths; does not prove every tool output is trustworthy.

AWS2-CTX-L3-002: Material Context Change Records Level 3

Requirement summary

Retain reviewable records of material memory, retrieval, context, or instruction changes that can influence high-impact workflows, including actor, source, timestamp, rationale, and review status where practical.

Why it exists

High-impact workflows can change because a memory was edited, a retrieval corpus was updated, a project instruction changed, a new external data source was added, or a handoff became canonical context. Reviewers need to know what changed, who or what changed it, why, and whether it was reviewed.

Why this level

This belongs at Level 3 because stronger assurance requires durable, reviewable history for material context changes, not only current-state configuration.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Durable context change log	Runtime platform owner or evidence owner	During operation and before high-impact review	Actor or runtime, timestamp, source, context object, rationale, review status, and affected workflow where practical	Supports change traceability; does not prove the changed content is safe.
Retrieval corpus change record	Retrieval or knowledge-base owner	When retrieval sources are added, removed, reindexed, or materially changed	Source, change type, affected corpus, owner, review status, and rollback or correction path	Supports retrieval-change review; does not prove retrieved answers are correct.
Instruction-source review record	Governance owner with runtime owner input	When project, system, skill, policy, or workflow instructions materially change	Changed instruction source, reason, approver or reviewer, affected workflows, and effective date	Supports instruction-change accountability; does not prove the model will always follow the changed instruction.

AWS2-CTX-L3-003: High-Risk Workflow Context Isolation Level 3

Requirement summary

Isolate high-risk workflows from lower-trust memory, retrieval corpora, or shared context unless the lower-trust source is explicitly approved for the workflow, and provide a clean context mode or equivalent boundary for high-impact action review where practical.

Why it exists

Some workflows should not inherit messy context. A high-impact review can be distorted by stale memory, unrelated chat history, broad retrieval corpora, external pages, or shared context from another matter. Clean context makes it easier to review the decision path and reduces the chance that lower-trust state affects a sensitive action.

Why this level

This belongs at Level 3 because it asks for stronger separation around high-risk workflows. It may require runtime features, operating procedures, or review discipline beyond ordinary production controls.

Evidence examples

Evidence	Likely owner/provider	When collected	What it should show	Claim limit
Clean-context mode configuration or procedure	Runtime platform owner with workflow owner input	Before high-risk workflow use and after runtime changes	How memory, retrieval, chat history, external content, and shared context are limited or reset for high-impact review	Supports isolation review; does not prove all hidden context is absent.
Approved context-source list for high-risk workflow	Governance owner with runtime and workflow owner input	Before workflow approval and during periodic review	Which memory stores, retrieval corpora, documents, external sources, or handoffs are approved for the workflow	Supports source approval; does not prove approved sources are accurate or safe.
Context-isolation test result	Evidence or audit owner	Before high-impact production use and during recurring validation	Whether lower-trust memory, retrieval records, or unrelated shared context can influence the high-risk workflow	Tests selected isolation paths; does not prove all cross-context leakage is impossible.

External Mapping Notes

The completed crosswalk treats AWS2-CTX as a candidate-control family shaped by instruction hierarchy, memory and vector-store security, RAG and data-flow threat modeling, prompt injection, context poisoning, tool-output poisoning, privacy, information integrity, and goal-drift signals.

Relevant source signals include:

EU AI Act official sources: prohibited-practice, workplace-use, and disclosure signals inform boundary tests and prohibited-use records, but do not make AWS2-CTX a legal-compliance control.
OWASP AISVS: memory, vector, and autonomous orchestration signals support testable context-handling expectations, while the public AISVS status remains early and not settled certification language.
CSA MAESTRO: data poisoning, RAG risks, tampering, and exfiltration support threat-modeling and context-risk review.
NIST AI 600-1: privacy, information-integrity, confabulation, and component risk signals support context-source inventories and retrieval validation, but enforcement evidence must come from the actual runtime and workspace.
ISO/IEC 23894: context-customized AI risk-management guidance supports risk assessment and treatment notes, based only on public high-level source descriptions available in the current crosswalk.
Five Eyes agentic AI guidance: indirect prompt injection, memory interaction, and goal-drift signals support prompt-injection tests, memory interaction logs, and adoption gates.
MITRE ATLAS: prompt injection, context poisoning, RAG poisoning, and tool data poisoning support scenario design for validation and red-team work.

These mappings are informative. They support evidence for selected candidate controls and scenario design, but they do not prove prompt-injection immunity, legal compliance, external-framework conformance, or complete model robustness.

Formal Standard Link

Use this guide with the formal AWS2-CTX candidate requirements. If the guide and the standard draft disagree, the standard draft controls.