Family Guides

AWS2-CTX: Context, Memory, And Instruction Boundary Control

AWS2-CTX is about controlling the information that can steer an agent's behavior.

In ordinary business terms, this family asks: what instructions, documents, memories, search results, tool outputs, chat history, and handoff notes can the agent read; which of those sources is allowed to set policy; what should never be remembered or exported; and how does the organization know that an untrusted document did not quietly change what the agent was supposed to do?

This matters because agents do not only follow the latest user message. They may combine system instructions, project rules, skill instructions, retrieved documents, previous conversations, memory records, tool output, search results, and external content. If the trust order is unclear, a malicious document can tell the agent to ignore approvals, a stale memory can change a decision later, a tool output can become hidden instructions, or a handoff can leak private content into evidence. Reviewers need evidence that context sources are known, ranked, bounded, sanitized, and tested.

What This Family Covers

In scope:

  • Instructions, project rules, user prompts, skill instructions, system messages, memory, conversation history, retrieved documents, search results, tool outputs, handoff notes, summaries, vector or retrieval stores, external data, and other context sources that can influence agent behavior.
  • Trust and precedence relationships between context sources, including which sources may set policy, request actions, supply evidence, override prior context, or only provide data.
  • Places where secrets, credentials, confidential data, private operational details, untrusted instructions, hidden prompt content, or unnecessary private content should not be stored, remembered, retrieved, summarized, or exported.
  • Controls that stop lower-trust content from silently overriding higher-priority instructions, runtime policy, approval requirements, workspace boundaries, or security rules.
  • Memory and durable-context write controls, including who or what may write persistent context, which workflows may use it, how it is reviewed, how long it is retained, and how material changes are attributed.
  • Sanitization of handoffs, summaries, memory records, and evidence exports so they remain useful without copying raw secrets, confidential payloads, session material, hidden instructions, or unnecessary private content.
  • Tests for instruction-boundary failures, context poisoning, retrieval poisoning, indirect prompt injection, tool-output poisoning, and memory interactions in high-impact workflows.
  • Reviewable records of material changes to memory, retrieval corpora, context sources, instruction sources, or trust relationships that can affect high-impact workflows.
  • Isolation or clean-context modes for high-risk workflows where lower-trust memory, retrieval corpora, shared context, or stale handoff state should not influence action review unless explicitly approved.

Out of scope:

  • Deciding the complete system boundary, business purpose, owner map, and inventory of scoped systems. That belongs mostly to AWS2-SCP.
  • Deciding which reusable skills, tools, connectors, prompt packs, packages, or supplier components are trusted sources. That belongs mostly to AWS2-SRC, though those components may introduce context risks.
  • Enforcing allow, deny, approval, interruption, rollback, or budget decisions for actions. That belongs mostly to AWS2-RUN.
  • Workspace sandboxing, filesystem boundaries, network egress, endpoint controls, and execution boundaries. Those belong mostly to AWS2-WSB.
  • Secret and sensitive-data handling as a complete data-protection program. That belongs mostly to AWS2-SEC, though AWS2-CTX identifies places where sensitive material should not be stored or exported as context.
  • Complete log-retention, receipt integrity, or audit-trace design. That belongs mostly to AWS2-LOG, though context-change and boundary-test evidence should be retained.
  • Full validation program design, red-team method, or finding lifecycle. That belongs mostly to AWS2-VAL, though this family names the context-specific tests that should exist for high-impact workflows.
  • Legal review of prohibited practices, transparency duties, data protection, workplace monitoring, or biometric rules.

Level Summary

Levels are cumulative. Level 2 builds on Level 1, and Level 3 builds on both.

LevelPlain-language meaningWhy this level existsTypical evidence
Level 1The organization knows which context and instruction sources can steer agents, which sources are more trusted, and which places must not hold sensitive or unsafe content.Context cannot be protected until reviewers know what the agent reads, remembers, retrieves, and treats as instructions.Context-source inventory, instruction precedence rules, prohibited-storage list, redaction policy.
Level 2Production use has controls for lower-trust override, memory writes, durable context, and sanitized handoffs or evidence exports.Managed production workflows need repeatable controls so untrusted content cannot silently change policy or persist unsafe state for later actions.Runtime policy, memory write policy, memory change receipt, sanitized handoff examples, evidence export review.
Level 3High-impact workflows are tested for context-boundary attacks, retain records of material context changes, and can be isolated from lower-trust context.High-impact workflows need stronger assurance that context attacks are tested, context changes are attributable, and review can happen in a clean context.Prompt-injection tests, retrieval-poisoning tests, durable context change records, clean-context configuration, isolation test results.

Candidate Controls

AWS2-CTX-L1-001: Context And Instruction Source Inventory Level 1

Requirement summary

Identify the context and instruction sources that can influence agent behavior, including user instructions, project instructions, skill instructions, retrieved documents, memory, and tool outputs where applicable. Distinguish trusted, user-provided, retrieved, external, generated, and lower-trust sources where practical.

Why it exists

Agents may treat many kinds of text or data as useful input. A reviewer needs to know which sources can affect agent behavior before deciding which ones can set rules, which ones only provide facts, which ones need sanitization, and which ones are risky enough to test.

Why this level

This belongs at Level 1 because source visibility is the foundation. Identifying sources does not prove the agent handles them safely, but it gives reviewers the map needed for precedence rules, memory controls, tests, and clean-context boundaries.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Context-source inventoryRuntime platform ownerBefore production use and after context-source changesInstruction sources, memory sources, retrieval sources, tool-output sources, generated summaries, handoffs, and external data that can influence the agentIdentifies likely sources; does not prove all hidden model or supplier context is visible.
Runtime context mapRuntime platform owner with workspace owner inputBefore review and after runtime configuration changesHow user prompts, project rules, skill instructions, retrieval results, memory, and tool outputs enter the workflowSupports review of context flow; does not prove the runtime enforces trust order.
Trust classification notesGovernance owner with runtime and evidence owner inputDuring initial scope review and periodic reviewWhich sources are trusted, user-provided, generated, retrieved, external, lower-trust, or unknownSupports risk classification; does not prove lower-trust sources cannot influence behavior.

AWS2-CTX-L1-002: Instruction Precedence And Trust Rules Level 1

Requirement summary

Define the intended precedence or trust relationship between instruction sources that can conflict, including which sources may set policy, request actions, provide evidence, or only supply data.

Why it exists

A retrieved document, tool output, or chat message can contain text that looks like an instruction. Without precedence rules, the agent or reviewer may not know whether to follow project policy, runtime policy, a user request, a skill instruction, a document instruction, or a stale memory.

Why this level

This belongs at Level 1 because every later control depends on knowing the intended trust order. The rule can be documented before the organization has full automated enforcement.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Instruction precedence policyRuntime platform owner or governance ownerBefore production use and after precedence changesWhich instruction sources outrank others, which sources can set policy, and which sources only provide dataDefines intent; does not prove runtime enforcement.
Conflict-handling examplesRuntime platform owner with evidence owner inputDuring design review and validation planningExpected behavior when user content, retrieved content, tool output, memory, or project rules conflictSupports reviewer understanding; does not prove every conflict type is covered.
Policy-to-runtime mappingRuntime platform ownerBefore production use and after runtime changesHow precedence expectations appear in runtime settings, prompts, policies, middleware, or review proceduresSupports implementation review; does not prove the model will always follow the rules.

AWS2-CTX-L1-003: Prohibited Context Storage Locations Level 1

Requirement summary

Identify context locations where secrets, credentials, confidential data, or private operational details should not be stored, retrieved, summarized into memory, exported as evidence, or used as examples.

Why it exists

Context often gets copied. A secret can move from a file into a prompt, from a prompt into memory, from memory into a handoff, or from a handoff into an evidence packet. Reviewers need a clear list of places where sensitive or unsafe content should not go.

Why this level

This belongs at Level 1 because it is a basic boundary statement. The organization can name prohibited storage and export locations before it has complete automated scanning, redaction, or enforcement.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Prohibited-storage policyWorkspace or endpoint owner with runtime owner inputBefore production use and after data-flow changesWhere secrets, credentials, confidential data, hidden prompt content, and private operational details must not be stored, retrieved, remembered, or exportedStates expected handling; does not prove all content is detected or removed.
Context storage-location inventoryRuntime platform owner or evidence ownerBefore review and after memory, retrieval, or export changesMemory stores, vector stores, logs, summaries, handoffs, evidence exports, and examples that may hold contextIdentifies storage points; does not prove the stored content is safe.
Sanitized example reviewEvidence ownerDuring evidence preparation and periodic samplingExample handoffs, summaries, memory records, or exports with sensitive values removed or replaced by safe placeholdersSupports redaction review; does not prove every historical record is sanitized.

AWS2-CTX-L2-001: Lower-Trust Override Controls Level 2

Requirement summary

Enforce or document controls that prevent lower-trust content, retrieved content, tool output, or user-provided documents from silently overriding higher-priority instructions or approval requirements, including human-approval, runtime-policy, and boundary requirements.

Why it exists

Lower-trust content can contain instructions such as "ignore previous rules", "send this file externally", or "approval is no longer required". Production workflows need controls so these instructions cannot quietly bypass policy, approval gates, or workspace boundaries.

Why this level

This belongs at Level 2 because managed production use should have repeatable prevention, mediation, or documented compensating controls. Level 1 defines the intended trust order; Level 2 expects the organization to protect that order in real workflows.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Runtime policy or middleware controlRuntime platform ownerBefore production use and after policy changesRules, middleware, prompts, or guardrails that keep lower-trust content from overriding approval, boundary, or security requirementsSupports enforcement review; does not prove prompt-injection immunity.
Lower-trust override testEvidence or audit owner with runtime owner inputBefore production use and during periodic validationScenario, lower-trust content, expected denial or escalation, actual result, finding, and remediationTests selected paths; does not prove all override attacks fail.
Approval-preservation reviewGovernance owner or evidence ownerDuring workflow review and after approval-rule changesThat lower-trust context cannot remove, weaken, or self-approve required human approval or runtime policy gatesSupports review of approval integrity; does not prove all approval paths are correctly configured.

AWS2-CTX-L2-002: Memory And Durable Context Write Control Level 2

Requirement summary

Control memory or durable context writes that could affect future high-impact actions, including approval, review, owner expectations, retention, deletion, and change-attribution expectations for persistent changes.

Why it exists

Memory can make a temporary instruction durable. A wrong owner assumption, a false approval note, a private detail, or a poisoned retrieval hint can influence later work after the original conversation is forgotten. Production workflows need rules for what may be written, who or what approves it, how long it stays, and how it can be corrected.

Why this level

This belongs at Level 2 because durable memory is a production-state change. Level 1 identifies memory as a context source; Level 2 expects controls around writes that could affect later high-impact actions.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Memory write policyRuntime platform owner with governance inputBefore enabling durable memory and after memory-policy changesWho or what may write memory, which workflows may use durable context, approval expectations, retention, deletion, and review rulesDefines memory governance; does not prove every write follows the rule.
Memory change receiptRuntime platform owner or evidence ownerDuring operation and during review samplingActor or runtime, timestamp, source, workflow, reason, affected memory or context record, and review status where practicalSupports attribution; does not prove the memory content is true or harmless.
Retention and deletion reviewEvidence owner with runtime owner inputDuring periodic review or when workflows are retiredWhether durable context records still have a valid purpose, owner, retention basis, and deletion pathSupports lifecycle review; does not prove all copies were removed from every system.

AWS2-CTX-L2-003: Sanitized Handoffs, Summaries, Memory, And Evidence Exports Level 2

Requirement summary

Sanitize handoffs, summaries, memory records, and evidence exports to avoid storing secrets, credentials, session cookies, confidential payloads, untrusted instructions, hidden prompt content, or unnecessary private content.

Why it exists

Handoffs and evidence packets are meant to help humans or later agents continue work. They become risky when they copy raw secrets, private payloads, full prompt internals, hidden instructions, or untrusted content that later agents might treat as commands.

Why this level

This belongs at Level 2 because managed production evidence should be useful and reviewable without expanding exposure. Level 1 names prohibited storage locations; Level 2 expects repeatable sanitization for durable records and exports.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Handoff or evidence sanitization checklistEvidence owner with runtime owner inputBefore external review, audit packet creation, or workflow handoffRequired redactions, prohibited content types, summary boundaries, and reviewer responsibilitiesSupports consistent sanitization; does not prove every sensitive value was detected.
Sanitized handoff sampleEvidence owner or workflow ownerDuring workflow handoff and review samplingUseful task state, decisions, file references, and next steps without raw secrets, hidden instructions, or unnecessary private payloadsDemonstrates selected examples; does not prove all handoffs are safe.
Evidence export review logEvidence or audit ownerBefore sharing evidence internally for review and after export process changesExport scope, reviewer, redaction outcome, withheld material, and rationale for included contextSupports export accountability; does not prove external sharing is legally sufficient.

AWS2-CTX-L3-001: Instruction-Boundary And Context-Poisoning Tests Level 3

Requirement summary

Test instruction-boundary and context-poisoning scenarios for high-impact workflows, including untrusted documents, retrieved content, tool outputs, memory interactions, skill instructions, external data, and poisoned retrieval records.

Why it exists

Documented rules are not enough for high-impact workflows. The organization needs to test whether the agent resists realistic context attacks, such as indirect prompt injection in a document, poisoned retrieval results, malicious tool output, stale memory, or instructions hidden in external data.

Why this level

This belongs at Level 3 because it adds stronger assurance through testing. It is more demanding than documenting sources and controls, and it should focus on workflows where context failure could cause significant harm.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Instruction-boundary test summaryEvidence or audit owner with runtime owner inputBefore high-impact production use and during recurring validationTest cases, expected behavior, actual behavior, findings, remediation, and retest statusTests selected scenarios; does not prove prompt-injection immunity.
Retrieval-poisoning or context-poisoning testEvidence or audit owner with retrieval owner inputBefore using retrieval for high-impact workflows and after retrieval changesPoisoned document or record scenario, retrieval path, policy outcome, finding, and remediationSupports selected retrieval-risk review; does not prove all corpus poisoning is prevented.
Tool-output poisoning testEvidence or audit owner with tool owner inputDuring high-impact workflow validationWhether malicious or misleading tool output can override instructions, approvals, or boundariesTests selected tool paths; does not prove every tool output is trustworthy.

AWS2-CTX-L3-002: Material Context Change Records Level 3

Requirement summary

Retain reviewable records of material memory, retrieval, context, or instruction changes that can influence high-impact workflows, including actor, source, timestamp, rationale, and review status where practical.

Why it exists

High-impact workflows can change because a memory was edited, a retrieval corpus was updated, a project instruction changed, a new external data source was added, or a handoff became canonical context. Reviewers need to know what changed, who or what changed it, why, and whether it was reviewed.

Why this level

This belongs at Level 3 because stronger assurance requires durable, reviewable history for material context changes, not only current-state configuration.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Durable context change logRuntime platform owner or evidence ownerDuring operation and before high-impact reviewActor or runtime, timestamp, source, context object, rationale, review status, and affected workflow where practicalSupports change traceability; does not prove the changed content is safe.
Retrieval corpus change recordRetrieval or knowledge-base ownerWhen retrieval sources are added, removed, reindexed, or materially changedSource, change type, affected corpus, owner, review status, and rollback or correction pathSupports retrieval-change review; does not prove retrieved answers are correct.
Instruction-source review recordGovernance owner with runtime owner inputWhen project, system, skill, policy, or workflow instructions materially changeChanged instruction source, reason, approver or reviewer, affected workflows, and effective dateSupports instruction-change accountability; does not prove the model will always follow the changed instruction.

AWS2-CTX-L3-003: High-Risk Workflow Context Isolation Level 3

Requirement summary

Isolate high-risk workflows from lower-trust memory, retrieval corpora, or shared context unless the lower-trust source is explicitly approved for the workflow, and provide a clean context mode or equivalent boundary for high-impact action review where practical.

Why it exists

Some workflows should not inherit messy context. A high-impact review can be distorted by stale memory, unrelated chat history, broad retrieval corpora, external pages, or shared context from another matter. Clean context makes it easier to review the decision path and reduces the chance that lower-trust state affects a sensitive action.

Why this level

This belongs at Level 3 because it asks for stronger separation around high-risk workflows. It may require runtime features, operating procedures, or review discipline beyond ordinary production controls.

Evidence examples

EvidenceLikely owner/providerWhen collectedWhat it should showClaim limit
Clean-context mode configuration or procedureRuntime platform owner with workflow owner inputBefore high-risk workflow use and after runtime changesHow memory, retrieval, chat history, external content, and shared context are limited or reset for high-impact reviewSupports isolation review; does not prove all hidden context is absent.
Approved context-source list for high-risk workflowGovernance owner with runtime and workflow owner inputBefore workflow approval and during periodic reviewWhich memory stores, retrieval corpora, documents, external sources, or handoffs are approved for the workflowSupports source approval; does not prove approved sources are accurate or safe.
Context-isolation test resultEvidence or audit ownerBefore high-impact production use and during recurring validationWhether lower-trust memory, retrieval records, or unrelated shared context can influence the high-risk workflowTests selected isolation paths; does not prove all cross-context leakage is impossible.

External Mapping Notes

The completed crosswalk treats AWS2-CTX as a candidate-control family shaped by instruction hierarchy, memory and vector-store security, RAG and data-flow threat modeling, prompt injection, context poisoning, tool-output poisoning, privacy, information integrity, and goal-drift signals.

Relevant source signals include:

  • EU AI Act official sources: prohibited-practice, workplace-use, and disclosure signals inform boundary tests and prohibited-use records, but do not make AWS2-CTX a legal-compliance control.
  • OWASP AISVS: memory, vector, and autonomous orchestration signals support testable context-handling expectations, while the public AISVS status remains early and not settled certification language.
  • CSA MAESTRO: data poisoning, RAG risks, tampering, and exfiltration support threat-modeling and context-risk review.
  • NIST AI 600-1: privacy, information-integrity, confabulation, and component risk signals support context-source inventories and retrieval validation, but enforcement evidence must come from the actual runtime and workspace.
  • ISO/IEC 23894: context-customized AI risk-management guidance supports risk assessment and treatment notes, based only on public high-level source descriptions available in the current crosswalk.
  • Five Eyes agentic AI guidance: indirect prompt injection, memory interaction, and goal-drift signals support prompt-injection tests, memory interaction logs, and adoption gates.
  • MITRE ATLAS: prompt injection, context poisoning, RAG poisoning, and tool data poisoning support scenario design for validation and red-team work.

These mappings are informative. They support evidence for selected candidate controls and scenario design, but they do not prove prompt-injection immunity, legal compliance, external-framework conformance, or complete model robustness.

Use this guide with the formal AWS2-CTX candidate requirements. If the guide and the standard draft disagree, the standard draft controls.