Memory
Persistent state across sessions: short-term, long-term, and semantic memory.
Memory — Overview
The memory pattern enables an agent to persist information across conversations, building context over time. Short-term memory maintains state within a session; long-term memory stores and retrieves information across sessions using external storage.
Evolves from: Prompt Chaining — adds conversation state management, summarization, and persistent retrieval.
Architecture
Figure: The agent reads from both short-term (session) and long-term (persistent) memory. When short-term memory approaches the context window limit, a summarizer compresses the older turns. Important information is written to long-term memory for future sessions.
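The compression step in the figure can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: `compress_history`, `rough_token_count`, and `fake_summarize` are hypothetical names, a word count stands in for a real tokenizer, and a real summarizer would call the LLM.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def compress_history(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    max_tokens: int,
    keep_recent: int = 4,
) -> List[str]:
    """Fold older messages into one summary once the history exceeds the budget."""
    total = sum(rough_token_count(m) for m in messages)
    if total <= max_tokens or len(messages) <= keep_recent:
        return messages  # still fits, or nothing old enough to fold
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    # The most recent turns stay verbatim; everything older becomes one entry.
    return ["[summary] " + summarize(older)] + recent


def fake_summarize(msgs: List[str]) -> str:
    # Hypothetical placeholder; a real summarizer would prompt the LLM.
    return f"{len(msgs)} earlier messages omitted"


history = [f"message number {i}" for i in range(10)]
compressed = compress_history(history, fake_summarize, max_tokens=12)
print(compressed[0])  # [summary] 6 earlier messages omitted
```

Keeping the last few turns verbatim is one possible design choice; it preserves the immediate conversational thread while only the older material is subject to lossy summarization.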
How It Works
- Receive input — A new user message arrives.
- Retrieve context — The agent queries long-term memory for relevant past information and reads recent short-term memory (conversation history).
- Reason and respond — The agent processes the input with the retrieved context and generates a response.
- Update short-term memory — The new exchange (input + response) is added to the session history.
- Store to long-term memory — Important information, decisions, or facts are extracted and stored persistently.
- Compress if needed — When the session history exceeds the context window, a summarizer compresses older messages into a summary.
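The six steps above might be wired together as in the sketch below. Everything here is hypothetical and for illustration only: `SketchMemoryAgent` is not the repository's `MemoryAgent`, a plain dict stands in for the long-term store, the "I prefer" rule stands in for LLM-based fact extraction, and the compression step only leaves a placeholder where a real summary would go.

```python
from typing import Callable, Dict, List


class SketchMemoryAgent:
    """Hypothetical illustration of the six steps; not the repository's MemoryAgent."""

    def __init__(self, llm: Callable[[str], str], long_term: Dict[str, str],
                 max_messages: int = 6):
        self.llm = llm
        self.long_term = long_term        # shared across agent instances
        self.short_term: List[str] = []   # session history
        self.max_messages = max_messages

    def chat(self, user_input: str) -> str:
        # Steps 1-2: receive input, then gather stored facts and recent history.
        facts = "; ".join(f"{k}={v}" for k, v in sorted(self.long_term.items()))
        prompt = f"Known facts: {facts}\nHistory: {self.short_term}\nUser: {user_input}"
        # Step 3: reason and respond.
        response = self.llm(prompt)
        # Step 4: update short-term memory with the new exchange.
        self.short_term += [f"user: {user_input}", f"assistant: {response}"]
        # Step 5: store to long-term memory (naive rule instead of LLM extraction).
        marker = "I prefer "
        if marker in user_input:
            self.long_term["preference"] = user_input.split(marker, 1)[1].rstrip(".")
        # Step 6: compress if needed (placeholder instead of a real summary).
        overflow = len(self.short_term) - self.max_messages
        if overflow > 0:
            self.short_term = ([f"[summary of {overflow} older messages]"]
                               + self.short_term[overflow:])
        return response


def echo_llm(prompt: str) -> str:
    # Stand-in LLM that returns the first prompt line (the known facts).
    return prompt.splitlines()[0]


store: Dict[str, str] = {}
session1 = SketchMemoryAgent(echo_llm, store)
session1.chat("I prefer Zod for validation.")

session2 = SketchMemoryAgent(echo_llm, store)  # new instance, same store
reply = session2.chat("How should I validate forms?")
print(reply)  # Known facts: preference=Zod for validation
```

The second instance starts with an empty session history but still sees the preference, because only the long-term store is shared between the two.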
Minimal Example
A coding assistant that remembers language preferences and project context across sessions.
```python
from patterns.memory.code.python.memory_agent import MemoryAgent

agent = MemoryAgent(
    llm=your_llm,  # placeholder for your LLM client
    system="You are a personal coding assistant that adapts to each developer's preferences.",
)

# Session 1: user provides context
agent.chat("I mostly work in TypeScript and I'm building a SaaS dashboard.")
agent.chat("I prefer React Query for data fetching and Zod for validation.")

print(agent.memory_snapshot)
# {'user_language': 'TypeScript', 'project_type': 'SaaS dashboard',
#  'prefers_react_query': 'true', 'prefers_zod': 'true'}

# Session 2: in a real second session this would be a new MemoryAgent instance
# sharing the same LongTermStore; the same `agent` is reused here for brevity.
response = agent.chat("How should I handle form validation in my project?")
# Agent recalls the TypeScript and Zod preferences without being told again
# and tailors the response accordingly
```
Full implementation: [`code/python/memory_agent.py`](code/python/memory_agent.py)
Input / Output
- Input: User message + retrieved context from both memory types
- Output: Response informed by current and past interactions
- Short-term store: Recent conversation turns (message list)
- Long-term store: Persistent facts, preferences, decisions (vector store, database, or file)
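As one possible shape for the file-based long-term store option, here is a minimal sketch. `JsonFactStore` and its methods are hypothetical names chosen for illustration and are not the repository's `LongTermStore`; a vector store or database would replace the JSON file in a larger system.

```python
import json
import os
import tempfile
from typing import Dict


class JsonFactStore:
    """Hypothetical file-backed long-term store keyed by fact name."""

    def __init__(self, path: str):
        self.path = path
        self.facts: Dict[str, str] = {}
        if os.path.exists(path):
            # Reload whatever an earlier session wrote.
            with open(path, "r", encoding="utf-8") as f:
                self.facts = json.load(f)

    def remember(self, key: str, value: str) -> None:
        self.facts[key] = value
        # Write to a temp file, then swap it in, so a crash mid-write
        # does not leave a half-written memory file behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.facts, f)
        os.replace(tmp_path, self.path)

    def recall(self, key: str, default: str = "") -> str:
        return self.facts.get(key, default)


path = os.path.join(tempfile.mkdtemp(), "memory.json")
JsonFactStore(path).remember("user_language", "TypeScript")  # "session 1"
recalled = JsonFactStore(path).recall("user_language")       # "session 2"
print(recalled)  # TypeScript
```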
Key Tradeoffs
| Strength | Limitation |
|---|---|
| Enables multi-session continuity | Storage and retrieval add complexity |
| Personalizes responses over time | Memory retrieval quality affects response quality |
| Handles conversations that exceed the context window | Summarization can lose important details |
| Agents can learn from past interactions | Stale or incorrect memories can mislead the agent |
| More natural, human-like interaction | Memory management (what to store, what to forget) is hard |
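The stale-memory and forgetting limitations in the table can be mitigated in several ways; a time-to-live policy is one of the simplest. The sketch below is an assumption-laden illustration: `ExpiringMemory` and `MemoryRecord` are hypothetical names, and the current time is passed in explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MemoryRecord:
    value: str
    stored_at: float  # timestamp in seconds


class ExpiringMemory:
    """Hypothetical store that forgets entries older than a fixed TTL."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self.records: Dict[str, MemoryRecord] = {}

    def store(self, key: str, value: str, now: float) -> None:
        self.records[key] = MemoryRecord(value, now)

    def recall(self, key: str, now: float) -> Optional[str]:
        record = self.records.get(key)
        if record is None:
            return None
        if now - record.stored_at > self.ttl_seconds:
            # Too old to trust: drop it instead of returning stale data.
            del self.records[key]
            return None
        return record.value


memory = ExpiringMemory(ttl_seconds=3600)
memory.store("framework", "React", now=0)
fresh = memory.recall("framework", now=1800)  # within the TTL
stale = memory.recall("framework", now=7200)  # past the TTL, forgotten
print(fresh, stale)  # React None
```

A fixed TTL is only one policy; re-storing a fact resets its timestamp here, which approximates "recently confirmed facts live longer".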
When to Use
- Multi-turn conversations that span sessions
- Personal assistants that should remember user preferences
- Agents that need to learn from past interactions
- When conversation history exceeds the context window
- Tasks that build on previous work (iterative document editing, ongoing research)
When NOT to Use
- Single-turn interactions — no memory needed
- When all context fits in one prompt — don't add overhead
- When privacy requirements prevent storing conversation data
- Stateless processing tasks (classification, extraction)
Related Patterns
- Evolves from: Prompt Chaining — see evolution.md
- Combines with: ReAct (agent loop + memory), RAG (long-term memory can use the same vector store), Multi-Agent (shared memory between agents)
- Related to: RAG — RAG retrieves from a document store; Memory retrieves from interaction history. The retrieval mechanism is similar but the data source is different.
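To illustrate the point that the retrieval mechanism resembles RAG while the data source is interaction history, here is a small sketch. It ranks past turns by bag-of-words cosine similarity, which is a deliberately simple stand-in for the embedding search a vector store would perform; `bag_of_words`, `cosine`, and `retrieve` are hypothetical helper names.

```python
import math
import re
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude substitute for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, memories: List[str], k: int = 1) -> List[str]:
    """Return the k stored turns most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(memories, key=lambda m: cosine(q, bag_of_words(m)), reverse=True)
    return ranked[:k]


# The "documents" are facts taken from earlier interactions.
past_turns = [
    "User prefers Zod for validation",
    "User is building a SaaS dashboard",
    "User works mostly in TypeScript",
]
top = retrieve("Which validation library should I use?", past_turns)
print(top)  # ['User prefers Zod for validation']
```

Swapping `past_turns` for chunks of external documents would turn the same function into a toy RAG retriever, which is why the two patterns can share a vector store.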
Deeper Dive
- Design — Memory types, storage strategies, retrieval patterns, summarization, forgetting policies
- Implementation — Pseudocode, context window management, vector store integration, testing
- Evolution — How memory evolves from prompt chaining