Agent Pattern

Memory

Persistent state across sessions: short-term, long-term, and semantic memory.

Intermediate Evolves from: Prompt Chaining →

Memory — Overview

The memory pattern enables an agent to persist information across conversations, building context over time. Short-term memory maintains state within a session; long-term memory stores and retrieves information across sessions using external storage.

Evolves from: Prompt Chaining — adds conversation state management, summarization, and persistent retrieval.

Architecture

Figure: The agent reads from both short-term (session) and long-term (persistent) memory. When short-term memory overflows the context window, a summarizer compresses it. Important information is stored in long-term memory for future sessions.

How It Works

Receive input — A new user message arrives.
Retrieve context — The agent queries long-term memory for relevant past information and reads recent short-term memory (conversation history).
Reason and respond — The agent processes the input with the retrieved context and generates a response.
Update short-term memory — The new exchange (input + response) is added to the session history.
Store to long-term memory — Important information, decisions, or facts are extracted and stored persistently.
Compress if needed — When the session history exceeds the context window, a summarizer compresses older messages into a summary.

Minimal Example

A coding assistant that remembers language preferences and project context across sessions.

from patterns.memory.code.python.memory_agent import MemoryAgent

agent = MemoryAgent(
    llm=your_llm,
    system="You are a personal coding assistant that adapts to each developer's preferences.",
)

# Session 1 — user provides context
agent.chat("I mostly work in TypeScript and I'm building a SaaS dashboard.")
agent.chat("I prefer React Query for data fetching and Zod for validation.")

print(agent.memory_snapshot)
# {'user_language': 'TypeScript', 'project_type': 'SaaS dashboard',
#  'prefers_react_query': 'true', 'prefers_zod': 'true'}

# Session 2 (new MemoryAgent instance, same LongTermStore) — memory is recalled
response = agent.chat("How should I handle form validation in my project?")
# Agent recalls TypeScript + Zod preference without being told again
# and tailors the response accordingly

Full implementation: [`code/python/memory_agent.py`](code/python/memory_agent.py)

Input / Output

Input: User message + retrieved context from both memory types
Output: Response informed by current and past interactions
Short-term store: Recent conversation turns (message list)
Long-term store: Persistent facts, preferences, decisions (vector store, database, or file)

Key Tradeoffs

Strength	Limitation
Enables multi-session continuity	Storage and retrieval add complexity
Personalizes responses over time	Memory retrieval quality affects response quality
Handles conversations exceeding context window	Summarization can lose important details
Agents can learn from past interactions	Stale or incorrect memories can mislead the agent
More natural, human-like interaction	Memory management (what to store, what to forget) is hard

When to Use

Multi-turn conversations that span sessions
Personal assistants that should remember user preferences
Agents that need to learn from past interactions
When conversation history exceeds the context window
Tasks that build on previous work (iterative document editing, ongoing research)

When NOT to Use

Single-turn interactions — no memory needed
When all context fits in one prompt — don't add overhead
When privacy requirements prevent storing conversation data
Stateless processing tasks (classification, extraction)

Evolves from: Prompt Chaining — see evolution.md
Combines with: ReAct (agent loop + memory), RAG (long-term memory can use the same vector store), Multi-Agent (shared memory between agents)
Related to: RAG — RAG retrieves from a document store; Memory retrieves from interaction history. The retrieval mechanism is similar but the data source is different.

Deeper Dive

Design — Memory types, storage strategies, retrieval patterns, summarization, forgetting policies
Implementation — Pseudocode, context window management, vector store integration, testing
Evolution — How memory evolves from prompt chaining