Thread Transfer

Long-term memory for AI agents: Mem0, LangGraph, and beyond

Mem0 achieves 26% improvement over OpenAI on memory benchmarks. Here's how memory systems work and when to use them.

Jorgo Bardho

Founder, Thread Transfer

March 23, 2025 · 12 min read
Tags: AI agent memory, Mem0, LangGraph memory, persistent context
[Figure: AI memory architecture diagram]

LLMs are stateless. Every time you restart a conversation, the agent forgets everything. For prototypes, that's fine—you can stuff chat history into the prompt. But as your AI product scales, you need a real memory system. One that persists facts, retrieves them efficiently, and doesn't blow your token budget. That's where tools like Mem0 and LangGraph come in.

Why memory matters

Without memory, your AI agent asks the same questions every session. It can't remember user preferences, prior decisions, or learned context. You end up with frustrated users and ballooning context windows as you paste in entire conversation histories just to fake continuity.

The alternative is a memory layer that stores semantic facts, retrieves relevant context on demand, and keeps the LLM focused on what matters right now. Research from Mem0 shows a 26% improvement over OpenAI's assistant API on memory-heavy benchmarks. That's not trivial.

Short-term vs long-term memory

Think of short-term memory as RAM—it holds the current session, recent messages, and immediate context. Long-term memory is your hard drive—it persists facts, patterns, and decisions across sessions.

  • Short-term: Last 5-10 messages, ephemeral state, lives in prompt or session store
  • Long-term: User preferences, past decisions, learned facts—persisted in a vector DB or graph

Most production agents need both. Short-term keeps the conversation coherent. Long-term makes the agent actually useful over time.
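The split maps cleanly onto code. A minimal sketch of the two tiers, assuming a hypothetical vector_store object with upsert and search methods (any vector DB client can fill that role):

from collections import deque

class AgentMemory:
    def __init__(self, vector_store, window=10):
        # Short-term: a bounded window of recent messages (the "RAM")
        self.short_term = deque(maxlen=window)
        # Long-term: facts persisted in a vector DB (the "hard drive")
        self.long_term = vector_store

    def remember_turn(self, message):
        self.short_term.append(message)

    def remember_fact(self, fact, user_id):
        self.long_term.upsert(fact, metadata={"user_id": user_id})

    def recall(self, query, user_id, k=5):
        return self.long_term.search(query, filter={"user_id": user_id}, top_k=k)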

Memory tools compared

The landscape is crowded. Here's what matters:

  • Mem0: Open-source memory layer with adaptive memory, auto-tiering between short/long-term, and built-in vector storage. Integrates with LangChain, Autogen, and standalone agents. Best for teams that want a turnkey solution.
  • LangGraph: Part of the LangChain ecosystem. Builds stateful agents with explicit memory graphs. You define nodes (memory states) and edges (transitions). Great for complex workflows where you need fine control over what gets remembered and when.
  • Roll your own: Use a vector DB (Pinecone, Weaviate, Qdrant) plus a schema for memory entries. More work, but maximum flexibility. The common pattern: embed facts, retrieve top-k on each turn, inject into the prompt (see the sketch after this list).
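Here is what that roll-your-own pattern reduces to, as a minimal in-memory sketch. The embed function is any embedding callable you supply, and the plain Python list stands in for a real vector DB:

import numpy as np

memories = []  # each entry: {"text": str, "vec": np.ndarray} — stand-in for a vector DB

def store(text, embed):
    memories.append({"text": text, "vec": embed(text)})

def retrieve(query, embed, k=5):
    q = embed(query)
    # Rank stored facts by cosine similarity to the query
    scored = sorted(
        memories,
        key=lambda m: float(np.dot(q, m["vec"]) /
                            (np.linalg.norm(q) * np.linalg.norm(m["vec"]))),
        reverse=True,
    )
    return [m["text"] for m in scored[:k]]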

Architecture patterns that work

Here's a production-grade memory stack:

  1. Capture: Extract facts, decisions, and preferences from user messages using a small LLM (e.g., GPT-4o-mini).
  2. Store: Write to a vector DB with metadata (user_id, session_id, timestamp, importance score).
  3. Retrieve: On each turn, query the DB for top-5 relevant facts. Use hybrid search (semantic + keyword) for best results.
  4. Inject: Add retrieved facts to the system prompt or as a "memory context" section.
  5. Decay: Downweight or delete stale facts. You don't want 6-month-old preferences poisoning current interactions (a decay sketch follows after the tip below).

Pro tip: Tag memory entries with confidence scores. Low-confidence facts should be verified before acting on them.
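Steps 2, 3, and 5 all hinge on the metadata you attach at write time. A minimal sketch of one possible entry schema, combining the confidence tag with time-based decay; the 30-day half-life is an assumption to tune, not a recommendation:

import time
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    text: str
    user_id: str
    session_id: str
    importance: float            # 0.0-1.0, set at capture time
    confidence: float            # low-confidence facts get verified before use
    created_at: float = field(default_factory=time.time)

    def decayed_score(self, half_life_days=30.0):
        # Downweight stale facts: halve the score every half_life_days
        age_days = (time.time() - self.created_at) / 86400
        return self.importance * 0.5 ** (age_days / half_life_days)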

Implementation guide

If you're using Mem0, setup looks like this:

# Install (run in your shell):
#   pip install mem0ai

from mem0 import Memory

memory = Memory()

# Store a fact about this user
memory.add("User prefers concise responses", user_id="user_123")

# Retrieve everything stored for this user
context = memory.get_all(user_id="user_123")

# Or fetch only facts relevant to the current turn
relevant = memory.search("response style", user_id="user_123")
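Getting those memories into the model is plain string assembly. A minimal sketch, assuming each returned entry exposes a memory text field; the exact result shape varies across mem0ai versions, so check yours:

facts = [m["memory"] for m in context.get("results", [])]
system_prompt = (
    "You are a helpful assistant.\n"
    "Known facts about this user:\n"
    + "\n".join(f"- {fact}" for fact in facts)
)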

For LangGraph, you define a state graph:

from langgraph.graph import StateGraph, START

graph = StateGraph(AgentState)           # AgentState: a TypedDict schema you define
graph.add_node("memory", memory_node)    # loads relevant facts into state
graph.add_node("agent", agent_node)      # calls the LLM with that context
graph.add_edge(START, "memory")
graph.add_edge("memory", "agent")
app = graph.compile()
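Running the compiled graph is a single call. A minimal sketch, assuming AgentState carries a messages list:

result = app.invoke({"messages": [{"role": "user", "content": "Hi, keep it short"}]})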

Both approaches work. Mem0 is faster to prototype. LangGraph gives you more control over complex state machines.

Privacy and retention

Memory systems store sensitive data. You need policies for:

  • Retention: How long do memories live? 30 days? Forever? Match your legal requirements.
  • Deletion: Users must be able to delete their memory. Implement a "forget me" endpoint.
  • Access control: Memory for user A should never leak to user B. Enforce strict user_id scoping.
  • PII redaction: Strip credit cards, SSNs, and other PII before storing facts.
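A "forget me" endpoint can be a thin wrapper over your memory store. A minimal sketch with FastAPI, reusing the Mem0 client from earlier (Mem0 exposes delete_all scoped by user_id; adapt the call for other stores):

from fastapi import FastAPI

app = FastAPI()

@app.delete("/users/{user_id}/memory")
def forget_me(user_id: str):
    # Scope the delete strictly to this user's entries
    memory.delete_all(user_id=user_id)
    return {"status": "deleted", "user_id": user_id}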

Testing your memory layer

Memory bugs are subtle. Test these scenarios:

  • User states a preference in session 1. Does session 2 remember it?
  • User updates a preference. Does the old one get overwritten or do you have duplicates?
  • Unrelated users. Does memory leak across accounts?
  • Stale facts. If a user changes their email, does the old one stick around?
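These scenarios translate directly into tests. A minimal pytest sketch against a Mem0-style client, assuming a memory fixture and the result shape from the snippet above:

def test_preference_persists_across_sessions(memory):
    memory.add("User prefers dark mode", user_id="alice")
    # Simulate a new session: only the user_id carries over
    results = memory.search("display preferences", user_id="alice")
    assert any("dark mode" in m["memory"] for m in results["results"])

def test_no_memory_leak_across_users(memory):
    memory.add("Email is alice@example.com", user_id="alice")
    results = memory.search("email address", user_id="bob")
    assert not results["results"]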

When to skip memory

Not every agent needs this. Skip memory if:

  • Your agent is stateless by design (e.g., FAQ bot, one-shot summarizer)
  • You're in heavy prototyping mode and context windows are still cheap
  • Compliance doesn't allow long-term storage

Otherwise, a memory layer turns a chatbot into an assistant.

Next steps

Mem0 benchmarks and docs: mem0.ai
LangGraph memory guide: langchain.com
Questions? Email info@thread-transfer.com