RetainDB vs Zep

RetainDB vs Zep: 88% preference recall vs 56.7%, without the graph complexity

Both products take memory seriously. But RetainDB handles three layers — user memory, context assembly, and knowledge base ingestion (Notion, PDFs, Confluence, YouTube) — not just memory. Zep goes deep on entity graphs and temporal reasoning. RetainDB goes deep on recall quality, knowledge ingestion, and production simplicity.

88% preference recall (LongMemEval · RetainDB)

79% overall memory score (LongMemEval · RetainDB)

0% hallucination rate (in benchmark testing · RetainDB)

<40ms retrieval latency (global average · RetainDB)

TL;DR

Zep's temporal knowledge graph is genuinely impressive engineering. RetainDB scores 31 points higher on preference recall. Pick based on what your users will actually feel — not the architecture you find more interesting.

At a glance

RetainDB vs Zep

| Feature | RetainDB | Zep |
| --- | --- | --- |
| Preference recall (LongMemEval) | 88% | 56.7% |
| Overall score (LongMemEval) | 79% | ~72% |
| Retrieval latency | <40ms | Not published |
| Memory architecture | Hybrid vector + BM25 + reranking | Temporal knowledge graph (Graphiti) |
| Memory taxonomy | 13 typed categories | Graph nodes with entity + relationship modeling |
| Memory scopes | 6 dimensions | Graph nodes per user/session |
| Framework adapters | AI SDK, LangChain, LangGraph (drop-in) | Python and Node SDKs |
| Time to first write | <30 min via wizard | Graph architecture planning required |
| Knowledge base ingestion | 22 connectors, including Notion, PDF, Confluence, YouTube, arXiv, Playwright, GitHub, GitLab, Discord, Slack | Not a feature; the graph is for entity memory, not document ingestion |
| Context + knowledge in one call | Memory and KB results composed per query by type and scope | Memory only; document knowledge is a separate problem |

The specifics

Why the difference matters

31 percentage points on the metric users feel

Preference recall measures whether your agent correctly remembers what a user told it. RetainDB: 88%. Zep: 56.7%. That is a gap of 31.3 percentage points: on at least 31 of every 100 preference queries, RetainDB recalls correctly where Zep does not. Both scores come from the same LongMemEval task set; see retaindb.com/benchmark.
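To make the arithmetic behind the gap explicit, here is a small worked calculation using only the two scores quoted on this page. It assumes both systems were scored on the same set of queries, as stated above. Per-query overlap is not published, so the last figure is a lower bound rather than a measurement.

```python
# Scores quoted on this page (LongMemEval preference recall).
retaindb_recall = 0.88
zep_recall = 0.567

# Absolute gap in percentage points.
gap_points = (retaindb_recall - zep_recall) * 100

# Share of preference queries Zep gets wrong.
zep_miss_rate = 1 - zep_recall

# Lower bound on the share of Zep's misses that RetainDB answers correctly.
# The bound is reached when RetainDB is also right on every query Zep gets
# right; with less overlap the share can only be higher.
min_share_of_zep_misses_recovered = (retaindb_recall - zep_recall) / zep_miss_rate

print(f"gap: {gap_points:.1f} points")
print(f"Zep miss rate: {zep_miss_rate:.1%}")
print(f"share of Zep misses recovered (lower bound): {min_share_of_zep_misses_recovered:.0%}")
```

Read this way, the 31-point gap corresponds to RetainDB recovering at least roughly 7 in 10 of the preference queries that Zep misses.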

Hybrid retrieval vs graph traversal

Zep's graph traversal is optimised for entity-relationship reasoning. For preference memory and conversational continuity, vector + BM25 + reranking achieves higher recall — the BM25 branch catches phrasing that embedding distance buries.
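As an illustration of how a lexical branch and a similarity branch can be combined, here is a minimal, self-contained sketch of hybrid retrieval. It is not RetainDB's implementation: the character-trigram cosine is a toy stand-in for a real embedding model, reciprocal rank fusion is one common way to merge rankings and is an assumption here, and the reranking stage is omitted. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(d) for d in tokenized) / len(tokenized)
    doc_freq = Counter(term for d in tokenized for term in set(d))
    n = len(docs)
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def trigram_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: counts of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_scores(query: str, docs: list[str]) -> list[float]:
    q = trigram_vector(query)
    return [cosine(q, trigram_vector(d)) for d in docs]


def to_ranks(scores: list[float]) -> dict[int, int]:
    """Map document index -> rank position, where 1 is the best score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {doc_idx: pos for pos, doc_idx in enumerate(order, start=1)}


def hybrid_search(query: str, docs: list[str], k: int = 60) -> list[int]:
    """Fuse both rankings with reciprocal rank fusion; return indices, best first."""
    lexical = to_ranks(bm25_scores(query, docs))
    similar = to_ranks(similarity_scores(query, docs))
    fused = {i: 1 / (k + lexical[i]) + 1 / (k + similar[i]) for i in range(len(docs))}
    return sorted(fused, key=fused.get, reverse=True)


memories = [
    "User prefers dark mode in the dashboard.",
    "User asked about invoice export last week.",
    "User's team is based in Berlin.",
]
query = "Which theme does the user prefer: dark mode or light mode?"
ranking = hybrid_search(query, memories)
print(memories[ranking[0]])
```

Fusing ranks rather than raw scores avoids having to put BM25 scores and similarity scores on a common scale, which is why rank fusion is a frequent choice for this kind of pipeline.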

Under 30 minutes vs graph architecture planning

Run npx retaindb-wizard: it detects your framework and generates the integration code. With Zep, the graph architecture (node types, edge definitions, traversal strategy) has to be designed before you write a single memory.

Knowledge base is a separate problem for Zep — not for RetainDB

Zep's graph is built for entity memory extracted from conversations. If you also need your agents to know your product documentation, help center, research papers, or Notion workspace, that's a separate integration with Zep. RetainDB's 15+ built-in connectors (Notion, Confluence, PDF, YouTube, arXiv, Playwright, sitemaps) make knowledge and memory retrieval part of the same system — composed per query by type and scope.
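One way to picture "composed per query by type and scope" is a single filter-and-merge step over memory hits and knowledge base hits. The sketch below is purely illustrative: the `Hit` fields, the type and scope labels, and `compose_context` are hypothetical names, not RetainDB's API, and it assumes scores from both sources are already comparable.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str   # "memory" or "kb"
    kind: str     # hypothetical type label, e.g. "preference" or "doc"
    scope: str    # hypothetical scope label, e.g. "user:42" or "org:acme"
    score: float  # relevance score, assumed comparable across sources
    text: str


def compose_context(hits: list[Hit], kinds: set[str], scopes: set[str], limit: int = 3) -> list[Hit]:
    """Keep hits whose type and scope are allowed for this query, best first."""
    eligible = [h for h in hits if h.kind in kinds and h.scope in scopes]
    return sorted(eligible, key=lambda h: h.score, reverse=True)[:limit]


hits = [
    Hit("memory", "preference", "user:42", 0.91, "Prefers dark mode."),
    Hit("kb", "doc", "org:acme", 0.84, "Theme settings live under Account > Appearance."),
    Hit("memory", "preference", "user:7", 0.95, "Prefers light mode."),
    Hit("kb", "doc", "org:other", 0.99, "Unrelated tenant documentation."),
]

context = compose_context(hits, kinds={"preference", "doc"}, scopes={"user:42", "org:acme"})
for hit in context:
    print(hit.source, hit.text)
```

The point of the sketch is that scope filtering happens before ranking, so a high-scoring hit from another user or tenant never reaches the assembled context.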

Pick your fit

Who should use what

Choose RetainDB when

Preference recall gap matters — 88% vs 56.7% is 31 points

You need knowledge base ingestion alongside user memory

You want production memory without graph architecture planning

You want typed memory categories rather than graph nodes for preference retrieval

Your team needs a benchmark page for internal sign-off

Consider Zep when

You need temporal entity relationship modeling as a core feature

Your agents must reason about complex graph structures

The Graphiti architecture fits your use case specifically

Common questions

What people ask before deciding

Is the 31-point gap a fair comparison?

Both scores come from the same LongMemEval preference recall task. RetainDB published its methodology at retaindb.com/benchmark, and Zep's score is measured on the same benchmark. The methodology is public, so you can verify the comparison yourself.

Does RetainDB have any graph features?

Yes — optional graph traversal via include_graph with configurable depth. It's not the primary retrieval path, but entity relationship data is stored and queryable.
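To show what a configurable depth means in practice, here is a generic depth-limited breadth-first traversal. Only the `include_graph` option and the idea of a depth setting come from the answer above; the adjacency-list graph, the node labels, and `neighbors_within` are assumptions for illustration and do not describe RetainDB's storage or API.

```python
from collections import deque


def neighbors_within(graph: dict[str, list[str]], start: str, depth: int) -> list[str]:
    """Return nodes reachable from `start` in at most `depth` hops, nearest first."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the configured depth
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, dist + 1))
    return found


# Hypothetical entity graph extracted from conversations.
graph = {
    "user:42": ["team:design"],
    "team:design": ["tool:figma"],
    "tool:figma": ["vendor:figma-inc"],
}

print(neighbors_within(graph, "user:42", depth=1))
print(neighbors_within(graph, "user:42", depth=2))
```

A small depth keeps the extra context close to the entity in question; each additional hop widens what is pulled in and increases the amount of loosely related material.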

Why does a graph score lower on preference recall?

Graph traversal is optimised for entity-relationship reasoning, not preference string recall. 'I prefer dark mode' is better recalled by hybrid search over typed preference memories than by graph edge traversal.

Sources

Zep official site

Zep architecture paper (PDF)

RetainDB benchmark