File-based agent memory vs vector-DB approaches. The numbers, the architectures, when each fits. Sibyl Memory ranked #2 on LongMemEval Oracle (95.6%) without a vector database, embeddings, or external retrieval. This page explains why that result is structural, not coincidental.
500 questions, six categories, Claude Opus 4.6 as judge. The table below is the public leaderboard snapshot at the time of our run.
| Rank | System | Score | Architecture |
|---|---|---|---|
| 1 | agentmemory V4 | 96.2% | embedding-based |
| 2 | Sibyl Memory | 95.6% | file-based · zero vectors |
| 2 | Chronos (PwC) | 95.6% | embedding-based |
| 4 | Mastra Observational Memory | 94.9% | embedding-based |
| 5 | MemMachine | 93.0% | embedding-based |
| 6 | Hindsight (Vectorize) | 91.4% | vector DB |
Mem0, Zep, Supermemory, Emergence AI, and the Oracle baseline all score below the top tier.
Sibyl Memory placed second, 0.6% behind the leader, without a vector database, on 4 vCPU and 16 GB of RAM. Five of the six top systems (including the leader) use embeddings. We chose not to. The score is a side effect of optimizing for production efficiency first and retrieval recall second.
Full benchmark report (methodology, per-category scores, ablations) ↗

How agent memory systems are actually built. The trade-offs are structural, not preference-based.
Every memory write becomes an embedding stored in a vector index (Pinecone, Weaviate, Qdrant, pgvector, Chroma). Reads issue a query embedding and pull the top-k by cosine similarity. Structure is inferred at read time. Strong for fuzzy semantic recall ("did we ever talk about X?"). Weak for exact lookups ("what's the customer's Stripe ID?"), temporal reasoning ("what did we agree last Tuesday?"), and any query that needs a precise schema.
Examples: Hindsight (Vectorize), most LangChain memory implementations, RAG-flavored agent stacks.
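The vector-store read path above can be sketched in a few lines. This is a toy illustration only: `embed()` here is a stand-in bag-of-words model, not a real embedding model, and the memories are invented examples.

```python
import math

# Toy sketch of the vector-store read path: every memory is stored as
# an embedding; reads embed the query, rank by cosine similarity, and
# return the top-k. embed() is a stand-in, not a real embedding model.

def embed(text: str) -> dict[str, float]:
    vec: dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

memories = [
    "customer reported a billing dispute in march",
    "customer subscription tier is pro",
    "customer stripe id is cus_9xyz",
]
index = [(m, embed(m)) for m in memories]

def recall(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda mv: cosine(q, mv[1]), reverse=True)
    return [m for m, _ in ranked[:k]]

# Fuzzy recall works: "billing" pulls the dispute memory to the top.
print(recall("any billing problems?"))
# -> ['customer reported a billing dispute in march']
```

Note what this read path does not give you: the result is whatever ranks highest, not a guaranteed exact match, which is why precise lookups and temporal queries are the weak spots.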
A hybrid: embeddings for semantic recall, plus a graph or relational layer for structured facts. Better than pure vectors. Pays the embedding cost on every write and the structure cost on every query. The complexity surface is two systems instead of one. Most top-tier benchmark entries (agentmemory V4, Chronos, Mastra, MemMachine) sit here.
Examples: Mem0 (vectors + graph), Zep (vectors + temporal graph), agentmemory V4, Chronos (PwC).
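The "two systems instead of one" cost can be made concrete with a hypothetical write path. All class and field names below are illustrative, not taken from any of the systems listed above.

```python
# Hypothetical sketch of a hybrid memory write path: every fact is
# written twice, once as an embedding and once as a structured record,
# and the two stores must be kept in sync (writes, updates, deletes).

class HybridMemory:
    def __init__(self):
        self.vector_index = []   # list of (memory_id, embedding, text)
        self.fact_store = {}     # memory_id -> structured fact

    def write(self, memory_id, text, fact, embed):
        # Cost 1: embedding latency on every write.
        self.vector_index.append((memory_id, embed(text), text))
        # Cost 2: a second store that can drift from the first.
        self.fact_store[memory_id] = fact

    def delete(self, memory_id):
        # Deletes must cascade to both stores, or memory drifts.
        self.vector_index = [e for e in self.vector_index if e[0] != memory_id]
        self.fact_store.pop(memory_id, None)

mem = HybridMemory()
mem.write("m1", "customer is on the pro tier", {"tier": "pro"},
          embed=lambda t: [0.0])  # placeholder embedding function
mem.delete("m1")
```

Every code path that mutates memory has to touch both stores; that synchronization burden is the operational cost the surrounding text describes.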
Memory writes go through a schema imposed at write time: priorities, journal, entities, relationships, scars, arc. Reads are standard SQL against indexed namespaces. No embedding cost, no vector index, no retrieval latency. The trade: you have to know the shape of memory ahead of time. The common assumption is that this is impossible for general agent memory. The Sibyl Memory architecture shows it isn't, hitting 95.6% on LongMemEval Oracle without a single embedding.
Example: Sibyl Memory. The architecture is documented at /memory#architecture and the benchmark at /memory#benchmark.
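A minimal sketch of the schema-first pattern, using SQLite. The table and column names are illustrative, not Sibyl Memory's actual schema; the point is that exact lookup and knowledge update are one indexed query each.

```python
import sqlite3

# Schema-first sketch: structure is imposed at write time, reads are
# plain indexed SQL. Schema below is illustrative, not Sibyl Memory's.

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE entities (
        tenant_id  TEXT NOT NULL,
        entity     TEXT NOT NULL,
        attribute  TEXT NOT NULL,
        value      TEXT NOT NULL,
        valid_from TEXT NOT NULL,  -- ISO date: models facts as they evolve
        PRIMARY KEY (tenant_id, entity, attribute, valid_from)
    )
""")

# Writes go through the schema: no embedding step, no vector index.
db.executemany(
    "INSERT INTO entities VALUES (?, ?, ?, ?, ?)",
    [
        ("t1", "customer:42", "tier", "free", "2024-01-10"),
        ("t1", "customer:42", "tier", "pro", "2024-03-02"),
        ("t1", "customer:42", "stripe_id", "cus_9xyz", "2024-01-10"),
    ],
)

# Exact lookup with knowledge update handled natively: the newest
# valid_from wins, so "what is the customer's current tier?" is one query.
row = db.execute(
    """SELECT value FROM entities
       WHERE tenant_id = ? AND entity = ? AND attribute = ?
       ORDER BY valid_from DESC LIMIT 1""",
    ("t1", "customer:42", "tier"),
).fetchone()
print(row[0])  # -> pro
```

Because time is a first-class column rather than something inferred from similarity, temporal questions ("what was the tier in January?") are ordinary `WHERE` clauses.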
LongMemEval Oracle tests six categories. Two of them (single-session-user, single-session-assistant) are about precise recall of structured facts within a session. We score 100% on both. Vector approaches lose accuracy on these because similarity isn't lookup. Two more (temporal reasoning, knowledge update) are about recalling facts as they evolve over time. Schema models temporal relationships natively; embeddings lose precision when the same entity appears at multiple time points.
The category scores: 100% / 100% / 96.2% / 93.3% / 93.2% / 92.3%. The places we score lower (single-session-pref, multi-session, knowledge-update) are also the places vector approaches score higher. Different tools, different curves. The point is: schema-first is competitive on the benchmark vector approaches were built for, and it ships in a fraction of the operational complexity.
Use-case patterns that fit each approach. Honest, including where vectors win.
Document Q&A over an unstructured corpus. RAG-flavored chatbots over knowledge bases. Recommendation systems where similarity is the signal. Long-form summarization where the agent needs to surface "topics related to X" rather than "the exact answer to X". Anywhere structure is impossible to define ahead of time and fuzzy is acceptable.
Don't fight the architecture: if your problem is semantic similarity, use a vector DB.
Customer support agents that need to recall both "anything related to billing issues" (fuzzy) and "the customer's exact subscription tier" (precise). Research agents that span semi-structured documents. Cases where you can pay for the operational complexity of two systems and have engineering capacity to maintain both.
The cost: two stores to keep in sync, two query paths, embedding latency on every write.
Autonomous agents that need to act on memory, not just describe it. Multi-tenant platforms where each user's memory must be isolated and auditable. Compliance-sensitive deployments (GDPR cascade delete, EU AI Act export, tamper-proof audit log). Situations where you need exact recall of structured facts (customer IDs, transaction history, decisions made on a specific date). Production systems where p50 latency matters more than fuzzy recall.
The trade: you have to design the schema. We've done the design work for you in Sibyl Memory's five operating shapes (operator, user-profile, conversational continuity, agent reputation, org memory).
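The compliance point above (GDPR cascade delete, tenant isolation) is mechanical when memory lives in tenant-scoped tables. A hedged sketch, with an illustrative schema rather than Sibyl Memory's actual one:

```python
import sqlite3

# Sketch of why schema-first makes erasure mechanical: a GDPR delete
# request is one tenant-scoped DELETE, not a vector-index rebuild.
# The schema is illustrative, not Sibyl Memory's actual tables.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memory (tenant_id TEXT, kind TEXT, body TEXT)")
db.executemany("INSERT INTO memory VALUES (?, ?, ?)", [
    ("t1", "journal", "discussed pricing"),
    ("t1", "entity", "stripe_id=cus_9xyz"),
    ("t2", "journal", "asked about exports"),
])

# Cascade delete for tenant t1: every row is isolated by tenant_id,
# so the other tenants' memory is untouched and the result is auditable.
db.execute("DELETE FROM memory WHERE tenant_id = ?", ("t1",))
remaining = db.execute("SELECT tenant_id FROM memory").fetchall()
print(remaining)  # -> [('t2',)]
```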
Most agent products use vectors because they were the first available primitive. That does not make them right. It makes them familiar. Schema-first works for the same reason relational databases worked for fifty years: structure is cheaper to maintain than to infer.
We've helped teams move from vector-DB-flavored memory to schema-first. The mechanical part is straightforward; the design part is where the value is.
Free tier (100 MAU) is live at /memory. For pilots above that, or to discuss your specific architecture trade-offs, reach out.
Moving from Mem0, Zep, agentmemory, pgvector, or a custom vector stack? Scoped engagement to map your current memory model into a schema-first design. Monthly retainer where appropriate.