Developers

Resources

Usecases

Pricing

Docs

Star

research_navbar_get-started

Research

Start Free

Mem0 Research:
Benchmarking a Token-Efficient Memory Algorithm for AI Agents

Benchmarked across LoCoMo, LongMemEval, and BEAM
Powered by single-pass hierarchical extraction and multi-signal retrieval

home_primary_get-started

Website/CTA

Read the blog

home_primary_get-started

Website/CTA

Read the blog

home_primary_get-started

Website/CTA

home_primary_get-started

Website/CTA

92.5

LoCoMo

94.4

LongMemEval

64.1

BEAM 1M

48.6

BEAM 10M

Summary

Mem0's new token-efficient memory algorithm hits 92.5 on LoCoMo, 94.4 on LongMemEval, and 64.1/48.6 on BEAM (1M/10M) while averaging under 7,000 tokens per retrieval call. Full-context approaches on the same benchmarks use 25,000+. High accuracy at 3-4x lower token cost.

BENCHMARKS

LoCoMo

1,540 questions • 5 categories

92.5

OVERALL

6956

Mean Tokens

76.6

94.6

70.2

95.4

57.3

82.3

63.2

92.5

Single-hop

Multi-hop

Open-domain

Temporal

Old

New

LOCOMO

LongMemEval

500 questions • 6 categories

94.4

OVERALL

6787

Mean Tokens

94.3

98.6

46.4

98.2

76.7

96.7

79.5

93.6

51.1

97.0

70.7

88.0

Single-session (user)

Single-session (assistant)

Single-session (preference)

Knowledge update

Temporal reasoning

Multi-session

Old

New

LONGMEMEVAL

BEAM

BEAM 1M: 700 questions • 35 conversations

BEAM 10M: 200 questions • 10 conversations

64.1

OVERALL (1M)

48.6

OVERALL (10M)

6719

Mean Tokens (1m)

6914

Mean Tokens (10m)

88.3

90.4

85.2

82.5

70.0

56.3

65.0

75.0

65.2

26.1

63.5

46.9

61.8

16.3

53.6

20.2

52.5

40.0

35.7

32.5

Preference Following

Instruction Following

Information Extraction

Knowledge Update

Multi Session Reasoning

Summarization

Temporal Reasoning

Event Ordering

Abstention

Contradiction Resolution

10M

BEAM

Data last updated: May 2026.

All results are Old Algorithm vs. New Algorithm.

Full evaluation framework is open-sourced on GitHub.

WHAT’S NEW

Single pass ADD-only extraction

Mem0 now treats agent-generated facts as first-class, closing a significant gap in memory coverage. When an agent confirms an action or provides a recommendation, that information is stored with equal weight.

Multi-signal retrieval

Retrieval stack now runs three scoring passes in parallel and fuses the results: Semantic similarity, Keyword matching, and Entity matching. The combined score outperforms individual signal scores.

What we're building next

Temporal abstraction

Representing how events relate over time, not just what happened. BEAM 10M scores define the current frontier.

Temporal abstraction

Representing how events relate over time, not just what happened. BEAM 10M scores define the current frontier.

Cross-session structure

Modeling how information evolves across sessions. Requires connecting scattered interactions into coherent timelines.

Cross-session structure

Modeling how information evolves across sessions. Requires connecting scattered interactions into coherent timelines.

Agent-native memory

Extraction and retrieval running asynchronously as infrastructure, so agents don’t spend cycles managing their own context.

Agent-native memory

Extraction and retrieval running asynchronously as infrastructure, so agents don’t spend cycles managing their own context.

home_primary_get-started

Website/CTA

Join the team

Mem0 Research: Benchmarking a Token-Efficient Memory Algorithm for AI Agents

92.5

94.4

64.1

48.6

Summary

LoCoMo

92.5

6956

LongMemEval

94.4

6787

BEAM

64.1

48.6

6719

6914

Single pass ADD-only extraction

Multi-signal retrieval

What we're building next

Temporal abstraction

Temporal abstraction

Cross-session structure

Cross-session structure

Agent-native memory

Agent-native memory

Mem0 Research:
Benchmarking a Token-Efficient Memory Algorithm for AI Agents