MemR3
by 2025 (December submission)
System Card
Organization: 2025 (December submission)
Released: 2025-12
Architecture: agentic-workflow / LangGraph closed-loop retrieve-reflect-answer router
Details: Autonomous memory-retrieval controller built on LangGraph. The router chooses among retrieve, reflect, and answer actions; a global evidence-gap tracker monitors what evidence is still missing. Agnostic to backend retrievers (vector, graph, hybrid).
Parameters: n/a
Domain: agent-memory, rag-retrieval
Open Source: No
Paper: arXiv:2512.20237
Tags: langgraph, closed-loop, evidence-gap, locomo
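The Details field describes a closed loop in which a router picks retrieve, reflect, or answer while a tracker records which evidence is still missing. The following is a minimal plain-Python sketch of that control flow, not the MemR3 implementation and not the LangGraph API. All names (`State`, `route`, `run`, `keyword_retriever`) are hypothetical, and the substring matching stands in for what would presumably be LLM-driven reflection.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

# Hypothetical interface: any backend (vector, graph, hybrid) that maps
# a query string to a list of memory snippets can be plugged in here.
Retriever = Callable[[str], List[str]]


@dataclass
class State:
    question: str
    gaps: Set[str]                                  # evidence still missing
    evidence: List[str] = field(default_factory=list)
    needs_reflection: bool = False                  # set after each retrieval
    steps: int = 0


def route(state: State, max_steps: int) -> str:
    """Choose the next action: 'retrieve', 'reflect', or 'answer'."""
    if state.steps >= max_steps:
        return "answer"          # step budget exhausted, answer with what we have
    if state.needs_reflection:
        return "reflect"         # re-check the gaps after new evidence arrived
    if state.gaps:
        return "retrieve"        # something is still missing
    return "answer"              # no open gaps


def run(question: str, gaps: Set[str], retriever: Retriever,
        max_steps: int = 6) -> Tuple[State, List[str]]:
    """Run the loop until the router chooses 'answer'; return state and action trace."""
    state = State(question=question, gaps=set(gaps))
    trace: List[str] = []
    while True:
        action = route(state, max_steps)
        trace.append(action)
        if action == "answer":
            return state, trace
        if action == "retrieve":
            # Query the backend once per open gap, skipping duplicate snippets.
            for gap in sorted(state.gaps):
                for hit in retriever(gap):
                    if hit not in state.evidence:
                        state.evidence.append(hit)
            state.needs_reflection = True
        else:
            # Toy reflection (assumption): a gap closes when some snippet mentions it.
            state.gaps = {
                g for g in state.gaps
                if not any(g.lower() in e.lower() for e in state.evidence)
            }
            state.needs_reflection = False
        state.steps += 1


def keyword_retriever(memory: List[str]) -> Retriever:
    """Toy backend: case-insensitive substring search over stored snippets."""
    def search(query: str) -> List[str]:
        return [m for m in memory if query.lower() in m.lower()]
    return search


memory = ["Alice moved to Paris in 2021.", "Bob adopted a dog named Rex."]
final_state, trace = run("Where does Alice live, and who owns Rex?",
                         {"Alice", "Rex"}, keyword_retriever(memory))
print(trace)  # ['retrieve', 'reflect', 'answer']
```

Because `run` only depends on the `Retriever` callable, swapping the keyword search for a vector or graph lookup would not change the loop, which is one way to read the card's "agnostic to backend retrievers" claim. The `max_steps` budget is an added safeguard in this sketch so that a gap that can never be filled does not loop forever.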
Capability Profile
Benchmark Scores
Coverage: 6 of 14 benchmarks
Long-Context Retrieval: 1/5
Multi-Turn Recall: 1/2
  MemoryBank: no data
Cross-Session Memory: 1/1
Multi-Hop QA: 2/3
Agent Task Memory: 1/1
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 0/1
  RAGAS: no data
Sources:
arXiv:2512.20237, Table 1: GPT-4.1-mini + RAG backbone, LLM-as-Judge overall
MemR3 paper (arXiv:2512.20237); evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
MemR3 paper (arXiv:2512.20237); evaluated on MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries (HKUST, 2401)
MemR3 paper (arXiv:2512.20237); evaluated on AgentBench Memory Track (Tsinghua KEG, 2308)
MemR3 paper (arXiv:2512.20237); evaluated on LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding (Tsinghua KEG, 2308)
MemR3 paper (arXiv:2512.20237); evaluated on LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory (Salesforce AI Research, 2410)