Mnemosyne
by independent
System Card
Organization: independent
Released: 2025-10
Architecture: graph-rag / Human-inspired probabilistic recall with decay
Details: Unsupervised long-term memory for edge LLMs using graph-structured storage, modular substance/redundancy filters, memory pruning, and probabilistic recall with temporal decay modeled on human memory. Includes a "core summary" of personality/domain details.
Parameters: not reported
Domain: personalization, episodic-session, agent-memory
Open Source: No
Paper: arXiv:2510.08601
Tags: edge-llm, decay, graph, unsupervised, personalized
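The "probabilistic recall with temporal decay" named in the card can be illustrated with a minimal sketch. This is not the paper's formula: the exponential half-life decay, the `MemoryNode` fields, the `half_life_hours` parameter, and both function names are assumptions chosen only to show the general idea of older memories being recalled less often.

```python
import random
from dataclasses import dataclass


@dataclass
class MemoryNode:
    """Hypothetical memory record; field names are illustrative only."""
    text: str
    relevance: float  # assumed similarity to the current query, in [0, 1]
    age_hours: float  # assumed time since the memory was stored


def recall_probability(node: MemoryNode, half_life_hours: float = 24.0) -> float:
    """Relevance scaled by an assumed exponential decay with a fixed half-life."""
    decay = 0.5 ** (node.age_hours / half_life_hours)
    # Clamp so that out-of-range relevance values still yield a probability.
    return max(0.0, min(1.0, node.relevance * decay))


def probabilistic_recall(nodes, rng: random.Random, half_life_hours: float = 24.0):
    """Keep each node independently with probability recall_probability(node)."""
    return [
        n for n in nodes
        if rng.random() < recall_probability(n, half_life_hours)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    memories = [
        MemoryNode("prefers metric units", relevance=0.9, age_hours=2.0),
        MemoryNode("asked about hiking boots", relevance=0.9, age_hours=240.0),
    ]
    for m in memories:
        print(f"{m.text}: p = {recall_probability(m):.3f}")
    print([m.text for m in probabilistic_recall(memories, rng)])
```

Sampling rather than thresholding means an old but relevant memory can still surface occasionally. Whether Mnemosyne samples this way is not stated in the card.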
Capability Profile
Benchmark Scores
6 of 14 benchmarks

Long-Context Retrieval (0/5)
  RULER: no data
  NIAH: no data
  LooGLE: no data
  LongBench: no data
  ∞Bench: no data
Multi-Turn Recall (2/2)
Cross-Session Memory (1/1)
Multi-Hop QA (2/3)
Agent Task Memory (1/1)
Personalization (0/1)
  PerLTQA: no data
Factuality / Grounding (0/1)
  RAGAS: no data

Sources:
  arXiv:2510.08601, Table 2: overall J-score, Llama3.1-8B-Instruct; SingleHop 62.78, MultiHop 49.53, OpenDomain 60.42, Temporal 53.03
  Mnemosyne paper (arXiv:2510.08601); evaluated on LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory (Salesforce AI Research, 2410)
  Mnemosyne paper (arXiv:2510.08601); evaluated on AgentBench Memory Track (Tsinghua KEG, 2308)
  Mnemosyne paper (arXiv:2510.08601); evaluated on MemoryBank: Enhancing LLMs with Long-Term Memory (Sun Yat-sen University, 2305)
  Mnemosyne paper (arXiv:2510.08601); evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
  Mnemosyne paper (arXiv:2510.08601); evaluated on MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries (HKUST, 2401)
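As a quick consistency check, the "6 of 14 benchmarks" headline can be recomputed from the per-category counts listed in the panel. The dictionary below copies those counts from the card; the variable names are illustrative.

```python
# (benchmarks with data, benchmarks in category), copied from the card.
coverage = {
    "Long-Context Retrieval": (0, 5),
    "Multi-Turn Recall": (2, 2),
    "Cross-Session Memory": (1, 1),
    "Multi-Hop QA": (2, 3),
    "Agent Task Memory": (1, 1),
    "Personalization": (0, 1),
    "Factuality / Grounding": (0, 1),
}

covered = sum(have for have, _ in coverage.values())
total = sum(count for _, count in coverage.values())
print(f"{covered} of {total} benchmarks")  # prints "6 of 14 benchmarks"
```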