HEMA
by independent (Ahn et al.)
System Card
Organization: independent (Ahn et al.)
Released: 2025-04
Architecture: hierarchical-summary / Dual Compact + Vector memory
Details: Combines Compact Memory (a one-sentence narrative summary that is continuously updated) with Vector Memory (episodic chunk embeddings queried by cosine similarity). On a 6B transformer, it maintains dialogues beyond 300 turns while keeping prompt length under 3,500 tokens.
Parameters: —
Domain: episodic-session, long-context, agent-memory
Open Source: No
Paper: arXiv:2504.16754
Tags: hippocampus, dual-memory, compact-summary, conversational
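The dual-store design in the Details field can be sketched in a few lines. This is a minimal illustration, not HEMA's implementation: the `embed` function is a toy bag-of-words stand-in for the paper's learned chunk embeddings, and the summary update is supplied externally rather than generated by a model.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; a real system would use a learned encoder,
    # but any vector with a cosine similarity fits this interface.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class DualMemory:
    """Sketch of the two stores: a one-sentence Compact Memory rewritten
    every turn, and a Vector Memory of episodic chunks queried by cosine
    similarity."""

    def __init__(self):
        self.compact = ""          # running narrative summary (one sentence)
        self.episodes = []         # list of (chunk_text, embedding) pairs

    def write(self, chunk, new_summary):
        # Store the episodic chunk and overwrite the compact summary.
        self.episodes.append((chunk, embed(chunk)))
        self.compact = new_summary

    def recall(self, query, k=2):
        # Rank stored chunks by cosine similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.episodes, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def build_prompt(self, query):
        # Prompt = compact summary + top-k episodic chunks + current query;
        # total length stays bounded no matter how long the dialogue runs.
        return "\n".join([self.compact, *self.recall(query), query])
```

Because only the short summary plus a fixed number of retrieved chunks enter the prompt, context growth is decoupled from dialogue length, which is how the card's 300-turn / 3,500-token figure becomes possible.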
Capability Profile
Benchmark Scores
6 of 14 benchmarks
Long-Context Retrieval: 1/5
Multi-Turn Recall: 1/2 (MemoryBank: no data)
Cross-Session Memory: 1/1
Agent Task Memory: 1/1
Personalization: 0/1 (PerLTQA: no data)
Factuality / Grounding: 0/1 (RAGAS: no data)
Sources: HEMA paper (arXiv:2504.16754); evaluated on:
- AgentBench Memory Track (Tsinghua KEG, 2308)
- LoCoMo: Long-Term Conversational Memory Benchmark (Snap Research, 2402)
- LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory (Salesforce AI Research, 2410)
- BABILong: Testing the Limits of LLMs with Long-Context Reasoning-in-a-Haystack (AIRI, 2406)
- HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
- InfiniteBench: Extending Long Context Evaluation Beyond 100K Tokens (Tsinghua / OpenBMB, 2402)