HiMem
by Zhu et al. (JD.com, 2026)
System Card
Organization: Zhu et al. (JD.com, 2026)
Released: 2026-01
Architecture: hierarchical-summary / Episode Memory + Note Memory with conflict-aware updates
Details: Two-tier memory: Episode Memory built via topic-aware event-surprise dual-channel segmentation, plus Note Memory from multi-stage information extraction. Hybrid + best-effort retrieval with conflict-aware updates for self-evolution.
Parameters: —
Domain: agent-memory, episodic-session
Open Source: Yes
Paper: View Paper
Code: Repository
Tags: hierarchical, episode-note, long-horizon, topic-segmentation
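To make the two-tier design above concrete, here is a minimal sketch of an episode/note memory with a conflict-aware update rule. All names (`EpisodeMemory`, `NoteMemory`, `Note`, the `topic_shift` flag) are illustrative assumptions, not HiMem's actual API, and the segmentation stand-in is far simpler than the paper's topic-aware event-surprise dual-channel method.

```python
from dataclasses import dataclass

@dataclass
class Note:
    key: str    # e.g. "user.city" (hypothetical key scheme)
    value: str
    turn: int   # dialogue turn at which the note was extracted

class NoteMemory:
    """Fact store with a conflict-aware update: a newer note on the same
    key supersedes the older one instead of accumulating duplicates."""
    def __init__(self):
        self.notes = {}

    def update(self, note: Note):
        old = self.notes.get(note.key)
        if old is None or note.turn >= old.turn:  # resolve conflict by recency
            self.notes[note.key] = note

    def lookup(self, key: str):
        n = self.notes.get(key)
        return n.value if n else None

class EpisodeMemory:
    """Episodes are contiguous turn spans; here segmentation is triggered
    by an explicit topic_shift flag, a crude stand-in for the paper's
    dual-channel (topic-aware + event-surprise) segmentation."""
    def __init__(self):
        self.episodes = [[]]

    def add_turn(self, text: str, topic_shift: bool = False):
        if topic_shift and self.episodes[-1]:
            self.episodes.append([])  # close current episode, open a new one
        self.episodes[-1].append(text)

mem = NoteMemory()
mem.update(Note("user.city", "Beijing", turn=3))
mem.update(Note("user.city", "Shanghai", turn=7))  # supersedes the turn-3 note
print(mem.lookup("user.city"))  # → Shanghai

ep = EpisodeMemory()
ep.add_turn("Let's plan a trip.")
ep.add_turn("Book the flights.")
ep.add_turn("Switching topics: groceries.", topic_shift=True)
print(len(ep.episodes))  # → 2
```

The recency rule in `update` is one plausible reading of "conflict-aware updates"; the actual system may weigh evidence or confidence rather than turn order alone.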
Capability Profile
Benchmark Scores
Coverage: 6 of 14 benchmarks

Long-Context Retrieval: 0/5
  RULER: no data
  NIAH: no data
  LooGLE: no data
  LongBench: no data
  ∞Bench: no data
Multi-Turn Recall: 2/2
Cross-Session Memory: 1/1
Multi-Hop QA: 2/3
Agent Task Memory: 1/1
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 0/1
  RAGAS: no data

Sources:
- arXiv:2601.06377, Table 1: Overall GPT-Score; F1 34.95. Per-category: SingleHop 89.22, MultiHop 70.92, Temporal 74.77, OpenDomain 54.86
- HiMem paper (arXiv:2601.06377); evaluated on AgentBench Memory Track (Tsinghua KEG, 2308)
- HiMem paper (arXiv:2601.06377); evaluated on LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory (Salesforce AI Research, 2410)
- HiMem paper (arXiv:2601.06377); evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
- HiMem paper (arXiv:2601.06377); evaluated on MemoryBank: Enhancing LLMs with Long-Term Memory (Sun Yat-sen University, 2305)
- HiMem paper (arXiv:2601.06377); evaluated on MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries (HKUST, 2401)