H2O
by UT Austin / Rice / CMU / Stanford / Meta (Zhang et al.)
System Card
Organization: UT Austin / Rice / CMU / Stanford / Meta (Zhang et al.)
Released: 2023-06
Architecture: kv-cache-extension / Dynamic KV eviction of non-Heavy-Hitter tokens
Details: Frames KV cache eviction as a dynamic submodular problem with theoretical guarantees. Retains a balance of recent tokens and "heavy hitters" (tokens that accumulate high attention scores), evicting the rest to shrink the cache.
Parameters: not listed
Domain: long-context
Open Source: Yes
Paper: View Paper
Code: Repository
Tags: neurips-2023, kv-eviction, heavy-hitter, throughput
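The eviction policy summarized under Details can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `accumulate` and `h2o_keep_indices`, the two budget parameters, and the use of a single flat score list are assumptions made for illustration. It shows only the greedy idea of keeping the most recent tokens plus the older tokens with the highest accumulated attention.

```python
from typing import List


def accumulate(acc_scores: List[float], attn_row: List[float]) -> List[float]:
    """Add the newest query's attention weights to the running per-token totals."""
    return [a + w for a, w in zip(acc_scores, attn_row)]


def h2o_keep_indices(acc_scores: List[float], num_heavy: int, num_recent: int) -> List[int]:
    """Return the sorted cache positions to keep under a heavy-hitter + recency budget.

    acc_scores[i] is the attention mass token i has accumulated so far
    (hypothetical input for this sketch).
    """
    n = len(acc_scores)
    if n <= num_heavy + num_recent:
        return list(range(n))  # cache fits in the budget, nothing to evict

    # The last num_recent positions are always kept.
    recent_start = n - num_recent
    recent = list(range(recent_start, n))

    # Among the older tokens, keep those with the largest accumulated attention.
    older = range(recent_start)
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[:num_heavy]

    return sorted(heavy) + recent


scores = [0.9, 0.1, 0.05, 0.7, 0.2, 0.3]
print(h2o_keep_indices(scores, num_heavy=2, num_recent=2))  # [0, 3, 4, 5]
```

In this example, positions 4 and 5 survive because they are recent, positions 0 and 3 survive because they have the highest accumulated scores among the older tokens, and positions 1 and 2 are evicted.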
Capability Profile
Benchmark Scores
6 of 14 benchmarks
Data Transparency: 6 estimated
Long-Context Retrieval: 5/5
Multi-Turn Recall: 0/2
  LoCoMo: no data
  MemoryBank: no data
Cross-Session Memory: 0/1
  LongMemEval: no data
Multi-Hop QA: 1/3
Agent Task Memory: 0/1
  AgentBench-Mem: no data
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 0/1
  RAGAS: no data