H2O

by UT Austin / Rice / CMU / Stanford / Meta (Zhang et al.)

System Card

Organization: UT Austin / Rice / CMU / Stanford / Meta (Zhang et al.)
Released: 2023-06
Architecture: kv-cache-extension / Dynamic KV eviction of non-Heavy-Hitter tokens
Details: Frames KV cache eviction as a dynamic submodular problem with theoretical guarantees. Retains a mix of recent tokens and "heavy hitters" (tokens that accumulate high attention scores) and evicts the rest to shrink the cache.
Parameters:
Domain: long-context
Open Source: Yes
neurips-2023, kv-eviction, heavy-hitter, throughput
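
The eviction policy summarized under Details can be illustrated with a small sketch. This is a simplified, single-head toy written for this card, not the authors' implementation: the function name `h2o_step`, its parameters, and the attention values in the example are all hypothetical, and a real system would track scores per layer and per head on tensors rather than in Python dictionaries.

```python
from typing import Dict, List


def h2o_step(cache: List[int], acc: Dict[int, float], attn: Dict[int, float],
             new_pos: int, budget: int, recent: int) -> List[int]:
    """One decode step of a simplified heavy-hitter eviction policy.

    cache   : token positions currently kept in the KV cache (ascending)
    acc     : accumulated attention score per cached position (updated in place)
    attn    : attention weights the new token assigns to cached positions
    new_pos : position of the token being appended
    budget  : maximum number of positions to keep
    recent  : number of newest positions that are never evicted
    """
    assert 0 <= recent < budget, "the recent window must fit inside the budget"

    # Accumulate the attention each cached token received at this step.
    for pos in cache:
        acc[pos] = acc.get(pos, 0.0) + attn.get(pos, 0.0)
    cache = cache + [new_pos]
    acc.setdefault(new_pos, 0.0)

    # While over budget, evict the lowest-scoring position outside the
    # protected recent window; high scorers ("heavy hitters") survive.
    while len(cache) > budget:
        candidates = cache[:-recent] if recent > 0 else cache
        victim = min(candidates, key=lambda p: acc[p])
        cache.remove(victim)
        del acc[victim]
    return cache


# Hypothetical attention rows: position 0 is attended heavily at every step.
rows = [
    {},
    {0: 1.0},
    {0: 0.7, 1: 0.3},
    {0: 0.6, 1: 0.1, 2: 0.3},
    {0: 0.5, 2: 0.1, 3: 0.4},
]
cache: List[int] = []
acc: Dict[int, float] = {}
for pos, row in enumerate(rows):
    cache = h2o_step(cache, acc, row, pos, budget=3, recent=2)

print(cache)  # [0, 3, 4]
```

With a budget of three entries and a recent window of two, position 0 stays cached because of its accumulated score, while the low-scoring middle positions 1 and 2 are evicted as newer tokens arrive.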

Benchmark Scores

6 of 14 benchmarks
Long-Context Retrieval (5/5)
RULER: 74.477p
NIAH: 76.169p
LooGLE: 75.941p
∞Bench: 76.225p

Multi-Turn Recall (0/2)
LoCoMo: no data
MemoryBank: no data

Cross-Session Memory (0/1)
LongMemEval: no data

Multi-Hop QA (1/3)
BABILong: 73.941p
MultiHop-RAG: no data
HotpotQA: no data

Agent Task Memory (0/1)
AgentBench-Mem: no data

Personalization (0/1)
PerLTQA: no data

Factuality / Grounding (0/1)
RAGAS: no data