
∞Bench

InfiniteBench: Extending Long Context Evaluation Beyond 100K Tokens

Benchmark Metadata

Publisher: Tsinghua / OpenBMB
Venue: ACL 2024
Evaluation Type: automatic
Dimensions: 12
Test Prompts: 3,946
Scoring: Higher is better
Update Frequency: annual

What It Measures

  • Retrieval at 100k+ tokens
  • Math and code over long contexts
  • Novel and dialogue QA
  • Key-value retrieval
  • Summarization over book-length input
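
As a rough illustration of the key-value retrieval dimension, the sketch below builds a synthetic prompt and scores a model response. The function names, prompt wording, and substring-match scoring are assumptions made for this example, not the benchmark's official harness.

```python
import json
import random
import uuid


def build_kv_prompt(num_pairs: int, seed: int = 0):
    """Build a synthetic key-value retrieval prompt (hypothetical helper)."""
    rng = random.Random(seed)
    # Random UUID-style keys and values, so the answer cannot be guessed
    # from meaning and must be located in the context.
    pairs = {
        str(uuid.UUID(int=rng.getrandbits(128))): str(uuid.UUID(int=rng.getrandbits(128)))
        for _ in range(num_pairs)
    }
    target_key = rng.choice(list(pairs))
    # Assumed prompt wording; a real harness may phrase the question differently.
    prompt = (
        "Below is a JSON object with many key-value pairs.\n"
        f"{json.dumps(pairs)}\n"
        f'What is the value for key "{target_key}"?'
    )
    return prompt, pairs[target_key]


def exact_match(prediction: str, answer: str) -> float:
    """Return 1.0 if the expected value appears in the model output, else 0.0."""
    return 1.0 if answer in prediction else 0.0
```

Raising `num_pairs` grows the context, so the same probe can be scaled toward the 100k+ token range the benchmark targets.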

What It Does Not Measure

  • Multi-session memory
  • Personalization
  • Real-time latency

All Systems Evaluated (32 systems)

Rank | System | Organization | Score
#1 | EM-LLM | em-llm (academic consortium) | 96.7
#2 | Titans | lucidrains (community) / paper by Google Research | 87.4
#3 | LM-Infinite | Illinois / Meta (Han et al.) | 85
#4 | Mamba | CMU / Princeton (Gu, Dao) | 85
#5 | Scissorhands | Rice / Stanford / Meta (Liu et al.) | 84.5
#6 | Jina AI Embeddings | Jina AI GmbH | 83.1
#7 | Compressive Transformer | DeepMind (Rae et al.) | 82.8
#8 | R3Mem | HKUST (2025) | 81.9
#9 | Memorizing Transformer | Google Research (Wu, Rabe, Hutchins, Szegedy) | 80.3
#10 | MemoryLLM | UCSD / Apple (Wang et al.) | 80
#11 | Landmark Attention | EPFL (Mohtashami, Jaggi) | 79.5
#12 | TRIME | Princeton NLP (Zhong, Lei, Chen) | 79.5
#13 | GAM | VectorSpaceLab (BAAI-related) | 79
#14 | RAPTOR | Stanford (Sarthi, Abdullah et al.) | 79
#15 | LongMem | UCSB / Microsoft Research | 78.7
#16 | Memformer | UC Santa Barbara / Amazon (Wu, Lan, Liu, et al.) | 78.7
#17 | ∞ Former | Instituto de Telecomunicações / DeepMind / IST (Martins, Marinho, Martins) | 78.2
#18 | ICAE | Microsoft Research (Ge et al.) | 77.8
#19 | Recurrent Memory Transformer | MIPT / DeepPavlov (Bulatov, Kuratov, Burtsev) | 77.8
#20 | Activation Beacon | BAAI / Renmin University (Zhang et al.) | 77.5
#21 | HEMA | independent (Ahn et al.) | 77.2
#22 | Memory³ | Institute for Advanced Algorithms Research Shanghai / Peking University | 77.2
#23 | H2O | UT Austin / Rice / CMU / Stanford / Meta (Zhang et al.) | 76.2
#24 | SCM | Beihang / NLPR (Wang et al.) | 76.2
#25 | ReMe | ModelScope (Alibaba) | 75
#26 | Adept AI | Adept AI Labs (acquired by Amazon 2024) | 73.5
#27 | RWKV | RWKV Foundation / BlinkDL community | 72.6
#28 | AgentScope | ModelScope (Alibaba) | 70.3
#29 | LanceDB | LanceDB Inc. (YC S22) | 70.3
#30 | StreamingLLM | MIT Han Lab / Meta AI (Xiao et al.) | 70.1
#31 | Activeloop Deep Lake | Activeloop Inc. | 69.3
#32 | MemoRAG | BAAI / Qhjqhj00 | 24.5