Landmark Attention
by EPFL (Mohtashami, Jaggi)
System Card
Organization: EPFL (Mohtashami, Jaggi)
Released: 2023-05
Architecture: kv-cache-extension / Block-level landmark tokens with direct attention retrieval
Details: Inserts landmark tokens representing each input block and trains attention to use them for selecting relevant blocks. Retrieval flows through the model's own attention mechanism, preserving random access to the full context. (A minimal code sketch of this block-retrieval idea appears after the field list below.)
Parameters: —
Domain: long-context
Open Source: Yes
Paper: arXiv:2305.16300
Code: Repository
Tags: neurips-2023, random-access, block, retrieval-by-attention
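To make the Details entry concrete, the sketch below illustrates the landmark idea at a toy scale. It is not the authors' implementation: block_size, top_k, the landmark token id, and the random key/query vectors are all illustrative assumptions. It shows the two basic steps described above: appending one landmark token per input block, and scoring cached landmark keys with the current query to pick which blocks to retrieve.

# Illustrative sketch only (assumed names and shapes), not the paper's code.
import torch

def insert_landmarks(tokens: torch.Tensor, block_size: int, landmark_id: int) -> torch.Tensor:
    """Append a landmark token after every block of `block_size` tokens."""
    blocks = tokens.split(block_size)
    landmark = torch.tensor([landmark_id], dtype=tokens.dtype)
    return torch.cat([torch.cat([b, landmark]) for b in blocks])

def retrieve_blocks(q: torch.Tensor, landmark_keys: torch.Tensor, top_k: int) -> torch.Tensor:
    """Score each cached block via its landmark key and return the top-k block indices."""
    scores = (q @ landmark_keys.T) / landmark_keys.shape[-1] ** 0.5  # one score per block
    return torch.topk(scores, k=min(top_k, landmark_keys.shape[0])).indices

# Toy usage: two blocks of four tokens each, landmark id 0 marks block boundaries.
tokens = torch.arange(1, 9)
print(insert_landmarks(tokens, block_size=4, landmark_id=0))

d = 16
q = torch.randn(d)                 # current query vector (assumed)
landmark_keys = torch.randn(8, d)  # one key per cached block's landmark (assumed)
print(retrieve_blocks(q, landmark_keys, top_k=2))

In the actual method the block scores come from the model's own attention heads rather than an external retriever, which is what preserves random access to the full context.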
Capability Profile
Benchmark Scores
6 of 14 benchmarks
Multi-Turn Recall: 0/2
  LoCoMo: no data
  MemoryBank: no data
Cross-Session Memory: 0/1
  LongMemEval: no data
Multi-Hop QA: 1/3
Agent Task Memory: 0/1
  AgentBench-Mem: no data
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 0/1
  RAGAS: no data

Sources:
Landmark Attention paper (arXiv:2305.16300); evaluated on BABILong: Testing the Limits of LLMs with Long-Context Reasoning-in-a-Haystack (AIRI, 2406)
Landmark Attention paper (arXiv:2305.16300); evaluated on InfiniteBench: Extending Long Context Evaluation Beyond 100K Tokens (Tsinghua / OpenBMB, 2402)
Landmark Attention paper (arXiv:2305.16300); evaluated on LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding (Tsinghua KEG, 2308)
Landmark Attention paper (arXiv:2305.16300); evaluated on LooGLE: Can Long-Context Language Models Understand Long Contexts? (Peking University, 2311)
Landmark Attention paper (arXiv:2305.16300); evaluated on Needle in a Haystack (Greg Kamradt, 2024)
Landmark Attention paper (arXiv:2305.16300); evaluated on RULER: What's the Real Context Size of Your Long-Context Language Models (NVIDIA, 2404)