RAPTOR

by Stanford (Sarthi, Abdullah et al.)

System Card

Organization: Stanford (Sarthi, Abdullah et al.)

Released: 2024-01

Architecture: hierarchical-summary / recursive bottom-up clustering + summarization tree

Details: Recursively embeds, clusters, and summarizes text chunks to build a multi-level tree. At inference time, retrieval draws on nodes from every level of the tree, so the answer can combine fine-grained chunk detail with higher-level summaries.
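The build-and-retrieve loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `cluster` is a simple greedy nearest-neighbour grouping swapped in for a proper clustering algorithm, and `summarize` just concatenates text where a real system would call an LLM. All names (`Node`, `build_tree`, `retrieve`) are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    level: int                      # 0 = leaf chunk, higher = more abstract
    children: list = field(default_factory=list)


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(nodes, size=2):
    """Greedy grouping (assumption): seed with the first unassigned node,
    then attach its most similar remaining neighbours."""
    remaining = list(nodes)
    groups = []
    while remaining:
        seed = remaining.pop(0)
        seed_vec = embed(seed.text)
        remaining.sort(key=lambda n: cosine(seed_vec, embed(n.text)), reverse=True)
        groups.append([seed] + remaining[:size - 1])
        remaining = remaining[size - 1:]
    return groups


def summarize(nodes):
    """Placeholder summarizer (assumption): joins the children's text."""
    return " ".join(n.text for n in nodes)


def build_tree(chunks, size=2):
    """Build the tree bottom-up and return all nodes, leaves first."""
    if size < 2:
        raise ValueError("size must be at least 2 so each level shrinks")
    layer = [Node(c, 0) for c in chunks]
    all_nodes = list(layer)
    level = 0
    while len(layer) > 1:
        level += 1
        layer = [Node(summarize(g), level, g) for g in cluster(layer, size)]
        all_nodes.extend(layer)
    return all_nodes


def retrieve(all_nodes, query, k=2):
    """Score nodes from every level together and return the top k."""
    q = embed(query)
    return sorted(all_nodes, key=lambda n: cosine(q, embed(n.text)), reverse=True)[:k]


nodes = build_tree(["cats purr", "cats nap", "dogs bark", "dogs fetch"])
best = retrieve(nodes, "cats", k=1)[0]
```

In this toy run the best match for "cats" is a level-1 summary node rather than a leaf, which is the point of searching all levels at once: a query can land on whichever abstraction granularity fits it best.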

Parameters: n/a

Domain: rag-retrieval, long-context

Open Source: Yes

Paper: arXiv:2401.18059

Code: Repository

Tags: iclr-2024, tree, recursive-summary, multi-hop

Capability Profile

Benchmark Scores

6 of 14 benchmarks

Long-Context Retrieval

4/5

RULER

70.1 (45p)

NIAH

no data

LooGLE

77.1 (45p)

LongBench

60 (3p)

∞Bench

79 (56p)

Multi-Turn Recall

0/2

LoCoMo

no data

MemoryBank

no data

Cross-Session Memory

0/1

LongMemEval

no data

Multi-Hop QA

2/3

BABILong

76 (58p)

MultiHop-RAG

no data

HotpotQA

65 (30p)

Agent Task Memory

0/1

AgentBench-Mem

no data

Personalization

0/1

PerLTQA

no data

Factuality / Grounding

0/1

RAGAS

no data

Sources:

RAPTOR paper (arXiv:2401.18059); evaluated on LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding (Tsinghua KEG, 2308)

RAPTOR paper (arXiv:2401.18059); evaluated on RULER: What's the Real Context Size of Your Long-Context Language Models (NVIDIA, 2404)

RAPTOR paper (arXiv:2401.18059); evaluated on BABILong: Testing the Limits of LLMs with Long-Context Reasoning-in-a-Haystack (AIRI, 2406)

RAPTOR paper (arXiv:2401.18059); evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)

RAPTOR paper (arXiv:2401.18059); evaluated on InfiniteBench: Extending Long Context Evaluation Beyond 100K Tokens (Tsinghua / OpenBMB, 2402)

RAPTOR paper (arXiv:2401.18059); evaluated on LooGLE: Can Long-Context Language Models Understand Long Contexts? (Peking University, 2311)