Self-RAG
by University of Washington / Allen AI (Asai et al.)
System Card
Organization: University of Washington / Allen AI (Asai et al.)
Released: 2023-10
Architecture: agentic-workflow / self-reflective on-demand retrieval with reflection tokens
Details: Trains a single LM that adaptively decides when to retrieve, then emits reflection tokens to critique retrieved passages and its own generations. Reflection tokens make the LM controllable at inference time.
Parameters: —
Domain: rag-retrieval, agent-memory
Open Source: Yes
Paper: View Paper
Website: Visit
Code: Repository
Tags: iclr-2024-oral, reflection-tokens, adaptive-retrieval, factuality
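The adaptive retrieve-then-critique behavior described under Details can be sketched as a short inference loop. This is a minimal illustration, not the paper's implementation: the `model` interface (`predict_retrieve_token`, `generate`, `score`) is a hypothetical stand-in, and only the reflection-token names in the comments mirror the paper's [Retrieve] / IsRel / IsSup / IsUse critique tokens.

```python
# Sketch of Self-RAG-style inference with a hypothetical model interface:
# the LM first decides whether retrieval is needed at all, then critiques
# each retrieved passage and its own generation via reflection-token scores.

def self_rag_answer(question, model, retriever):
    """Answer `question`, retrieving passages only when the LM asks to."""
    # Step 1: the LM emits a [Retrieve] decision for this input.
    if model.predict_retrieve_token(question) != "Yes":
        # No retrieval needed: answer from parametric knowledge alone.
        return model.generate(question, passage=None)

    # Step 2: generate one candidate answer per retrieved passage and
    # score it with the three critique reflection tokens.
    candidates = []
    for passage in retriever(question):
        answer = model.generate(question, passage=passage)
        score = (
            model.score("IsRel", question, passage, answer)    # passage relevant?
            + model.score("IsSup", question, passage, answer)  # answer supported?
            + model.score("IsUse", question, passage, answer)  # answer useful?
        )
        candidates.append((score, answer))

    # Step 3: return the generation the critic scored highest.
    return max(candidates)[1]
```

Because the critique happens per passage at inference time, the same trained model can be steered (e.g. weighting support over usefulness) without retraining, which is what the card means by "controllable at inference time."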
Capability Profile
Benchmark Scores
6 of 14 benchmarks covered:
- Long-Context Retrieval: 1/5
- Multi-Turn Recall: 1/2
- MemoryBank: no data
- Cross-Session Memory: 1/1
- Multi-Hop QA: 2/3
- Agent Task Memory: 1/1
- Personalization: 0/1
- PerLTQA: no data
- Factuality / Grounding: 0/1
- RAGAS: no data

Sources:
- Self-RAG paper (arXiv:2310.11511); evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
- Self-RAG paper (arXiv:2310.11511); evaluated on MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries (HKUST, 2401)
- Self-RAG paper (arXiv:2310.11511); evaluated on AgentBench Memory Track (Tsinghua KEG, 2308)
- Self-RAG paper (arXiv:2310.11511); evaluated on LoCoMo: Long-Term Conversational Memory Benchmark (Snap Research, 2402)
- Self-RAG paper (arXiv:2310.11511); evaluated on LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding (Tsinghua KEG, 2308)
- Self-RAG paper (arXiv:2310.11511); evaluated on LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory (Salesforce AI Research, 2410)