Reflexion

by Northeastern / MIT / Princeton (Shinn et al.)

System Card

Organization: Northeastern / MIT / Princeton (Shinn et al.)
Released: 2023-03
Architecture: agentic-workflow / Verbal reinforcement via episodic reflection buffer
Details: Agents verbally reflect on task feedback signals and keep their own reflective text in an episodic memory buffer, which induces better decisions in subsequent trials. The approach avoids weight updates by using language as the policy encoding.
Parameters:
Domain: agent-memory, episodic-session, lifelong-learning
Open Source: Yes
Tags: verbal-rl, self-reflection, episodic-buffer, neurips-2023
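
The reflect-and-retry loop described under Details can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`reflexion_loop`, `toy_act`, `toy_evaluate`, `toy_reflect`), the bounded buffer size, and the toy task are all assumptions made for illustration, and the three callables stand in for what would be LLM calls in a real system.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def reflexion_loop(
    task: str,
    act: Callable[[str, List[str]], str],
    evaluate: Callable[[str, str], Tuple[bool, str]],
    reflect: Callable[[str, str, str], str],
    max_trials: int = 3,
    buffer_size: int = 3,  # assumed bound on stored reflections
) -> Tuple[str, List[str]]:
    """Retry a task, carrying verbal reflections between trials.

    No weights are updated: the only state that changes between
    trials is the text held in the episodic buffer.
    """
    memory: Deque[str] = deque(maxlen=buffer_size)  # oldest entries drop off
    attempt = ""
    for _ in range(max_trials):
        attempt = act(task, list(memory))            # act, conditioned on reflections
        success, feedback = evaluate(task, attempt)  # task feedback signal
        if success:
            break
        memory.append(reflect(task, attempt, feedback))  # store a verbal lesson
    return attempt, list(memory)


# Hypothetical stand-ins for the model calls, used only to make the sketch runnable.
def toy_act(task: str, reflections: List[str]) -> str:
    # Changes its answer only when a stored reflection tells it to.
    return "42" if any("42" in r for r in reflections) else "41"


def toy_evaluate(task: str, attempt: str) -> Tuple[bool, str]:
    ok = attempt == "42"
    return ok, "correct" if ok else "expected 42"


def toy_reflect(task: str, attempt: str, feedback: str) -> str:
    return f"I answered {attempt}, but the feedback was: {feedback}. Next time answer 42."


answer, reflections = reflexion_loop("toy task", toy_act, toy_evaluate, toy_reflect)
```

In this toy run the first trial fails, one reflection is written to the buffer, and the second trial succeeds because the actor reads that reflection. Bounding the buffer with `deque(maxlen=...)` is one simple way to keep the accumulated reflections within a model's context window.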

Capability Profile

Benchmark Scores

6 of 14 benchmarks

Long-Context Retrieval (0/5)
  RULER: no data
  NIAH: no data
  LooGLE: no data
  LongBench: no data
  ∞Bench: no data

Multi-Turn Recall (2/2)
  LoCoMo: 81.291p
  MemoryBank: 78.470p

Cross-Session Memory (1/1)

Multi-Hop QA (2/3)
  BABILong: 72.725p
  MultiHop-RAG: no data

Agent Task Memory (1/1)

Personalization (0/1)
  PerLTQA: no data

Factuality / Grounding (0/1)
  RAGAS: no data