PaperQA2

by FutureHouse

System Card

Organization: FutureHouse
Released: 2024-01
Architecture: agentic-workflow / Agentic RAG for scientific papers (3-phase)
Details: Three-phase agent (search -> evidence gathering with embedding + LLM re-scoring -> answer). Metadata-aware embeddings, automatic paper metadata + citation/retraction checks, multimodal tables/figures/equations.
Parameters: not listed
Domain: rag-retrieval
Open Source: Yes
Tags: science, agentic-rag, citations, multimodal
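
The three-phase flow described under Details (search, evidence gathering with re-scoring, answer) can be illustrated with a toy sketch. This is not PaperQA2's code or API: every name below (`Chunk`, `embed`, `search`, `rescore`, `gather_evidence`, `answer`, `run_pipeline`) is hypothetical, the bag-of-words `embed` stands in for a learned embedding model, and the term-overlap `rescore` stands in for the LLM re-scoring step.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    paper: str  # citation key of the source paper (hypothetical field)
    text: str   # passage text


def embed(text: str) -> Counter:
    # Assumption: bag-of-words counts stand in for a learned embedding.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, corpus: list[Chunk], k: int = 3) -> list[Chunk]:
    # Phase 1: rank chunks by embedding similarity and keep the top k.
    q = embed(query)
    ranked = sorted(corpus, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]


def rescore(query: str, chunk: Chunk) -> float:
    # Assumption: fraction of query terms present in the chunk stands in
    # for an LLM judging how relevant the chunk is to the question.
    terms = set(query.lower().split())
    return len(terms & set(chunk.text.lower().split())) / len(terms)


def gather_evidence(query: str, candidates: list[Chunk],
                    min_score: float = 0.5) -> list[Chunk]:
    # Phase 2: re-score candidates and drop those below the threshold.
    scored = [(rescore(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score >= min_score]


def answer(query: str, evidence: list[Chunk]) -> str:
    # Phase 3: in place of LLM generation, join the evidence with citations.
    if not evidence:
        return "No supporting evidence found."
    return " ".join(f"{c.text} [{c.paper}]" for c in evidence)


def run_pipeline(query: str, corpus: list[Chunk], k: int = 3) -> str:
    return answer(query, gather_evidence(query, search(query, corpus, k)))
```

The split into a cheap first-pass ranking and a more selective second pass mirrors the "embedding + LLM re-scoring" wording above; in this sketch the second pass is only a term-overlap filter, so it shows the control flow rather than the quality of a real re-scorer.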

Benchmark Scores

5 of 14 benchmarks

Long-Context Retrieval (2/5)
RULER: 69.131p
NIAH: no data
LooGLE: no data
∞Bench: no data

Multi-Turn Recall (0/2)
LoCoMo: no data
MemoryBank: no data

Cross-Session Memory (0/1)
LongMemEval: no data

Multi-Hop QA (2/3)
BABILong: no data
HotpotQA: 72.961p

Agent Task Memory (0/1)
AgentBench-Mem: no data

Personalization (0/1)
PerLTQA: no data

Factuality / Grounding (1/1)
RAGAS: 69.148p