HuggingGPT / JARVIS
by Microsoft Research
System Card
Organization: Microsoft Research
Released: 2023-03
Architecture: knowledge-base / LLM controller + expert-model registry
Details: Four-stage controller (task planning, model selection, task execution, response generation) that pairs an LLM with a registry of Hugging Face expert models. Related work in the same project includes EasyTool (2024) and TaskBench (for evaluation).
Parameters: n/a
Domain: agent-memory, knowledge-graph
Open Source: Yes
Paper: View Paper
Code: Repository
Tags: jarvis, huggingface, task-planning, canonical
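
The four-stage controller described under Details can be illustrated with a minimal sketch. This is not the JARVIS implementation: the planner and response stages are plain-Python stubs standing in for LLM calls, and every name here (`ExpertModel`, `REGISTRY`, `plan_tasks`, `select_model`, `execute`, `respond`, `run_controller`, the toy models, and the `<prev>` placeholder) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ExpertModel:
    """Hypothetical registry entry: a named model bound to one task type."""
    name: str
    task: str
    run: Callable[[str], str]


# Toy registry standing in for a catalogue of Hugging Face expert models.
REGISTRY: List[ExpertModel] = [
    ExpertModel("toy-captioner", "image-to-text", lambda x: f"caption({x})"),
    ExpertModel("toy-translator", "translation", lambda x: f"translated({x})"),
]


def plan_tasks(request: str) -> List[Dict[str, str]]:
    """Stage 1, task planning: keyword stub in place of an LLM planner."""
    tasks: List[Dict[str, str]] = []
    if "caption" in request:
        tasks.append({"task": "image-to-text", "input": "photo.jpg"})
    if "translate" in request:
        # "<prev>" marks a dependency on the previous task's output.
        tasks.append({"task": "translation", "input": "<prev>"})
    return tasks


def select_model(task: str) -> ExpertModel:
    """Stage 2, model selection: first registered model matching the task."""
    for model in REGISTRY:
        if model.task == task:
            return model
    raise LookupError(f"no expert model registered for task {task!r}")


def execute(tasks: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Stage 3, task execution: run tasks in order, chaining outputs."""
    results: List[Dict[str, str]] = []
    prev = ""
    for t in tasks:
        model = select_model(t["task"])
        arg = prev if t["input"] == "<prev>" else t["input"]
        prev = model.run(arg)
        results.append({"task": t["task"], "model": model.name, "output": prev})
    return results


def respond(request: str, results: List[Dict[str, str]]) -> str:
    """Stage 4, response generation: string stub in place of an LLM summary."""
    steps = "; ".join(f"{r['model']} -> {r['output']}" for r in results)
    return f"Request: {request} | Steps: {steps}"


def run_controller(request: str) -> str:
    """Chain the four stages for a single request."""
    return respond(request, execute(plan_tasks(request)))


if __name__ == "__main__":
    print(run_controller("caption this image and translate it"))
```

In a real system the planner and the response stage would each be an LLM call, and model selection would rank candidates from the registry rather than take the first match.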
Capability Profile
Benchmark Scores
6 of 14 benchmarks
Data Transparency: 6 estimated

Long-Context Retrieval: 0/5
  RULER: no data
  NIAH: no data
  LooGLE: no data
  LongBench: no data
  ∞Bench: no data
Multi-Turn Recall: 1/2
  MemoryBank: no data
Cross-Session Memory: 1/1
Multi-Hop QA: 2/3
Agent Task Memory: 1/1
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 1/1