MultiHop-RAG
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries
Benchmark Metadata
Publisher: HKUST
Venue: COLM 2024
Evaluation Type: automatic
Dimensions: 4
Test Prompts: 2,556
Scoring: Higher is better
Update Frequency: annual
What It Measures
- Inference, comparison, and temporal multi-hop queries
- Retrieval recall@k for evidence chunks
- Final answer accuracy
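The page does not spell out how recall@k is computed, so the following is only a minimal sketch of one common definition: the fraction of gold evidence chunks that appear among the top k retrieved chunks. The function name `recall_at_k` and the chunk IDs are hypothetical, chosen for illustration, and the benchmark's own scoring may differ.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], gold: Set[str], k: int) -> float:
    """Fraction of gold evidence chunk IDs found in the top-k retrieved IDs.

    Assumption: chunks are identified by string IDs and `retrieved`
    is ordered from most to least relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not gold:
        # No evidence to find; treat recall as 0.0 by convention here.
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & gold) / len(gold)


# Hypothetical multi-hop query that needs two evidence chunks.
ranking = ["c3", "c1", "c9", "c2"]
evidence = {"c1", "c2"}
print(recall_at_k(ranking, evidence, k=2))  # one of two gold chunks in top 2
print(recall_at_k(ranking, evidence, k=4))  # both gold chunks in top 4
```

For multi-hop queries this matters because a correct answer typically needs every evidence chunk, so a retriever that finds only one of two chunks scores 0.5 here even though the final answer may still fail.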
What It Does Not Measure
- Personalization
- Cross-session memory
- Long-context window stress
All Systems Evaluated (149 systems)
5 self-reported, 144 estimated
| Rank | System | Developer / Source | Score |
|---|---|---|---|
| #1 | Qdrant | Qdrant | 75 |
| #2 | AllegroGraph | Franz Inc. | 75 |
| #3 | AppAgent | Tencent / mnotgod96 | 75 |
| #4 | Athina AI | Athina AI (YC W23) | 75 |
| #5 | AutoWebGLM | THUDM | 75 |
| #6 | Backboard IO | Backboard.io | 75 |
| #7 | Botpress | Botpress Inc. | 75 |
| #8 | BrowserGym | ServiceNow Research | 75 |
| #9 | D-Mem | You et al. (2025) | 75 |
| #10 | DB-GPT | eosphoros-ai | 75 |
| #11 | Diffbot | Diffbot Inc. | 75 |
| #12 | Dify | LangGenius | 75 |
| #13 | GraphRAG-SDK | FalkorDB | 75 |
| #14 | HiMem | Zhu et al. (JD.com, 2026) | 75 |
| #15 | HippoRAG | OSU NLP Group (Ohio State University) | 75 |
| #16 | HippoRAG 2 | OSU NLP Group | 75 |
| #17 | HuggingGPT / JARVIS | Microsoft Research | 75 |
| #18 | KnowAgent | zjunlp (Zhejiang University) | 75 |
| #19 | MCP Memory Server | Anthropic / Model Context Protocol | 75 |
| #20 | MemR3 | 2025 (December submission) | 75 |
| #21 | MIRIX | MIRIX AI (Wang, Chen) | 75 |
| #22 | MultiOn | MultiOn (now AGI Inc.) | 75 |
| #23 | Nano GraphRAG | gusye1234 | 75 |
| #24 | Ontotext GraphDB | Ontotext / Graphwise (merged with Semantic Web Company, 2025) | 75 |
| #25 | PathRAG | BUPT-GAMMA | 75 |
| #26 | RMM | Google / UCSB (2025) | 75 |
| #27 | Stardog | Stardog Union Inc. | 75 |
| #28 | SuperAGI | TransformerOptimus | 75 |
| #29 | TrustRAG | GoMate Community | 75 |
| #30 | Vellum AI | Vellum AI Inc. (YC W23) | 75 |
| #31 | xmemory | xmemory Inc. | 75 |
| #32 | Cradle | BAAI-Agents | 74.7 |
| #33 | Hebbia | Hebbia, Inc. | 74.6 |
| #34 | R2R | SciPhi-AI | 74.6 |
| #35 | LangGraph | LangChain | 74.5 |
| #36 | Onyx | onyx-dot-app | 74.5 |
| #37 | Voiceflow | Voiceflow Inc. | 74.4 |
| #38 | AutoGen Core Memory | Microsoft | 74.3 |
| #39 | KAG | OpenSPG / Ant Group | 74.3 |
| #40 | Lagent | InternLM (Shanghai AI Lab) | 74.3 |
| #41 | Cognigy | Cognigy GmbH (acquired by NICE, July 2025) | 74.2 |
| #42 | Neo4j AuraDB | Neo4j Inc. | 74.2 |
| #43 | Neo4j LLM Graph Builder | Neo4j Labs | 74.2 |
| #44 | WebVoyager | MinorJerry et al. | 74.2 |
| #45 | AutoGen Studio | Microsoft Research | 74.1 |
| #46 | VectorShift | VectorShift Inc. (YC S23) | 74.1 |
| #47 | GraphRAG | Microsoft | 74 |
| #48 | LangSmith LangGraph Cloud | LangChain Inc. | 73.9 |
| #49 | Self-RAG | University of Washington / Allen AI (Asai et al.) | 73.9 |
| #50 | AutoGPT Platform | Significant Gravitas | 73.7 |
| #51 | Galileo AI | Galileo Technologies Inc. | 73.6 |
| #52 | RAGFlow | InfiniFlow | 73.5 |
| #53 | MemoryScope | Alibaba ModelScope | 73.2 |
| #54 | CrewAI | CrewAI Inc. (Joao Moura) | 73.2 |
| #55 | Swarms | kyegomez / Swarms Corp | 73.2 |
| #56 | Maxim AI | Maxim AI Inc. | 73.1 |
| #57 | Open Interpreter | OpenInterpreter | 72.9 |
| #58 | Milvus | Zilliz | 72.8 |
| #59 | AgentVerse | OpenBMB (Tsinghua) | 72.8 |
| #60 | Dust tt | Dust (formerly XP1) | 72.7 |
| #61 | Lindy AI | Lindy AI | 72.7 |
| #62 | ChatDB | Tsinghua University (Hu et al.) | 72.6 |
| #63 | AriGraph | AIRI Institute / Moscow | 72.5 |
| #64 | Mixedbread AI | Mixedbread AI | 72.5 |
| #65 | PaperQA2 | FutureHouse | 72.5 |
| #66 | Bisheng | dataelement | 72.4 |
| #67 | SID AI | SID (YC) | 72.4 |
| #68 | Supermemory | Supermemory | 72.4 |
| #69 | MoT | Fudan (Li, Qiu) | 72.3 |
| #70 | Haystack Memory | deepset | 72.1 |
| #71 | Langflow | Langflow-ai (DataStax) | 72 |
| #72 | Nemori | Nemori AI (independent) | 72 |
| #73 | Qwen-Agent | QwenLM (Alibaba) | 72 |
| #74 | Synapse | NTU / Salesforce (Zheng et al.) | 72 |
| #75 | Pinecone | Pinecone Systems | 71.9 |
| #76 | LightRAG | HKUDS (HKU Data Intelligence Lab) | 71.9 |
| #77 | HoneyHive | HoneyHive Inc. | 71.7 |
| #78 | Flowise | FlowiseAI | 71.3 |
| #79 | HybridAGI | SynaLinks | 71.3 |
| #80 | Larimar | IBM Research | 71.3 |
| #81 | Marker | Datalab (datalab-to) | 71.3 |
| #82 | CAMEL | CAMEL-AI.org | 71.1 |
| #83 | Glean | Glean Technologies | 71.1 |
| #84 | LangMem | LangChain | 71 |
| #85 | Cohere Embed | Cohere Inc. | 70.9 |
| #86 | Kore AI | Kore.ai Inc. | 70.8 |
| #87 | Weaviate | Weaviate | 70.7 |
| #88 | ChatDev 2.0 | OpenBMB | 70.5 |
| #89 | AGiXT | Josh-XT | 70.4 |
| #90 | Think-in-Memory | Ant Group / Alibaba (Liu et al.) | 70.4 |
| #91 | Memoripy | caspianmoon | 70.3 |
| #92 | Synapse | Nanyang Technological University (Zheng et al.) | 70.3 |
| #93 | LlamaIndex Memory | LlamaIndex | 70.2 |
| #94 | MetaGPT | DeepWisdom / geekan | 70.2 |
| #95 | Stack AI | Stack AI Inc. (YC W23) | 70.2 |
| #96 | GPTeam | 101dotxyz | 69.7 |
| #97 | Nomic Atlas | Nomic AI Inc. | 69.7 |
| #98 | Mobile-Agent | Alibaba Tongyi Lab (X-PLUG) | 69.4 |
| #99 | Chroma | Chroma | 69.2 |
| #100 | FastGPT | labring | 69.2 |
| #101 | Selfmem | Tsinghua / Microsoft (Cheng et al.) | 69 |
| #102 | Voyage AI | Voyage AI (acquired by MongoDB, Feb 2025) | 69 |
| #103 | MiniRAG | HKUDS | 68.4 |
| #104 | MemoChat | University of Warwick / Alibaba | 67.9 |
| #105 | Memori | GibsonAI | 67.6 |
| #106 | Graphiti | Zep AI | 66.4 |
| #107 | Mnemosyne | independent | 66.1 |
| #108 | ParadeDB | ParadeDB Inc. (YC S23) | 65.8 |
| #109 | REALM | Google Research (Guu et al.) | 65.6 |
| #110 | Couchbase Vector | Couchbase Inc. | 65.5 |
| #111 | txtai | NeuML | 65.4 |
| #112 | Elasticsearch Vector | Elastic N.V. | 65.4 |
| #113 | PrivateGPT | Zylon AI | 65.3 |
| #114 | Mem AI | Mem Labs | 64.9 |
| #115 | Vectara | Vectara Inc. | 64.9 |
| #116 | Zep | Zep AI | 64.8 |
| #117 | Astra DB | DataStax | 64.6 |
| #118 | Saner AI | Saner.AI | 64.6 |
| #119 | Marqo | Marqo Pty Ltd | 64.3 |
| #120 | MyScale | MyScale Inc. | 64.1 |
| #121 | ColPali | illuin-tech | 63.9 |
| #122 | RETRO | DeepMind (Borgeaud et al.) | 63.8 |
| #123 | MemoryBank | Harbin Institute of Technology / SenseTime | 63.6 |
| #124 | OpenSearch Vector | OpenSearch Project (AWS-led) | 63.2 |
| #125 | Neon Vector | Neon Inc. | 63.1 |
| #126 | Redis Vector | Redis Ltd. | 63 |
| #127 | Atlas | Meta AI FAIR (Izacard et al.) | 62.9 |
| #128 | Carbon AI | Carbon (acquired by Perplexity, Dec 2024) | 62.9 |
| #129 | Cognita | TrueFoundry | 62.9 |
| #130 | LlamaCloud | LlamaIndex Inc. | 62.9 |
| #131 | Unstructured IO | Unstructured Technologies Inc. | 62.8 |
| #132 | Mendable | Mendable (YC-backed) | 62.7 |
| #133 | Manticore Search | Manticore Software Ltd. | 62.3 |
| #134 | Verba | Weaviate | 62.3 |
| #135 | Epsilla | Epsilla Inc. (YC S23) | 61.6 |
| #136 | Ragie | Ragie Inc. | 61.5 |
| #137 | pgvector Supabase Neon | pgvector OSS / Supabase Inc. / Neon Inc. | 61.3 |
| #138 | TurboPuffer | TurboPuffer Inc. | 60.9 |
| #139 | Supabase Vector | Supabase Inc. | 60.8 |
| #140 | Vespa AI | Yahoo / Vespa.ai (independent OSS project) | 60.5 |
| #141 | Quivr | QuivrHQ | 60.4 |
| #142 | SingleStore Vector | SingleStore Inc. | 60.4 |
| #143 | AnythingLLM | Mintplex Labs | 60.3 |
| #144 | MongoDB Atlas Vector | MongoDB Inc. | 60.1 |
| #145 | Memoro | MIT Media Lab | 60 |
| #146 | Vald | Yahoo Japan | 60 |
| #147 | vectorize | Vectorize Inc. | 59.3 |
| #148 | Cognee | Cognee | 57.2 |
| #149 | Memary | Kingjulio8238 | 52.7 |