RAGAS
RAGAS: Automated Evaluation of Retrieval-Augmented Generation
Benchmark Metadata
Publisher: Exploding Gradients
Venue: EACL 2024
Evaluation Type: automatic
Dimensions: 4
Test Prompts: 0
Scoring: Higher is better
Update Frequency: continuous
What It Measures
- Faithfulness of generated answers to retrieved context
- Answer relevance to the question
- Context precision and recall
- Hallucination rate
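
As a rough illustration of how the ratio-style metrics above can be computed, here is a minimal sketch in Python. It assumes the per-sample judgments (which answer claims are supported by the context, which retrieved chunks are relevant, which reference statements appear in the context) have already been produced by an LLM judge or a human annotator. The `JudgedSample` container and all function names are hypothetical and are not part of the RAGAS library. Context precision is shown as a plain unweighted fraction, and hallucination rate is assumed to be the complement of faithfulness; neither choice is confirmed by this page. Answer relevance is left out because it needs an embedding model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JudgedSample:
    """Hypothetical container for pre-computed judgments (not a RAGAS type)."""
    claims_supported: List[bool]   # one flag per claim extracted from the answer
    chunks_relevant: List[bool]    # one flag per retrieved context chunk
    reference_covered: List[bool]  # one flag per reference statement found in context


def _fraction_true(flags: List[bool]) -> float:
    """Share of True flags; 0.0 for an empty list to avoid division by zero."""
    return sum(flags) / len(flags) if flags else 0.0


def faithfulness(sample: JudgedSample) -> float:
    """Fraction of answer claims that the retrieved context supports."""
    return _fraction_true(sample.claims_supported)


def hallucination_rate(sample: JudgedSample) -> float:
    """Assumption: unsupported share of claims, i.e. 1 - faithfulness."""
    return 1.0 - faithfulness(sample)


def context_precision(sample: JudgedSample) -> float:
    """Simplified, unweighted: fraction of retrieved chunks judged relevant."""
    return _fraction_true(sample.chunks_relevant)


def context_recall(sample: JudgedSample) -> float:
    """Fraction of reference statements that appear in the retrieved context."""
    return _fraction_true(sample.reference_covered)


# Illustrative sample: 3 of 4 claims supported, 2 of 3 chunks relevant,
# 2 of 2 reference statements covered.
sample = JudgedSample(
    claims_supported=[True, True, False, True],
    chunks_relevant=[True, False, True],
    reference_covered=[True, True],
)
print(faithfulness(sample), hallucination_rate(sample))
print(context_precision(sample), context_recall(sample))
```

All four values fall in the range 0 to 1. How they relate to the leaderboard scores on this page is not stated here, so the sketch does not attempt to aggregate them.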
What It Does Not Measure
- Cross-session consistency
- Personalization
- Latency
All Systems Evaluated (67 systems)
| Rank | System | Publisher | Score |
|---|---|---|---|
| #1 | Claude Projects | Anthropic | 76.5 |
| #2 | HippoRAG 2 | OSU NLP Group | 76.5 |
| #3 | Glean | Glean Technologies | 75.4 |
| #4 | AppAgent | Tencent / mnotgod96 | 75 |
| #5 | KAG | OpenSPG / Ant Group | 74.3 |
| #6 | Pinecone | Pinecone Systems | 74.2 |
| #7 | HybridAGI | SynaLinks | 74.2 |
| #8 | Milvus | Zilliz | 73.4 |
| #9 | ChatDB | Tsinghua University (Hu et al.) | 73.4 |
| #10 | Neo4j LLM Graph Builder | Neo4j Labs | 73.3 |
| #11 | LightRAG | HKUDS (HKU Data Intelligence Lab) | 73.2 |
| #12 | MCP Memory Server | Anthropic / Model Context Protocol | 73.2 |
| #13 | GraphRAG | Microsoft | 73.1 |
| #14 | Weaviate | Weaviate | 73 |
| #15 | Stardog | Stardog Union Inc. | 72.8 |
| #16 | Qdrant | Qdrant | 72.6 |
| #17 | HuggingGPT / JARVIS | Microsoft Research | 72.3 |
| #18 | LlamaIndex Memory | LlamaIndex | 72.1 |
| #19 | Diffbot | Diffbot Inc. | 72 |
| #20 | Marker | Datalab (datalab-to) | 71.8 |
| #21 | R2R | SciPhi-AI | 71.7 |
| #22 | PathRAG | BUPT-GAMMA | 71.6 |
| #23 | TrustRAG | GoMate Community | 71.5 |
| #24 | Cognee | Cognee | 71.4 |
| #25 | AllegroGraph | Franz Inc. | 71.2 |
| #26 | MiniRAG | HKUDS | 70.9 |
| #27 | HippoRAG | OSU NLP Group (Ohio State University) | 70.8 |
| #28 | Nano GraphRAG | gusye1234 | 70.8 |
| #29 | Neo4j AuraDB | Neo4j Inc. | 70.7 |
| #30 | DB-GPT | eosphoros-ai | 70.6 |
| #31 | Chroma | Chroma | 70.5 |
| #32 | Haystack Memory | deepset | 70.4 |
| #33 | GraphRAG-SDK | FalkorDB | 70.3 |
| #34 | Ontotext GraphDB | Ontotext / Graphwise (merged with Semantic Web Company, 2025) | 69.8 |
| #35 | PaperQA2 | FutureHouse | 69.1 |
| #36 | txtai | NeuML | 68.9 |
| #37 | Marqo | Marqo Pty Ltd | 68 |
| #38 | OpenSearch Vector | OpenSearch Project (AWS-led) | 68 |
| #39 | ParadeDB | ParadeDB Inc. (YC S23) | 67.8 |
| #40 | Vespa AI | Yahoo / Vespa.ai (independent OSS project) | 67.7 |
| #41 | Mixedbread AI | Mixedbread AI | 67.5 |
| #42 | Vald | Yahoo Japan | 67.5 |
| #43 | Selfmem | Tsinghua / Microsoft (Cheng et al.) | 67.1 |
| #44 | Cohere Embed | Cohere Inc. | 67 |
| #45 | vectorize | Vectorize Inc. | 66.6 |
| #46 | Atlas | Meta AI FAIR (Izacard et al.) | 66.3 |
| #47 | ColPali | illuin-tech | 66 |
| #48 | RETRO | DeepMind (Borgeaud et al.) | 65.7 |
| #49 | PrivateGPT | Zylon AI | 65.6 |
| #50 | Vectara | Vectara Inc. | 65.2 |
| #51 | Verba | Weaviate | 65 |
| #52 | Manticore Search | Manticore Software Ltd. | 64.9 |
| #53 | MyScale | MyScale Inc. | 64.7 |
| #54 | Supabase Vector | Supabase Inc. | 64.5 |
| #55 | TurboPuffer | TurboPuffer Inc. | 64.5 |
| #56 | Unstructured IO | Unstructured Technologies Inc. | 64.5 |
| #57 | SingleStore Vector | SingleStore Inc. | 64.3 |
| #58 | Elasticsearch Vector | Elastic N.V. | 64.2 |
| #59 | Carbon AI | Carbon (acquired by Perplexity, Dec 2024) | 64.1 |
| #60 | Neon Vector | Neon Inc. | 64 |
| #61 | LlamaCloud | LlamaIndex Inc. | 63.7 |
| #62 | Mendable | Mendable (YC-backed) | 63.7 |
| #63 | Astra DB | DataStax | 63.6 |
| #64 | Cognita | TrueFoundry | 63.6 |
| #65 | Voyage AI | Voyage AI (acquired by MongoDB, Feb 2025) | 63.4 |
| #66 | pgvector Supabase Neon | pgvector OSS / Supabase Inc. / Neon Inc. | 63.2 |
| #67 | REALM | Google Research (Guu et al.) | 63.1 |