LongMemEval
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Benchmark Metadata
Publisher: Salesforce AI Research
Venue: arXiv preprint
Evaluation Type: automatic
Dimensions: 5
Test Prompts: 500
Scoring: Higher is better
Update Frequency: annual
What It Measures
- Information extraction across sessions
- Multi-session reasoning
- Knowledge update tracking
- Temporal reasoning
- Abstention on missing facts
What It Does Not Measure
- Single-turn factual recall
- Latency
- Token-cost efficiency
- Open-ended generation quality
All Systems Evaluated (173 systems)
19 self-reported, 154 estimated
| Rank | System | Organization / Authors | Score |
|---|---|---|---|
| #1 | MemPalace | Ben Sigman / Milla Jovovich (independent open-source) | 96.6 |
| #2 | Backboard IO | Backboard.io | 93.4 |
| #3 | Lyzr Cognis | Lyzr AI | 90.6 |
| #4 | Voyager | NVIDIA / Caltech / UT Austin / Stanford / ASU / UW (Wang et al.) | 87.1 |
| #5 | Pickle AI | Soul Computer (YC-backed) | 86.8 |
| #6 | xmemory | xmemory Inc. | 86.6 |
| #7 | ArcMemo | UC Berkeley / Stanford (Ho et al.) | 85.1 |
| #8 | SuperAGI | TransformerOptimus | 85.1 |
| #9 | Replika | Luka, Inc. | 84.9 |
| #10 | Swarms | kyegomez / Swarms Corp | 84 |
| #11 | MemR3 | 2025 (December submission) | 83.9 |
| #12 | OS-Copilot / FRIDAY | Shanghai AI Lab / MMLab (Wu et al.) | 83.3 |
| #13 | A-MEM | AGI Research / Rutgers | 83.1 |
| #14 | HippoRAG 2 | OSU NLP Group | 83 |
| #15 | Talkie AI | MiniMax | 82.8 |
| #16 | Cognigy | Cognigy GmbH (acquired by NICE, July 2025) | 82.1 |
| #17 | CrewAI Enterprise | CrewAI Inc. | 82.1 |
| #18 | Memp | Zhejiang University (Fang et al.) | 82.1 |
| #19 | memU | NevaMind-AI | 82 |
| #20 | Bee Computer | Bee (acquired by Amazon 2026) | 81.9 |
| #21 | Supermemory | Supermemory | 81.6 |
| #22 | MoT | Fudan University (Li & Qiu) | 81.5 |
| #23 | MIRIX | MIRIX AI (Wang, Chen) | 81.3 |
| #24 | AutoWebGLM | THUDM | 81.1 |
| #25 | ExpeL | Tsinghua University (Zhao et al.) | 81 |
| #26 | BabyAGI | Yohei Nakajima | 80.9 |
| #27 | BrowserGym | ServiceNow Research | 80.7 |
| #28 | Suki AI | Suki (formerly Robin AI) | 80.7 |
| #29 | AgentVerse | OpenBMB (Tsinghua) | 80.6 |
| #30 | Lindy AI | Lindy AI | 80.5 |
| #31 | Lagent | InternLM (Shanghai AI Lab) | 80.4 |
| #32 | Nabla Copilot | Nabla | 80.4 |
| #33 | Langflow | Langflow-ai (DataStax) | 80.1 |
| #34 | Mobile-Agent | Alibaba Tongyi Lab (X-PLUG) | 80.1 |
| #35 | Cradle | BAAI-Agents | 80 |
| #36 | Hebbia | Hebbia, Inc. | 79.9 |
| #37 | JARVIS-1 | CraftJarvis | 79.8 |
| #38 | Voiceflow | Voiceflow Inc. | 79.8 |
| #39 | Kore AI | Kore.ai Inc. | 79.7 |
| #40 | Generative Agents | Stanford University / Google Research | 79.6 |
| #41 | Onyx | onyx-dot-app | 79.6 |
| #42 | WebVoyager | MinorJerry et al. | 79.5 |
| #43 | AGiXT | Josh-XT | 79.4 |
| #44 | Bisheng | dataelement | 79.4 |
| #45 | CAMEL | CAMEL-AI.org | 79.4 |
| #46 | Nuance DAX | Nuance Communications (Microsoft) | 79.3 |
| #47 | Synapse | Nanyang Technological University (Zheng et al.) | 79.3 |
| #48 | ChatDev 2.0 | OpenBMB | 79.2 |
| #49 | Reflexion | Northeastern / MIT / Princeton (Shinn et al.) | 79.2 |
| #50 | Tab AI | Tab (Avi Schiffmann) | 79.2 |
| #51 | Self-RAG | University of Washington / Allen AI (Asai et al.) | 79.1 |
| #52 | AutoGPT Platform | Significant Gravitas | 79 |
| #53 | Stack AI | Stack AI Inc. (YC W23) | 79 |
| #54 | Galileo AI | Galileo Technologies Inc. | 78.9 |
| #55 | VectorShift | VectorShift Inc. (YC S23) | 78.8 |
| #56 | Agent Workflow Memory | CMU (Wang, Mao, Fried, Neubig) | 78.7 |
| #57 | HiMem | Zhu et al. (JD.com, 2026) | 78.4 |
| #58 | Limitless Pendant | Limitless AI (acquired by Meta Dec 2025) | 78.4 |
| #59 | Athina AI | Athina AI (YC W23) | 78.3 |
| #60 | HoneyHive | HoneyHive Inc. | 78.3 |
| #61 | SCM | Beihang / NLPR (Wang et al.) | 78.2 |
| #62 | GPTeam | 101dotxyz | 77.9 |
| #63 | AutoGen Core Memory | Microsoft | 77.8 |
| #64 | MemOS | MemTensor (Li, Zhang, et al.) | 77.8 |
| #65 | DB-GPT | eosphoros-ai | 77.7 |
| #66 | MetaGPT | DeepWisdom / geekan | 77.7 |
| #67 | Qwen-Agent | QwenLM (Alibaba) | 77.7 |
| #68 | Character AI | Character.AI (Google investment) | 77.5 |
| #69 | FastGPT | labring | 77.4 |
| #70 | Vellum AI | Vellum AI Inc. (YC W23) | 77.4 |
| #71 | D-Mem | You et al. (2025) | 77.3 |
| #72 | SID AI | SID (YC) | 77.1 |
| #73 | Friend AI | Friend | 77 |
| #74 | Dify | LangGenius | 76.9 |
| #75 | LangGraph | LangChain | 76.9 |
| #76 | CrewAI | CrewAI Inc. (Joao Moura) | 76.8 |
| #77 | Think-in-Memory | Ant Group / Alibaba (Liu et al.) | 76.8 |
| #78 | RecallM | Cisco Research / independent (Kynoch & Latapie) | 76.7 |
| #79 | Nomi AI | Glimpse AI, Inc. | 76.6 |
| #80 | GAM | VectorSpaceLab (BAAI-related) | 76.5 |
| #81 | LangSmith LangGraph Cloud | LangChain Inc. | 76.4 |
| #82 | Open Interpreter | OpenInterpreter | 76.3 |
| #83 | MultiOn | MultiOn (now AGI Inc.) | 76.1 |
| #84 | Personal AI | Personal AI | 75.9 |
| #85 | Pi Inflection | Inflection AI | 75.6 |
| #86 | Memary | Kingjulio8238 | 75.5 |
| #87 | AutoGen Studio | Microsoft Research | 75.5 |
| #88 | Botpress | Botpress Inc. | 75.4 |
| #89 | Flowise | FlowiseAI | 75.3 |
| #90 | MemoryLLM | UCSD / Apple (Wang et al.) | 75.2 |
| #91 | Abridge | Abridge | 75.1 |
| #92 | HybridAGI | SynaLinks | 75.1 |
| #93 | Ontotext GraphDB | Ontotext / Graphwise (merged with Semantic Web Company, 2025) | 75.1 |
| #94 | Dust tt | Dust (formerly XP1) | 75 |
| #95 | Titans | lucidrains (community) / paper by Google Research | 75 |
| #96 | Maxim AI | Maxim AI Inc. | 74.9 |
| #97 | Second Me | Mindverse (Shang, Li, et al.) | 74.9 |
| #98 | Paradot | WithFeeling.AI | 74.7 |
| #99 | KnowAgent | zjunlp (Zhejiang University) | 74.6 |
| #100 | Nemori | Nemori AI (independent) | 74.6 |
| #101 | MoT | Fudan (Li, Qiu) | 74.4 |
| #102 | Plaud Note | PLAUD | 74.4 |
| #103 | AriGraph | AIRI Institute / Moscow | 74.2 |
| #104 | Granola AI | Granola | 74.1 |
| #105 | Synapse | NTU / Salesforce (Zheng et al.) | 74 |
| #106 | Charlie Mnemonic | GoodAI | 73.8 |
| #107 | LightRAG | HKUDS (HKU Data Intelligence Lab) | 73.8 |
| #108 | Nano GraphRAG | gusye1234 | 73.8 |
| #109 | PathRAG | BUPT-GAMMA | 73.8 |
| #110 | Glean | Glean Technologies | 73.7 |
| #111 | Memoripy | caspianmoon | 73.6 |
| #112 | HippoRAG | OSU NLP Group (Ohio State University) | 73.5 |
| #113 | AppAgent | Tencent / mnotgod96 | 73.4 |
| #114 | Stardog | Stardog Union Inc. | 73.4 |
| #115 | Neo4j LLM Graph Builder | Neo4j Labs | 73.3 |
| #116 | Graphiti | Zep AI | 73.2 |
| #117 | MCP Memory Server | Anthropic / Model Context Protocol | 73.1 |
| #118 | MiniRAG | HKUDS | 73.1 |
| #119 | ChatDB | Tsinghua University (Hu et al.) | 72.4 |
| #120 | R2R | SciPhi-AI | 72.4 |
| #121 | Neo4j AuraDB | Neo4j Inc. | 72.3 |
| #122 | GraphRAG | Microsoft | 71.8 |
| #123 | KAG | OpenSPG / Ant Group | 71.5 |
| #124 | Diffbot | Diffbot Inc. | 71.3 |
| #125 | RAGFlow | InfiniFlow | 71.3 |
| #126 | Zep | Zep AI | 71.2 |
| #127 | HuggingGPT / JARVIS | Microsoft Research | 71.2 |
| #128 | Generative Agents | Stanford / Google | 71.1 |
| #129 | MemoChat | University of Warwick / Alibaba | 71.1 |
| #130 | GraphRAG-SDK | FalkorDB | 70.9 |
| #131 | AllegroGraph | Franz Inc. | 70.7 |
| #132 | Memorizing Transformer | Google Research (Wu, Rabe, Hutchins, Szegedy) | 70.6 |
| #133 | Kindroid | Kindroid | 70.4 |
| #134 | RMM | Google / UCSB (2025) | 70.4 |
| #135 | Larimar | IBM Research | 68.7 |
| #136 | kNN-LM | Stanford / Facebook AI Research (Khandelwal et al.) | 68.1 |
| #137 | HEMA | independent (Ahn et al.) | 66.8 |
| #138 | LongMem | UCSB / Microsoft Research | 65.9 |
| #139 | Nomic Atlas | Nomic AI Inc. | 64.6 |
| #140 | EM-LLM | em-llm (academic consortium) | 64.2 |
| #141 | Claude Projects | Anthropic | 64 |
| #142 | MemoryScope | Alibaba ModelScope | 63.7 |
| #143 | Haystack Memory | deepset | 63.2 |
| #144 | Weaviate | Weaviate | 63.1 |
| #145 | Gemini Memory | Google | 62.3 |
| #146 | ChatGPT Memory | OpenAI | 61.5 |
| #147 | Mnemosyne | Johns Hopkins / independent (2025) | 61.4 |
| #148 | Memoro | MIT Media Lab | 60.6 |
| #149 | Quivr | QuivrHQ | 60.6 |
| #150 | Letta | Letta (formerly MemGPT) | 60.4 |
| #151 | MemoryBank | Institute of Software, Chinese Academy of Sciences | 60.1 |
| #152 | Mnemosyne | independent | 59.9 |
| #153 | Copilot Memory | Microsoft | 59.7 |
| #154 | Sana AI | Sana Labs | 59.7 |
| #155 | Heyday AI | Heyday (shut down 2025) | 59.2 |
| #156 | AnythingLLM | Mintplex Labs | 59.1 |
| #157 | REALM | Google Research (Guu et al.) | 59.1 |
| #158 | Saner AI | Saner.AI | 59.1 |
| #159 | Redis Vector | Redis Ltd. | 58.8 |
| #160 | LangMem | LangChain | 58.3 |
| #161 | Epsilla | Epsilla Inc. (YC S23) | 58.3 |
| #162 | Couchbase Vector | Couchbase Inc. | 58.2 |
| #163 | Ragie | Ragie Inc. | 57.7 |
| #164 | MemoryBank | Harbin Institute of Technology / SenseTime | 57.5 |
| #165 | MongoDB Atlas Vector | MongoDB Inc. | 57 |
| #166 | LlamaIndex Memory | LlamaIndex | 56.8 |
| #167 | KDB AI | KX Systems | 56.8 |
| #168 | Atlas | Meta AI FAIR (Izacard et al.) | 56.6 |
| #169 | Notion AI | Notion Labs | 55.6 |
| #170 | Mem AI | Mem Labs | 55.5 |
| #171 | Memori | GibsonAI | 54.8 |
| #172 | MemGPT Classic | Berkeley / Letta | 52.4 |
| #173 | Memory³ | Institute for Advanced Algorithms Research Shanghai / Peking University | 51.3 |