RULER

Name: RULER: What's the Real Context Size of Your Long-Context Language Models
Creator: NVIDIA
Keywords: long-context-retrieval, long-context

Benchmark Metadata

Publisher: NVIDIA
Venue: COLM 2024
Evaluation Type: automatic
Dimensions: 13
Test Prompts: 4,000
Scoring: Higher is better
Update Frequency: annual

What It Measures

Single and multi-key needle retrieval
Variable tracking
Common and frequent word extraction
Question answering with long contexts
Effective context length
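
To make the first item concrete, here is a minimal sketch of a single-key needle retrieval probe. It is illustrative only and is not RULER's actual data generator: the function names `build_niah_prompt` and `score_response`, the filler text, the needle wording, and the substring-match scoring are all assumptions chosen for this example.

```python
import random


def build_niah_prompt(num_filler_sentences, key, value, seed=0):
    """Hide one key-value 'needle' in filler text.

    Returns (prompt, expected_answer). Hypothetical helper, not RULER's API.
    """
    rng = random.Random(seed)
    filler = "The grass is green. The sky is blue. The sun is yellow."
    sentences = [filler] * num_filler_sentences
    needle = f"The special magic number for {key} is {value}."
    # Insert the needle at a random depth in the haystack.
    position = rng.randint(0, num_filler_sentences)
    sentences.insert(position, needle)
    question = f"What is the special magic number for {key}?"
    prompt = " ".join(sentences) + "\n\n" + question
    return prompt, value


def score_response(response, expected):
    """Return 1.0 if the expected value appears in the response, else 0.0."""
    return 1.0 if expected in response else 0.0


prompt, answer = build_niah_prompt(200, "apple", "7481923")
print(score_response("The number is 7481923.", answer))
```

Raising `num_filler_sentences` lengthens the context, so the same probe can be repeated at several lengths to see where retrieval starts to fail.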

What It Does Not Measure

Multi-session consistency
Personalization
Generation quality

All Systems Evaluated (71 systems)

1 self-reported, 70 estimated

Rank	System	Organization	Score	Provenance	Source
#1	Titans	lucidrains (community) / paper by Google Research	97.9	Self-Reported	arXiv:2501.00663, Table 2: Titans (MAC) average on RULER S-NIAH-PK/N/W at 2K/4K/8K/16K
#2	Pinecone	Pinecone Systems	82.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#3	Weaviate	Weaviate	81.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#4	Recurrent Memory Transformer	MIPT / DeepPavlov (Bulatov, Kuratov, Burtsev)	79.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#5	Milvus	Zilliz	78.2	Estimated	Arena estimate: derived from capability profile, not independently verified
#6	Qdrant	Qdrant	77.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#7	Mamba	CMU / Princeton (Gu, Dao)	77.8	Estimated	Arena estimate: derived from capability profile, not independently verified
#8	Jina AI Embeddings	Jina AI GmbH	76.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#9	RWKV	RWKV Foundation / BlinkDL community	75.8	Estimated	Arena estimate: derived from capability profile, not independently verified
#10	txtai	NeuML	75.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#11	Landmark Attention	EPFL (Mohtashami, Jaggi)	75.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#12	Compressive Transformer	DeepMind (Rae et al.)	75.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#13	Cohere Embed	Cohere Inc.	75	Estimated	Arena estimate: derived from capability profile, not independently verified
#14	MiniRAG	HKUDS	74.8	Estimated	Arena estimate: derived from capability profile, not independently verified
#15	LM-Infinite	Illinois / Meta (Han et al.)	74.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#16	H2O	UT Austin / Rice / CMU / Stanford / Meta (Zhang et al.)	74.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#17	Chroma	Chroma	74.2	Estimated	Arena estimate: derived from capability profile, not independently verified
#18	Marker	Datalab (datalab-to)	74.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#19	AllegroGraph	Franz Inc.	74	Estimated	Arena estimate: derived from capability profile, not independently verified
#20	PathRAG	BUPT-GAMMA	73.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#21	TRIME	Princeton NLP (Zhong, Lei, Chen)	73.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#22	LanceDB	LanceDB Inc. (YC S22)	73.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#23	R2R	SciPhi-AI	73.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#24	Scissorhands	Rice / Stanford / Meta (Liu et al.)	73.2	Estimated	Arena estimate: derived from capability profile, not independently verified
#25	∞ Former	Instituto de Telecomunicações / DeepMind / IST (Martins, Marinho, Martins)	72.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#26	KAG	OpenSPG / Ant Group	72.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#27	GraphRAG	Microsoft	72.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#28	ICAE	Microsoft Research (Ge et al.)	72.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#29	Neo4j LLM Graph Builder	Neo4j Labs	72.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#30	Mixedbread AI	Mixedbread AI	71.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#31	Voyage AI	Voyage AI (acquired by MongoDB, Feb 2025)	71.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#32	Ontotext GraphDB	Ontotext / Graphwise (merged with Semantic Web Company, 2025)	71.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#33	Supabase Vector	Supabase Inc.	71.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#34	Nano GraphRAG	gusye1234	71.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#35	Diffbot	Diffbot Inc.	71.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#36	SingleStore Vector	SingleStore Inc.	71	Estimated	Arena estimate: derived from capability profile, not independently verified
#37	Activeloop Deep Lake	Activeloop Inc.	70.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#38	LightRAG	HKUDS (HKU Data Intelligence Lab)	70.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#39	RAPTOR	Stanford (Sarthi, Abdullah et al.)	70.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#40	Atlas	Meta AI FAIR (Izacard et al.)	69.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#41	Stardog	Stardog Union Inc.	69.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#42	Unstructured IO	Unstructured Technologies Inc.	69.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#43	MyScale	MyScale Inc.	69.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#44	Elasticsearch Vector	Elastic N.V.	69.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#45	Vald	Yahoo Japan	69.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#46	pgvector Supabase Neon	pgvector OSS / Supabase Inc. / Neon Inc.	69.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#47	PrivateGPT	Zylon AI	69.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#48	OpenSearch Vector	OpenSearch Project (AWS-led)	69.2	Estimated	Arena estimate: derived from capability profile, not independently verified
#49	PaperQA2	FutureHouse	69.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#50	RETRO	DeepMind (Borgeaud et al.)	68.9	Estimated	Arena estimate: derived from capability profile, not independently verified
#51	Neon Vector	Neon Inc.	68.8	Estimated	Arena estimate: derived from capability profile, not independently verified
#52	TrustRAG	GoMate Community	68.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#53	REALM	Google Research (Guu et al.)	68.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#54	Selfmem	Tsinghua / Microsoft (Cheng et al.)	68.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#55	Carbon AI	Carbon (acquired by Perplexity, Dec 2024)	68.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#56	GraphRAG-SDK	FalkorDB	68.3	Estimated	Arena estimate: derived from capability profile, not independently verified
#57	LlamaCloud	LlamaIndex Inc.	67.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#58	Marqo	Marqo Pty Ltd	67.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#59	Astra DB	DataStax	67.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#60	Activation Beacon	BAAI / Renmin University (Zhang et al.)	66.7	Estimated	Arena estimate: derived from capability profile, not independently verified
#61	Vespa AI	Yahoo / Vespa.ai (independent OSS project)	66.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#62	vectorize	Vectorize Inc.	66.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#63	Manticore Search	Manticore Software Ltd.	66.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#64	Vectara	Vectara Inc.	66.4	Estimated	Arena estimate: derived from capability profile, not independently verified
#65	ColPali	illuin-tech	66.2	Estimated	Arena estimate: derived from capability profile, not independently verified
#66	ParadeDB	ParadeDB Inc. (YC S23)	66.1	Estimated	Arena estimate: derived from capability profile, not independently verified
#67	Cognita	TrueFoundry	66	Estimated	Arena estimate: derived from capability profile, not independently verified
#68	Verba	Weaviate	66	Estimated	Arena estimate: derived from capability profile, not independently verified
#69	TurboPuffer	TurboPuffer Inc.	65.6	Estimated	Arena estimate: derived from capability profile, not independently verified
#70	Mendable	Mendable (YC-backed)	64.5	Estimated	Arena estimate: derived from capability profile, not independently verified
#71	StreamingLLM	MIT Han Lab / Meta AI (Xiao et al.)	57.2	Estimated	Arena estimate: derived from capability profile, not independently verified