ColPali
by illuin-tech
System Card
Organization: illuin-tech
Released: 2024-06
Architecture: vector-rag / Multi-vector VLM page embeddings
Details: Uses PaliGemma/Qwen-VL patch outputs + linear projection + ColBERT-style late interaction to embed whole document images. Removes OCR/layout pipelines. Includes ColPali, ColQwen, ColSmol variants.
Parameters: not listed
Domain: rag-retrieval
Open Source: Yes
Tags: vidore, multi-vector, colbert, vlm
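The pipeline named in the Details field (patch outputs, a linear projection, then ColBERT-style late interaction) can be sketched roughly as below. This is a minimal illustration in plain Python, not the project's actual code or API: the function names, the toy projection matrix `W`, and all vector values are hypothetical, and the real models work with learned weights and far larger dimensions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def project(v: Vector, weight: Sequence[Vector]) -> List[float]:
    """Linear projection: multiply v by a (d_out x d_in) matrix given as rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weight]


def embed_page(patch_outputs: Sequence[Vector], weight: Sequence[Vector]) -> List[List[float]]:
    """Turn each patch output into one low-dimensional unit vector (multi-vector page embedding)."""
    return [l2_normalize(project(p, weight)) for p in patch_outputs]


def late_interaction_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """MaxSim scoring: for each query vector take its best-matching page vector, then sum."""
    return sum(
        max(sum(q * p for q, p in zip(qv, pv)) for pv in page_vecs)
        for qv in query_vecs
    )


# Hypothetical toy data: 3-d "patch outputs" projected down to 2-d.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
pages = {
    "page_a": embed_page([[1.0, 0.0, 5.0], [0.0, 1.0, 5.0]], W),
    "page_b": embed_page([[1.0, 1.0, 0.0], [1.0, 1.0, 2.0]], W),
}
# Query token embeddings, assumed to be already projected and normalized.
query = [[1.0, 0.0], [0.0, 1.0]]

scores = {name: late_interaction_score(query, vecs) for name, vecs in pages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup each query vector finds an exact match among the patches of `page_a`, so `page_a` ranks above `page_b`. Because every page keeps one vector per patch rather than a single pooled vector, a query term can match a specific region of the page image, which is what allows the approach to skip OCR and layout parsing.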
Capability Profile
Benchmark Scores
5 of 14 benchmarks
Multi-Turn Recall: 0/2
  LoCoMo: no data
  MemoryBank: no data
Cross-Session Memory: 0/1
  LongMemEval: no data
Multi-Hop QA: 2/3
Agent Task Memory: 0/1
  AgentBench-Mem: no data
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 1/1
Sources:
ColPali paper (arXiv:2407.01449); evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
ColPali paper (arXiv:2407.01449); evaluated on LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding (Tsinghua KEG, 2308)
ColPali paper (arXiv:2407.01449); evaluated on MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries (HKUST, 2401)
ColPali paper (arXiv:2407.01449); evaluated on RAGAS: Automated Evaluation of Retrieval-Augmented Generation (Exploding Gradients, 2309)
ColPali paper (arXiv:2407.01449); evaluated on RULER: What's the Real Context Size of Your Long-Context Language Models (NVIDIA, 2404)