Marker

by Datalab (datalab-to)

System Card

Organization: Datalab (datalab-to)
Released: 2023-10
Architecture: knowledge-base / document-to-markdown/JSON pipeline
Details: Converts PDF, image, DOCX, XLSX, PPTX, HTML, and EPUB files to markdown, JSON, chunks, or HTML, with table, equation, and image extraction. An optional hybrid LLM mode can boost accuracy. Runs on GPU, CPU, or MPS.
Parameters: (none listed)
Domain: rag-retrieval
Open Source: Yes
Tags: pdf, markdown, structured-extraction, datalab

Capability Profile

Benchmark Scores

5 of 14 benchmarks
Long-Context Retrieval (2/5)
  RULER: 74.1 (75p)
  NIAH: no data
  LooGLE: no data
  ∞Bench: no data

Multi-Turn Recall (0/2)
  LoCoMo: no data
  MemoryBank: no data

Cross-Session Memory (0/1)
  LongMemEval: no data

Multi-Hop QA (2/3)
  BABILong: no data
  HotpotQA: 76.1 (77p)

Agent Task Memory (0/1)
  AgentBench-Mem: no data

Personalization (0/1)
  PerLTQA: no data

Factuality / Grounding (1/1)
  RAGAS: 71.8 (70p)