Marker
by Datalab (datalab-to)
System Card
Organization: Datalab (datalab-to)
Released: 2023-10
Architecture: knowledge-base / Document-to-markdown/JSON pipeline
Details: Converts PDF, image, DOCX, XLSX, PPTX, HTML, and EPUB files to markdown, JSON, chunks, or HTML, with table, equation, and image extraction. Optional hybrid LLM boost for accuracy; GPU, CPU, and MPS support.
Parameters: —
Domain: rag-retrieval
Open Source: Yes
Tags: pdf, markdown, structured-extraction, datalab
Capability Profile
Benchmark Scores
5 of 14 benchmarks
Data Transparency: 5 estimated
Long-Context Retrieval: 2/5
Multi-Turn Recall: 0/2
  LoCoMo: no data
  MemoryBank: no data
Cross-Session Memory: 0/1
  LongMemEval: no data
Multi-Hop QA: 2/3
Agent Task Memory: 0/1
  AgentBench-Mem: no data
Personalization: 0/1
  PerLTQA: no data
Factuality / Grounding: 1/1