Vellum AI

by Vellum AI Inc. (YC W23)

System Card

Organization: Vellum AI Inc. (YC W23)
Released: 2023-02
Architecture: agentic-workflow / LLM workflow and evaluation platform
Details: Vellum provides an enterprise platform for building, evaluating, and deploying LLM workflows (including RAG pipelines). Features include prompt versioning, A/B testing, workflow orchestration with a visual editor, regression testing suites, and production monitoring with human-in-the-loop feedback. Raised $25.5M total ($5M seed, $20M Series A). YC W23 company.
Parameters: (not listed)
Domain: rag-retrieval, agent-memory
Open Source: No
Tags: prompt-management, workflow-orchestration, evaluation, A/B-testing, enterprise

Capability Profile

Benchmark Scores

6 of 14 benchmarks
Long-Context Retrieval: 1/5
  RULER: no data
  NIAH: no data
  LooGLE: no data
  ∞Bench: no data

Multi-Turn Recall: 1/2
  LoCoMo: 74.643p
  MemoryBank: no data

Cross-Session Memory: 1/1

Multi-Hop QA: 2/3
  BABILong: no data
  HotpotQA: 75.170p

Agent Task Memory: 1/1

Personalization: 0/1
  PerLTQA: no data

Factuality / Grounding: 0/1
  RAGAS: no data