Vellum AI

by Vellum AI Inc. (YC W23)

System Card

Organization: Vellum AI Inc. (YC W23)

Released: 2023-02

Architecture: agentic-workflow / LLM workflow and evaluation platform

Details: Vellum, a YC W23 company, provides an enterprise platform for building, evaluating, and deploying LLM workflows, including RAG pipelines. Features include prompt versioning, A/B testing, workflow orchestration with a visual editor, regression test suites, and production monitoring with human-in-the-loop feedback. The company has raised $25.5M in total, including a $5M seed round and a $20M Series A.

Parameters: n/a

Domain: rag-retrieval, agent-memory

Open Source: No

prompt-management, workflow-orchestration, evaluation, A/B-testing, enterprise

Capability Profile

Benchmark Scores

6 of 14 benchmarks

Data Transparency: 6 estimated

Long-Context Retrieval

1/5

RULER

no data

NIAH

no data

LooGLE

no data

LongBench

60 (3p, Estimated)

∞Bench

no data

Multi-Turn Recall

1/2

LoCoMo

74.6 (44p, Estimated)

MemoryBank

no data

Cross-Session Memory

1/1

LongMemEval

77.4 (60p, Estimated)

Multi-Hop QA

2/3

BABILong

no data

MultiHop-RAG

75 (79p, Estimated)

HotpotQA

75.1 (71p, Estimated)

Agent Task Memory

1/1

AgentBench-Mem

72 (26p, Estimated)

Personalization

0/1

PerLTQA

no data

Factuality / Grounding

0/1

RAGAS

no data

Sources: Arena estimate (derived from capability profile, not independently verified)
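
The headline coverage figure can be cross-checked against the per-category counts listed in the benchmark section. The sketch below is illustrative only: the dictionary and the function name `total_coverage` are hypothetical and not part of any Vellum or Arena API. The counts themselves are copied from the category lines of this card.

```python
# Per-category (reported, total) benchmark counts, copied from the card.
# The structure and names here are illustrative assumptions.
coverage = {
    "Long-Context Retrieval": (1, 5),
    "Multi-Turn Recall": (1, 2),
    "Cross-Session Memory": (1, 1),
    "Multi-Hop QA": (2, 3),
    "Agent Task Memory": (1, 1),
    "Personalization": (0, 1),
    "Factuality / Grounding": (0, 1),
}


def total_coverage(table):
    """Sum reported and total benchmark counts across all categories."""
    reported = sum(r for r, _ in table.values())
    total = sum(t for _, t in table.values())
    return reported, total


reported, total = total_coverage(coverage)
print(f"{reported} of {total} benchmarks")  # 6 of 14 benchmarks
```

The category tallies sum to the same "6 of 14" shown at the top of the benchmark section, and the six reported scores match the six values marked as estimated.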