Vectara
by Vectara Inc.
System Card
Organization: Vectara Inc.
Released: 2022-10
Architecture: vector-rag / grounded-generation RAG-as-a-service
Details: Vectara coined the term "Grounded Generation" (now broadly called RAG) and provides a fully managed pipeline: it ingests documents, chunks and embeds them internally, then serves them through a dedicated retrieval API that returns citations. The platform uses its own Boomerang embedding model and a factual-consistency scorer to reduce hallucinations. Mockingbird, a task-specific LLM, was launched in 2024 specifically for RAG synthesis.
Parameters: —
Domain: rag-retrieval
Open Source: No
Website: Visit
Tags: managed-rag, grounded-generation, anti-hallucination, citations, enterprise
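The grounded-generation flow described above (chunk, embed, retrieve with citations, score factual consistency) can be sketched in miniature. This is a toy, self-contained illustration of the pattern, not Vectara's actual SDK or API: the bag-of-words "embedding" stands in for the Boomerang model, and the lexical-overlap score stands in for the platform's factual-consistency scorer; all names here are hypothetical.

```python
import math

def chunk(text, size=40):
    # Split a document into fixed-size character chunks (toy stand-in
    # for the platform's internal chunking step).
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text):
    # Toy bag-of-words vector; a real pipeline would use a learned
    # embedding model such as Boomerang.
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, top_k=2):
    # Rank every chunk against the query and return the top_k hits,
    # each carrying citation metadata (document id, chunk index).
    q = embed(query)
    scored = []
    for doc_id, text in corpus.items():
        for chunk_id, c in enumerate(chunk(text)):
            scored.append((cosine(q, embed(c)), doc_id, chunk_id, c))
    scored.sort(reverse=True)
    return [{"doc": d, "chunk": i, "text": t, "score": s}
            for s, d, i, t in scored[:top_k]]

def consistency(answer, evidence):
    # Crude lexical-overlap score as a stand-in for a factual-consistency
    # scorer: the fraction of answer tokens grounded in retrieved text.
    a = set(answer.lower().split())
    e = set(" ".join(evidence).lower().split())
    return len(a & e) / len(a) if a else 0.0

corpus = {"handbook": "Vectara provides a managed retrieval API. "
                      "Results are returned with citations."}
hits = retrieve("What does the retrieval API return?", corpus)
```

A generated answer would then be checked against the retrieved evidence, e.g. `consistency(answer, [h["text"] for h in hits])`, and rejected or flagged below some threshold.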
Capability Profile
Benchmark Scores (5 of 14 benchmarks)
- Multi-Turn Recall: 0/2 (LoCoMo: no data; MemoryBank: no data)
- Cross-Session Memory: 0/1 (LongMemEval: no data)
- Multi-Hop QA: 2/3
- Agent Task Memory: 0/1 (AgentBench-Mem: no data)
- Personalization: 0/1 (PerLTQA: no data)
- Factuality / Grounding: 1/1
Sources:
- Vectara vendor documentation; evaluated on HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (Stanford / CMU, 1809)
- Vectara vendor documentation; evaluated on LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding (Tsinghua KEG, 2308)
- Vectara vendor documentation; evaluated on MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries (HKUST, 2401)
- Vectara vendor documentation; evaluated on RAGAS: Automated Evaluation of Retrieval-Augmented Generation (Exploding Gradients, 2309)
- Vectara vendor documentation; evaluated on RULER: What's the Real Context Size of Your Long-Context Language Models? (NVIDIA, 2404)