
Voyage AI: specialised embedding API for RAG setups

Voyage AI is an embedding provider acquired by MongoDB in February 2025. voyage-3 costs USD 0.06 per 1M tokens and, as of May 2026, ranks among the strongest embedding models for RAG.

Researched & fact-checked by: DuneDive LLC · As of: 2026-05

What is Voyage AI?

Voyage AI is a US company founded in 2023 from the Stanford NLP research circle, specialising exclusively in embeddings and rerankers for retrieval tasks. In February 2025 Voyage AI was acquired by MongoDB as part of its strategy to make vector search a first-class feature in Atlas. Despite the acquisition, the API remains usable independently at voyageai.com and is also available via AWS Bedrock in the Frankfurt region (eu-central-1).

The model family includes (as of May 2026): voyage-3 as generalist flagship (1024 dimensions, USD 0.06 per 1M tokens), voyage-3-lite as low-cost variant (512 dimensions, USD 0.02 per 1M tokens), voyage-multimodal-3 with text-and-image understanding, voyage-code-3 for code repositories, voyage-law-2 for legal text (with explicit training on US case law, contracts, and statutes), and voyage-finance-2 for finance documents. This domain specialisation is a unique selling point in May 2026 – no other embedding provider offers a comparable range of domain-specific models (code, law, finance) at this quality.

Voyage AI focuses on something many generalist providers do not get right: retrieval quality on real RAG workloads. The training regime uses a lot of MS-MARCO-style web data plus proprietary curation. On BEIR and on the 2025/2026 MTEB retrieval splits voyage-3 almost always sits in the top 3, often level with Cohere embed-v3 and ahead of OpenAI text-embedding-3-large. For purely English RAG setups Voyage is, on a sober assessment, often the best choice in May 2026.

Why it matters for Switzerland

Three arguments matter for Swiss mandates. First, price/performance. voyage-3 costs USD 0.06 per 1M tokens – less than half the list price of OpenAI text-embedding-3-large (USD 0.13 per 1M tokens) at markedly better retrieval quality. A fiduciary with 50,000 documents averaging 1,000 tokens each (50M tokens in total) pays a one-off USD 3 for the entire corpus. Even with continuous re-ingestion, embedding cost stays below CHF 30 per year.
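The arithmetic behind these figures is easy to check. The sketch below reproduces the one-off cost from the numbers in the paragraph above; the function name and the 10 percent monthly re-ingestion rate are illustrative assumptions, not figures from Voyage AI.

```python
def embedding_cost_usd(num_docs: int, avg_tokens_per_doc: int,
                       usd_per_million_tokens: float) -> float:
    """Cost in USD of embedding every document in the corpus once."""
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Figures from the text: 50,000 documents, 1,000 tokens each, voyage-3 price.
one_off = embedding_cost_usd(50_000, 1_000, 0.06)

# Assumption for illustration only: 10% of the corpus is re-embedded each month.
monthly_churn = 0.10
yearly_reingestion = one_off * monthly_churn * 12

print(f"One-off ingestion:  USD {one_off:.2f}")
print(f"Yearly re-ingestion: USD {yearly_reingestion:.2f}")
```

Under these assumptions the recurring cost is a small fraction of the one-off ingestion, which is why embedding price rarely decides the budget of a RAG project of this size.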

Second, the domain models. A law firm with English-speaking clients can use voyage-law-2, a legally trained embedding model optimised for case law and contract clauses. On legal retrieval benchmarks (CaseHOLD, ContractNLI) it delivers 10-15 percent more Recall@5 than a generalist model. The same applies to voyage-finance-2 for finance data – relevant for wealth managers and family offices. The domain models are a rare specialty that OpenAI and Cohere do not yet offer.

Third – and this is the key point for Swiss compliance – EU hosting via AWS Bedrock. Since mid-2025 Voyage AI has been available as a Bedrock foundation model in eu-central-1 (Frankfurt). The standard AWS DPA applies and data does not leave the EU. Using the standard Voyage endpoint means a US data flow (Voyage runs US servers), whereas going through Bedrock means EU hosting. This distinction is critical for the wording of the DPA and should be documented explicitly.

For mandates under a strict nFADP interpretation or under professional secrecy (SCC Art. 321), using Voyage AI directly remains problematic because the provider is a US company (even after the MongoDB acquisition). The Bedrock path supports EU AI Act conformity, but the professional-secrecy-compliant solution remains a self-hosted model such as BGE-M3 or mxbai.

How it works

The Voyage AI API follows the familiar OpenAI-like convention: a POST endpoint that accepts a list of strings and returns a list of vectors plus token usage. Auth via API key in the Authorization header.
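At the HTTP level, such a request can be sketched with the standard library alone. The endpoint path and the JSON field names (`input`, `model`, `input_type`) reflect our reading of the public Voyage REST documentation and should be verified against the current reference; `build_embed_request` is a hypothetical helper name. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed endpoint path; check the current Voyage AI API reference.
VOYAGE_EMBEDDINGS_URL = "https://api.voyageai.com/v1/embeddings"


def build_embed_request(api_key, texts, model="voyage-3", input_type="document"):
    """Build (but do not send) a POST request for a batch of texts."""
    payload = {"input": texts, "model": model, "input_type": input_type}
    return urllib.request.Request(
        VOYAGE_EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # API key as bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embed_request(
    "voyage-xxx", ["Client claims damages for breach of contract."]
)
# With a real key, sending would look like this:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```

In practice the official SDK hides these details.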

A typical integration:

```python
import voyageai

client = voyageai.Client(api_key="voyage-xxx")

documents = [
    "Client claims damages for breach of contract.",
    "Le client réclame des dommages-intérêts pour rupture de contrat.",
]

resp = client.embed(
    texts=documents,
    model="voyage-3",
    input_type="document",
)

vectors = resp.embeddings  # 2 x 1024 list of floats
```

The input_type argument matters: documents embed with input_type="document", search queries with input_type="query". Unlike multilingual-e5 which uses prefix strings, Voyage uses an API parameter. Forgetting it costs recall just as with E5.
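One way to rule out this mistake is a thin wrapper that never exposes input_type to callers. The sketch below assumes only that the client offers `embed(texts=..., model=..., input_type=...)` and returns an object with an `embeddings` attribute, as in the example above; `VoyageEmbedder` and `StubClient` are hypothetical names, and the stub exists only so the sketch runs without an API key.

```python
from types import SimpleNamespace


class VoyageEmbedder:
    """Wrapper that always sets input_type for the caller."""

    def __init__(self, client, model="voyage-3"):
        self._client = client
        self._model = model

    def embed_documents(self, texts):
        """Embed corpus texts; always sent with input_type='document'."""
        resp = self._client.embed(
            texts=list(texts), model=self._model, input_type="document"
        )
        return resp.embeddings

    def embed_query(self, text):
        """Embed one search query; always sent with input_type='query'."""
        resp = self._client.embed(
            texts=[text], model=self._model, input_type="query"
        )
        return resp.embeddings[0]


class StubClient:
    """Offline stand-in for voyageai.Client that records what it was asked."""

    def __init__(self):
        self.calls = []

    def embed(self, texts, model, input_type):
        self.calls.append((model, input_type, len(texts)))
        return SimpleNamespace(embeddings=[[0.0, 1.0] for _ in texts])


stub = StubClient()
embedder = VoyageEmbedder(stub)
doc_vectors = embedder.embed_documents(["first document", "second document"])
query_vector = embedder.embed_query("who claims damages?")
```

In production the stub would be replaced by a real `voyageai.Client`; the rest of the codebase then only ever calls `embed_documents` and `embed_query`.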

Via AWS Bedrock the integration differs. Instead of voyageai.Client you use boto3 (AWS SDK) against the Bedrock endpoint in eu-central-1. The call schema is Bedrock-typical:

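```python
import json

import boto3

# Bedrock runtime client pinned to the Frankfurt region
bedrock = boto3.client("bedrock-runtime", region_name="eu-central-1")

documents = ["Client claims damages for breach of contract."]

body = json.dumps({
    "texts": documents,
    "input_type": "document",
})
resp = bedrock.invoke_model(
    modelId="voyage.voyage-3-v1",
    contentType="application/json",
    accept="application/json",
    body=body,
)
# The response body is a stream: read and decode it before indexing
vectors = json.loads(resp["body"].read())["embeddings"]
```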

Embeddings then physically reside in the Bedrock Frankfurt region. The Bedrock model catalogue can shift; as of May 2026 Voyage is a Bedrock foundation model in eu-central-1 and us-east-1 – check current per-region availability in the AWS console.

Model selection follows a simple heuristic: voyage-3 for all standard RAG cases, voyage-3-lite for very large corpora under storage pressure (half the dimensions save storage), voyage-multimodal-3 for setups with invoice scans or diagrams, voyage-law-2 or voyage-finance-2 for domain-specific cases. Mixed operation is fine – different Qdrant collections, each with the right model.
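This heuristic can be written down as a small function. The sketch below is one possible encoding: the order of the checks (images first, then domain, then storage) is our assumption, and the storage estimate assumes uncompressed float32 vectors (4 bytes per value) without index overhead. Function and parameter names are hypothetical.

```python
def pick_voyage_model(domain="general", has_images=False, storage_constrained=False):
    """Map the selection heuristic to a model name (check order is an assumption)."""
    if has_images:
        return "voyage-multimodal-3"
    if domain == "law":
        return "voyage-law-2"
    if domain == "finance":
        return "voyage-finance-2"
    if storage_constrained:
        return "voyage-3-lite"
    return "voyage-3"


def raw_vector_storage_mb(num_vectors, dimensions, bytes_per_value=4):
    """Raw vector size in MB, assuming float32 values and no index overhead."""
    return num_vectors * dimensions * bytes_per_value / 1_000_000


default_model = pick_voyage_model()
law_model = pick_voyage_model(domain="law")
lite_model = pick_voyage_model(storage_constrained=True)

# One million chunks: 1024 dimensions (voyage-3) vs 512 (voyage-3-lite)
full_mb = raw_vector_storage_mb(1_000_000, 1024)
lite_mb = raw_vector_storage_mb(1_000_000, 512)
print(f"voyage-3: {full_mb:.0f} MB, voyage-3-lite: {lite_mb:.0f} MB")
```

Halving the dimensions halves the raw vector storage, which only starts to matter at corpus sizes in the millions of chunks.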

Voyage AI to production in 5 steps

01 Pick the model: voyage-3 as default; voyage-3-lite under storage pressure; voyage-law-2 or voyage-finance-2 for a clear domain.
02 Decide the hosting path: voyageai.com direct (US hosting) or AWS Bedrock eu-central-1 (EU hosting) – usually Bedrock for CH mandates.
03 Set up the API key or IAM role, place the call wrapper in your codebase, and always set input_type: "query" for search queries, "document" for corpus texts.
04 Create a Qdrant collection with dimension=1024 (voyage-3) or 512 (voyage-3-lite), distance=cosine, payload index on client_id and doc_type.
05 Run an eval suite against a baseline: 30-50 real Q/A pairs in the language and domain of the corpus, compare Recall@5 and nDCG@10, quantify the delta.
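The two metrics in the final step need no library. The sketch below assumes binary relevance (a document is either relevant to a question or not) and ranked result lists without duplicates; the document IDs and the two example runs are made up for illustration.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Share of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """nDCG with binary relevance: rewards relevant hits near the top."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def mean_metric(metric, runs, k):
    """Average a metric over (ranked_ids, relevant_ids) pairs, one per question."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Made-up results for two questions: baseline model vs candidate model
baseline_runs = [(["d3", "d1", "d9"], ["d1"]), (["d2", "d7", "d4"], ["d4", "d8"])]
candidate_runs = [(["d1", "d3", "d9"], ["d1"]), (["d4", "d8", "d2"], ["d4", "d8"])]

recall_delta = (mean_metric(recall_at_k, candidate_runs, 5)
                - mean_metric(recall_at_k, baseline_runs, 5))
ndcg_delta = (mean_metric(ndcg_at_k, candidate_runs, 10)
              - mean_metric(ndcg_at_k, baseline_runs, 10))
print(f"Recall@5 delta: {recall_delta:+.2f}, nDCG@10 delta: {ndcg_delta:+.2f}")
```

With only 30-50 questions, small deltas are within noise; a difference of a point or two should be confirmed on a larger set before switching models.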

When to use Voyage AI

Voyage AI is the right choice when (a) retrieval quality is the priority and you accept an API provider, (b) a domain model (law or finance) delivers a recall advantage, (c) EU hosting via AWS Bedrock is acceptable, or (d) you want to combine it with Voyage rerank-2 (very cheap and of high quality).

Concrete cases: an international audit firm with English-speaking clients building a RAG assistant over audit reports and liability memos. A law firm with US-focused practice using voyage-law-2 for case-law search. A wealth manager with a family-office setup using voyage-finance-2 for internal investment research. An SME with a large document corpus and cost focus using the Voyage lite variant for standard RAG.

For Swiss SMEs with a German-French mix, Voyage AI in May 2026 is good but not clearly better than Cohere embed-multilingual-v3 or BGE-M3. On MTEB-DE voyage-3 sits roughly level with BGE-M3. Voyage really shines in English-dominated setups and with the domain models.

When not to use

If you work under SCC Art. 321 professional secrecy and embeddings must strictly flow EU-only or CH-only, Voyage direct is not suitable – the provider sits in the US. Via AWS Bedrock Frankfurt EU hosting is possible but Swiss professional secrecy interpretation often prefers self-hosting. In that case BGE-M3 is the better choice.

If your corpus is mostly German and you do not need domain specialisation, Cohere embed-multilingual-v3 or BGE-M3 is at least equivalent and avoids the dependency on a US provider. Voyage pays off in DE setups mainly when Voyage rerank is used too (very cheap in combination).

If you have multilingual content with a large share of Italian or Romansh, Voyage AI is not ideal – multilingual coverage is slightly narrower than that of BGE-M3 or Cohere. Italian is good; Romansh and Swiss German are essentially absent.

If you want a zero-config cloud setup with the Vertex or OpenAI standard SDK, Voyage via Bedrock is an extra step – Bedrock setup, IAM policies, separate auth. Anyone who wants a pilot RAG as fast as possible and does not need the last percent of recall is productive with OpenAI text-embedding-3-small in 30 minutes.

Trade-offs

STRENGTHS

Very strong retrieval quality, top 3 on BEIR and MTEB-EN
Cheap: USD 0.06 per 1M tokens – less than half the price of OpenAI large
Domain models for law and finance – unique selling point
EU hosting via AWS Bedrock Frankfurt available

WEAKNESSES

US provider – professional-secrecy mandates need the Bedrock detour
On German voyage-3 trails Cohere embed-multilingual-v3 slightly
input_type parameter is a frequent rookie mistake
Domain models are English-centric, German is secondary

FAQ

What does the MongoDB acquisition mean for API stability?

As of May 2026 the Voyage API at voyageai.com remains unchanged. MongoDB has committed to keeping Voyage as a standalone offering; integration into Atlas is an additional option, not a replacement. For larger mandates a contract addendum on API availability for at least three years is still worth requesting.

How big is the quality gap to Cohere embed-v3 on German?

In our May 2026 measurements on MTEB-DE, Cohere embed-multilingual-v3 leads by 1-2 nDCG@10 points. On English the order reverses – voyage-3 leads Cohere by 1-3 points. So for pure DE setups prefer Cohere; for EN-heavy or mixed setups prefer Voyage.

Are the domain models voyage-law-2 and voyage-finance-2 multilingual?

Limited. Both are primarily trained on US-English domain data; German is understood but not optimal. For Swiss law (Federal Court rulings, OR, ZGB) generalist voyage-3 is often better than voyage-law-2 on German. When English legal text dominates, voyage-law-2 beats everything else.

How high are rate limits on the standard API?

May 2026: 300 requests/minute and 1M tokens/minute on the default tier, raisable via sales. Via AWS Bedrock the Bedrock model quotas apply, typically more generous in eu-central-1. For initial ingestion of corpora above 1M documents a tier raise is sensible.
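These limits translate directly into a minimum ingestion time. The sketch below computes a lower bound from the two default-tier limits quoted above; the batch size of 100 documents per request and the 1,000 tokens per document are illustrative assumptions, and real runs will be slower because of retries and network latency.

```python
import math


def min_ingestion_minutes(num_docs, avg_tokens_per_doc, docs_per_request,
                          requests_per_minute=300, tokens_per_minute=1_000_000):
    """Lower bound on ingestion time; the tighter of the two limits decides."""
    total_requests = math.ceil(num_docs / docs_per_request)
    total_tokens = num_docs * avg_tokens_per_doc
    minutes_by_requests = total_requests / requests_per_minute
    minutes_by_tokens = total_tokens / tokens_per_minute
    return max(minutes_by_requests, minutes_by_tokens)


# Assumed corpus: 1M documents of 1,000 tokens each, 100 documents per request
minutes = min_ingestion_minutes(1_000_000, 1_000, docs_per_request=100)
print(f"At least {minutes / 60:.1f} hours on the default tier")
```

For a corpus of this shape the token limit, not the request limit, is the bottleneck, so larger batches alone do not speed up ingestion.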

Sources

Voyage AI documentation – embeddings models and pricing · 2026-05
MongoDB acquires Voyage AI – press release · 2026-02
AWS Bedrock – Voyage foundation models availability · 2026-05
MTEB Leaderboard – Massive Text Embedding Benchmark · 2026-05
