VECTOR DATABASES · COMPARISON
Vector databases compared: 10 options for RAG, search, and recommendation
Qdrant, Weaviate, Milvus, Chroma, pgvector, Pinecone, Redis, Vespa, LanceDB and Elasticsearch in a neutral comparison, with hosting, license, and EU aspects.
Researched & fact-checked by: DuneDive LLC · As of: 2026-05
What a vector database is
A vector database stores high-dimensional embedding vectors – typically 384 to 3072 dimensions per entry – and finds nearest neighbours to a query vector in milliseconds. This approximate nearest neighbour (ANN) search underpins retrieval-augmented generation, semantic search, recommendation systems, and anomaly detection. A relational database can in theory solve the task, but without a specialised index every query has to compare against every stored row, so response times grow with the table and eventually exceed one second.
As of May 2026, around two dozen production-grade vector databases are available. This page compares the ten options that show up most often in SME and fiduciary contexts: Qdrant, Weaviate, Milvus, Chroma, pgvector, Pinecone, Redis (with RediSearch), Vespa, LanceDB, and Elasticsearch with kNN. All ten handle ANN search via HNSW or comparable index structures; the differences sit in license, hosting model, filter power, EU residency, and integration with existing stacks.
The central selection question in Switzerland is rarely "which database is the fastest". It is: which database fits the existing infrastructure, the data-protection requirement, and the team knowledge. A Postgres instance already running answers the question differently than an empty data centre with Kubernetes expertise.
Why the choice matters
The vector database is one of the few stack elements that is hard to swap once selected. Embeddings, payload schema, filter indexes, and the ingestion pipeline get optimised for the chosen platform. A switch is feasible – migrating 1M vectors usually takes half a day – but it breaks the production pipeline and forces a re-test of all RAG answers.
Three consequences hit SME setups hardest. First license: Qdrant is Apache-2.0 and pgvector uses the PostgreSQL License, so private forks and commercial embedded use are unproblematic. Elasticsearch moved to Elastic License v2 / SSPL in 2021 and added AGPLv3 as a further option in 2024. That is fine for self-hosting in a law firm or fiduciary; offering Elasticsearch itself as a hosted service is restricted under ELv2 and SSPL and carries copyleft obligations under AGPLv3. Pinecone is proprietary with no self-host option; leaving it always means a migration.
Second hosting: only self-host options give full Swiss control over data location. Pinecone offers an EU region (eu-west-1) but remains a US company – for client data under the revised FADP, that warrants a transfer-impact assessment. Qdrant, Weaviate, Milvus, pgvector, Vespa, Redis, and LanceDB all run on Hetzner Helsinki or Falkenstein.
Third filters: payload-indexed filters are the difference between "hit in 50 ms" and "hit in 2 seconds". Qdrant and Weaviate evaluate filters inside the HNSW graph. Chroma and LanceDB filter after the top-k pass, which gives poor recall on selective filters (e.g. "only client 42"). Anyone needing multi-tenant separation should check this before committing.
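The recall problem with post-filtering can be shown with a toy example. The sketch below is a brute-force stand-in, not how any of the named engines is implemented; the point layout and the `client` payload field are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical point layout: (id, vector, payload)
Point = Tuple[str, List[float], Dict[str, int]]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def post_filter_search(points: List[Point], query: List[float], k: int, client: int) -> List[str]:
    # Take the global top-k first, then drop hits that belong to other clients.
    top_k = sorted(points, key=lambda p: dot(p[1], query), reverse=True)[:k]
    return [p[0] for p in top_k if p[2]["client"] == client]


def pre_filter_search(points: List[Point], query: List[float], k: int, client: int) -> List[str]:
    # Restrict the candidates to the client first, then take the top-k among them.
    candidates = [p for p in points if p[2]["client"] == client]
    ranked = sorted(candidates, key=lambda p: dot(p[1], query), reverse=True)
    return [p[0] for p in ranked[:k]]


# Client 42 owns only doc8 and doc9, and neither is among the globally closest.
points: List[Point] = [
    (f"doc{i}", [1.0 - 0.05 * i, 0.05 * i], {"client": 42 if i >= 8 else 7})
    for i in range(10)
]
query = [1.0, 0.0]

post_hits = post_filter_search(points, query, k=3, client=42)
pre_hits = pre_filter_search(points, query, k=3, client=42)
print("post-filter:", post_hits)
print("pre-filter: ", pre_hits)
```

With a selective filter, the post-filter variant returns nothing for client 42 because the global top 3 all belong to another client, while the pre-filter variant finds both of that client's documents.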
How the ten options differ
The ten databases sort into four groups. First group: dedicated open-source vector DBs written in Rust, Go, or C++ – Qdrant, Weaviate, Milvus. Built for ANN search, with the best latency profile at medium volume and clean multi-tenant filter support. Qdrant leads in Swiss self-hosted setups, Weaviate scores with GraphQL and native hybrid search, Milvus with GPU acceleration for very large corpora past 100M vectors.
Second group: extensions on existing databases – pgvector (Postgres), RediSearch (Redis), Elasticsearch kNN. These win when the base system already runs. pgvector (v0.8+ as of May 2026) offers HNSW and IVFFlat indexes, ACID transactions from Postgres, and full-text search via tsvector in the same query – often the right choice for SMEs already on Postgres. RediSearch fits when Redis already runs as a cache; Elasticsearch is the natural pick in setups that depend on hybrid keyword+vector search.
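A combined full-text and vector query in pgvector might look like the following sketch. The `documents` table, its `body_tsv` and `embedding` columns, and the `%s` placeholder style (as used by psycopg-type drivers) are assumptions; only the `hnsw` index type, the `vector_cosine_ops` operator class, and the `<=>` cosine-distance operator come from pgvector itself.

```python
from typing import List, Tuple

# Hypothetical schema: documents(id, body, body_tsv tsvector, embedding vector(384))
CREATE_INDEX_SQL = (
    "CREATE INDEX IF NOT EXISTS documents_embedding_hnsw "
    "ON documents USING hnsw (embedding vector_cosine_ops);"
)

# Keyword match via tsvector and nearest-neighbour ordering in one statement.
HYBRID_QUERY_SQL = """
SELECT id, body, embedding <=> %s::vector AS cosine_distance
FROM documents
WHERE body_tsv @@ plainto_tsquery('english', %s)
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(vec: List[float]) -> str:
    # pgvector accepts text input of the form '[0.1,0.2,0.3]'.
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


def hybrid_query_params(query_vec: List[float], keywords: str, k: int) -> Tuple[str, str, str, int]:
    # The vector appears twice in the statement: once in SELECT, once in ORDER BY.
    literal = to_vector_literal(query_vec)
    return (literal, keywords, literal, k)


params = hybrid_query_params([0.1, 0.2, 0.3], "annual report", 5)
print(params)
```

With a Postgres driver, the statement and parameters would be passed together, for example `cursor.execute(HYBRID_QUERY_SQL, params)`.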
Third group: managed cloud – Pinecone as pure SaaS. Pinecone Serverless since 2024 removes cluster management; storage and compute are billed separately. For resource-constrained teams without DevOps capacity, Pinecone is fast to production – at the cost of a third-country transfer.
Fourth group: special profiles – Chroma (prototyping, embedded SQLite storage since v0.4 in place of the earlier DuckDB backend, simplest API), Vespa (Yahoo origin, very performant on combined structured+vector queries, steep learning curve), LanceDB (columnar Lance format, embedded in Python/JS, fits local apps and notebooks).
The technical core is similar across all ten: HNSW or a comparable ANN index, cosine/dot/Euclidean as distance metrics, top-k search with filters. The differences sit around the edges – cluster mode, backup, authentication, observability, driver maturity across languages.
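The three distance metrics are closely related. The following sketch, using hand-picked 2D vectors as an assumption, shows that for unit-length vectors cosine similarity equals the dot product, and squared Euclidean distance equals 2 minus twice the cosine, so all three produce the same ranking.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(a: List[float]) -> float:
    return math.sqrt(dot(a, a))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(a: List[float]) -> List[float]:
    n = norm(a)
    return [x / n for x in a]


a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

cos_ab = cosine_similarity(a, b)
dot_ab = dot(a, b)
l2_ab = euclidean_distance(a, b)

print("cosine:", cos_ab)
print("dot:   ", dot_ab)
print("l2^2:  ", l2_ab ** 2, "vs 2 - 2*cosine:", 2 - 2 * cos_ab)
```

In practice this means the choice of metric matters mainly when embeddings are not normalised.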
Selection in 5 steps
- 01 Estimate data volume: under 100,000 vectors -> pgvector or Chroma suffice; 100,000 to 50M -> dedicated DB; over 100M -> evaluate Milvus or Vespa.
- 02 Clarify the hosting constraint: must the data stay in EU/CH? If so, rule out Pinecone as cloud-only or assess its EU region plus TIA.
- 03 Check filter needs: payload-indexed filters (client, date, confidentiality) are strong in Qdrant and Weaviate, weaker in Chroma and LanceDB.
- 04 Mind the stack integration: Postgres already there -> pgvector with no second DB; Redis already as cache -> RediSearch; Elasticsearch already there -> use its kNN.
- 05 Factor in team knowledge: no Docker/DevOps -> Pinecone Cloud or Qdrant Cloud; SQL comfort -> pgvector; Kubernetes experience -> Milvus or Vespa.
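The five steps can be read as a small decision function. The sketch below is one possible encoding, not a definitive rule set: the `Requirements` fields and the order in which the steps are applied are assumptions, the range between 50M and 100M vectors is treated as "dedicated DB", and Pinecone is simply dropped under an EU/CH constraint rather than assessed with a TIA.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Requirements:
    vector_count: int
    must_stay_in_eu_ch: bool
    needs_selective_filters: bool
    existing_stack: str  # "postgres", "redis", "elasticsearch" or "none"
    team_skill: str      # "none", "sql", "docker" or "kubernetes"


def shortlist(req: Requirements) -> List[str]:
    # Step 01: data volume sets the starting candidates.
    if req.vector_count < 100_000:
        picks = ["pgvector", "Chroma"]
    elif req.vector_count <= 100_000_000:
        picks = ["Qdrant", "Weaviate", "Milvus"]
    else:
        picks = ["Milvus", "Vespa"]

    # Step 04: an extension of the existing stack goes to the front.
    extension = {
        "postgres": "pgvector",
        "redis": "RediSearch",
        "elasticsearch": "Elasticsearch kNN",
    }.get(req.existing_stack)
    if extension and extension not in picks:
        picks.insert(0, extension)

    # Step 05: without DevOps capacity, managed offerings join the list.
    if req.team_skill == "none":
        picks += ["Qdrant Cloud", "Pinecone"]

    # Step 02: hosting constraint (simplified: drop Pinecone outright).
    if req.must_stay_in_eu_ch:
        picks = [p for p in picks if p != "Pinecone"]

    # Step 03: selective filters rule out the post-filtering options.
    if req.needs_selective_filters:
        picks = [p for p in picks if p not in ("Chroma", "LanceDB")]

    return picks


fiduciary = Requirements(500_000, True, True, "postgres", "sql")
prototype = Requirements(50_000, True, True, "none", "none")
print(shortlist(fiduciary))
print(shortlist(prototype))
```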
When each database fits
Anyone already running Postgres and expecting under 5M vectors per tenant should try pgvector first. Migrating to Qdrant later remains feasible if scaling demands it, but the entry cost is far lower with pgvector – no second database, no separate backups, same ACID guarantees.
Anyone starting without Postgres or needing strict multi-tenant separation is well served by Qdrant. A typical setup uses one collection per client, payload-indexed filters on date and confidentiality, and snapshots into a Hetzner Storage Box. Setup takes about one day; the system then runs stably for years.
Weaviate fits when GraphQL is the desired API style and multi-modal data (text + image + audio in one collection) is needed. Milvus pays off only past 100M vectors or when GPU acceleration is needed – for a five-person fiduciary office with 500,000 documents, Milvus is overkill.
Chroma is fine for prototypes and Jupyter notebooks – quick setup, no cluster, productive in 10 minutes. Pinecone fits when the team has no DevOps capacity and accepts a third-country transfer; typical in US contexts or open-research use cases without PII.
Elasticsearch kNN is the right choice when hybrid keyword+vector in one query is needed and Elasticsearch already runs. Redis with RediSearch fits setups where sub-10 ms latency matters and data lives in Redis anyway – e.g. real-time recommendation. Vespa is the right choice for complex ranking pipelines with many signals (embedding + score + time decay); the learning curve is steep, but ranking is more flexible than in the other options.
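What "many signals" means in a ranking pipeline can be illustrated in a few lines. This is plain Python, not Vespa's ranking syntax; the weighted sum, the weights, and the 30-day half-life are assumptions chosen for the example.

```python
from typing import List, Tuple


def combined_score(similarity: float, quality: float, age_days: float,
                   half_life_days: float = 30.0) -> float:
    # Recency halves every half_life_days (exponential time decay).
    recency = 0.5 ** (age_days / half_life_days)
    # Hypothetical weights; a real ranking profile would tune these.
    return 0.6 * similarity + 0.3 * quality + 0.1 * recency


# (id, embedding similarity, quality score, age in days)
hits: List[Tuple[str, float, float, float]] = [
    ("old_but_close", 0.90, 0.5, 365.0),
    ("fresh", 0.85, 0.5, 0.0),
]

ranked = sorted(hits, key=lambda h: combined_score(h[1], h[2], h[3]), reverse=True)
print([h[0] for h in ranked])
```

Here the fresh document overtakes the slightly closer but year-old one, which a pure embedding ranking would not do.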
LanceDB is the embedded option: no cluster management, runs in the same process as the application, fits local desktop tools or small on-prem installations that are not exposed to the network.
When a dedicated vector DB is unnecessary
For very small data volumes – under 10,000 entries – any dedicated vector DB is overkill. A SQLite table with the sqlite-vec plugin or a numpy array file with brute-force cosine search is enough and far simpler to run. Response time stays under 50 ms, code under 30 lines.
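A brute-force search of this kind is short enough to show in full. The sketch below uses only the Python standard library; the in-memory dictionary and the sample vectors are assumptions, and numpy would do the same work vectorised.

```python
import heapq
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_top_k(query: List[float], vectors: Dict[str, List[float]],
                      k: int) -> List[Tuple[float, str]]:
    # Score every stored vector and keep the k best: exact search, no index.
    return heapq.nlargest(k, ((cosine(query, v), key) for key, v in vectors.items()))


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = brute_force_top_k([1.0, 0.1], vectors, k=2)
print([key for _, key in result])
```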
A vector DB is equally unnecessary when the data fits inside an LLM context window. Modern models accept 200k to 2M tokens of context (as of May 2026: the current top Claude model with 1M, Gemini 2.5 Pro with 2M); a 30-page guideline fits entirely in the prompt. Anyone working with under 100,000 tokens needs no embedding and no retrieval – saving a day of pipeline work and removing a failure mode.
For pure full-text search without a semantic component, Meilisearch, Typesense, or Elasticsearch without kNN fit better. Anyone searching for "Müller" and wanting only "Müller" (not "Schmidt" because semantically related) gets better results from classical BM25 than from embedding search.
A vector DB is also a poor fit for update-heavy use cases. Vector indexes are optimised for "append plus search"; frequent updates on individual vectors (e.g. a user profile that changes constantly) force re-indexing and cost performance. In such cases a classical database with triggered re-embeddings is superior.
Trade-offs
STRENGTHS
- Semantic similarity rather than keyword-only search – relevant hits on fuzzy queries
- Scales to millions of entries with no prompt-limit issues
- Self-host options give full control over data location and licensing
- Open-source choice spans embedded (LanceDB) to cluster mode (Milvus)
WEAKNESSES
- A second database alongside Postgres adds operations and backup overhead
- Filter performance varies sharply – the wrong choice costs latency or recall
- Cloud options (Pinecone, Weaviate Cloud) scale non-linearly in price as data grows
- Switching the embedding model forces a re-index of all existing vectors
FAQ
Which vector DB is fastest?
In ANN benchmarks at 10M vectors, Qdrant, Milvus, and Vespa lead with p99 latencies around 10-20 ms. Pinecone Serverless sits in a similar range (cloud round-trip included). pgvector is 2-5x slower at comparable recall, but the gap is rarely noticeable since other pipeline steps (embedding, generation) dominate. Speed is rarely the deciding factor.
Can I switch vector DBs later?
Yes, the switch is straightforward but not free. Migrating 1M vectors from Pinecone to Qdrant typically takes half a day including verification: export as NDJSON, remap metadata onto Qdrant payload, upsert in batches, spot-check the top-k hits. What costs more than the migration: re-implementing all application-side filter and search calls that were tuned for the old platform.
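The remap-and-batch part of such a migration can be sketched without any client library. The NDJSON layout (one record per line with `id`, `values`, `metadata`) is an assumption about the export; the target shape uses Qdrant's `id`, `vector`, `payload` fields. Since Qdrant point IDs must be unsigned integers or UUIDs, the sketch derives a UUID from the original string ID and keeps the original under a hypothetical `source_id` payload key.

```python
import json
import uuid
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def remap_record(record: Dict) -> Dict:
    # Pinecone-style record -> Qdrant-style point with a deterministic UUID.
    return {
        "id": str(uuid.uuid5(uuid.NAMESPACE_URL, str(record["id"]))),
        "vector": record["values"],
        "payload": {**record.get("metadata", {}), "source_id": record["id"]},
    }


def read_ndjson(lines: Iterable[str]) -> Iterator[Dict]:
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines in the export
            yield remap_record(json.loads(line))


def batched(points: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    it = iter(points)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


export = [
    '{"id": "a", "values": [0.1, 0.2], "metadata": {"client": 42}}',
    '{"id": "b", "values": [0.3, 0.4], "metadata": {"client": 42}}',
    '{"id": "c", "values": [0.5, 0.6], "metadata": {"client": 7}}',
]

batches = list(batched(read_ndjson(export), size=2))
print([len(b) for b in batches])
```

Each batch would then be handed to the upsert call of the target client; the spot-check of top-k hits runs against both systems with the same query vectors.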
Do I need GPU hardware for vector search?
For pure search load, CPUs are enough across all ten options. GPU only becomes relevant when you also run the embedding model locally (e.g. BGE-large on your own hardware) or the language model itself runs locally. Milvus can use GPU indexes (FAISS backend), worth it only at several hundred million vectors with high search QPS. For typical fiduciary setups: no GPU needed.
What does a production vector DB cost per month?
Self-hosted on Hetzner: typically CHF 30-80/month for the server including RAM and SSD up to 10M vectors. Qdrant, Weaviate, Milvus, pgvector, and Chroma are free as software. Pinecone Serverless from USD 0.30 per 1M storage operations plus compute – about USD 30-80/month for a five-person fiduciary with 200 queries/day. Weaviate Cloud Flex from USD 45/month, Weaviate Standard from USD 280/month. Pinecone and Weaviate Cloud get expensive once data or queries scale.
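How much RAM and SSD a corpus needs depends heavily on the embedding dimension. A rough estimate of the raw vector size, assuming 4-byte float32 values and ignoring index, payload, and replication overhead, can be computed as follows.

```python
def raw_vector_gb(vector_count: int, dimensions: int, bytes_per_value: int = 4) -> float:
    # Raw size only: vectors x dimensions x bytes per value, in decimal gigabytes.
    return vector_count * dimensions * bytes_per_value / 1e9


for dims in (384, 768, 1536, 3072):
    print(dims, "dimensions:", round(raw_vector_gb(10_000_000, dims), 1), "GB")
```

For 10M vectors this ranges from roughly 15 GB at 384 dimensions to roughly 123 GB at 3072 dimensions, so the dimension of the chosen embedding model belongs in any server sizing.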
Sources
- Qdrant documentation and benchmarks (HNSW, payload indexes, snapshots) · 2026-05
- Weaviate Cloud pricing – Flex, Standard, Premium tiers (October 2025 restructure) · 2026-05
- pgvector v0.8+ release notes – HNSW and IVFFlat indexes · 2026-04
- MarkTechPost – Best Vector Databases in 2026: Pricing, Scale Limits, Architecture Tradeoffs · 2026-05
- Pinecone Serverless pricing and EU region (eu-west-1) · 2026-05
- ANN-Benchmarks – open methodology for nearest-neighbour search · 2026-03