RUNPOD vs VAST.AI vs HETZNER GPU - DUEL

RunPod vs Vast.ai vs Hetzner GPU - where to train and infer in 2026?

Three GPU cloud models. RunPod as a curated on-demand platform, Vast.ai as a spot marketplace, Hetzner as EU-based reserved hosting - price and profile comparison as of May 2026.

Researched & fact-checked by: DuneDive LLC · As of: 2026-05

What is the duel about?

Three paths lead to a GPU hour for LLM training or inference in May 2026. RunPod is a curated on-demand cloud from the US with data centres in Europe and Asia. Vast.ai is an auction-style marketplace where private individuals and small providers offer spare GPU capacity. Hetzner is a German hosting provider that rents dedicated servers with GPUs as reserved hardware - no on-demand booking, no hourly billing.

The three models differ fundamentally in price formation and availability. RunPod and Vast.ai are elastic: you book a GPU for an hour, a day or a week, pay per second, and return it. Hetzner is static: you rent a server for at least a month, and the GPU is yours for that whole term. That shifts the economics significantly: anyone who needs a GPU only sporadically is better off with RunPod or Vast.ai. Anyone who hosts a model 24/7 comes out cheaper on Hetzner reserved pricing than on on-demand rates after roughly three to five weeks of continuous load.
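The break-even between a flat monthly fee and an hourly rate is a single division. The snippet below is a minimal sketch, not a pricing tool: the function name is hypothetical, and the example figures (a flat fee of about USD 670 per month against USD 1.89 per hour) are taken from the prices quoted in this comparison purely for illustration.

```python
def break_even_hours(monthly_fee: float, hourly_rate: float) -> float:
    """Hours of use per month above which the flat monthly fee is cheaper."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return monthly_fee / hourly_rate


# Illustrative figures in the same currency on both sides (assumed, see above).
hours = break_even_hours(monthly_fee=670.0, hourly_rate=1.89)
print(f"break-even after {hours:.0f} h, i.e. {hours / 24:.1f} days of continuous load")
```

The result depends heavily on which cards are compared. Against a cheaper on-demand card at USD 0.40 per hour the same flat fee only pays off after 1,675 hours, which is more than the roughly 720 hours in a month.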

Important for Swiss fiduciary and law-office setups: only Hetzner keeps the data unambiguously in the EU (Germany or Finland). RunPod does have EU regions (Sweden, Netherlands), but the platform operator is a US company, which matters under a strict reading of the Swiss Data Protection Act (DSG). Vast.ai auctions capacity globally; the location of a specific host cannot be reliably predicted.

Why the choice matters

Three hard axes decide which provider fits the concrete use case: price per GPU hour, availability and EU data protection.

Price per GPU hour as of May 2026: Vast.ai is the cheapest for data-centre cards. RTX 4090 around USD 0.31-0.40 per hour, A100 80 GB from USD 0.67 per hour when a high-reliability host is available. RunPod sits in the middle on the A100 and is on par with Vast.ai on the RTX 4090: Community Cloud RTX 4090 from USD 0.29 per hour, A100 80 GB from USD 1.39 per hour, H100 80 GB from USD 2.39 per hour. On the Secure Cloud tier (more reliable, firmer SLA) prices are typically 30-40 percent higher, with a smaller gap on the H100: A100 80 GB from USD 1.89, H100 80 GB from USD 2.69. Hetzner plays in a different league: GPU servers with RTX 6000 Ada or L40S cost EUR 600-1500+ per month - cheap when amortised per hour, but only with a one-month minimum term.
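"Cheap when amortised per hour" only holds if the server is actually busy. As a rough sketch, the snippet below spreads a monthly fee over the hours used. The function name is hypothetical, the 720-hour month is the convention used in this comparison, and the utilisation levels are arbitrary examples.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 h


def effective_hourly(monthly_fee: float, hours_used: float) -> float:
    """Monthly fee spread over the hours the GPU is actually busy."""
    if not 0 < hours_used <= HOURS_PER_MONTH:
        raise ValueError("hours_used must be in (0, 720]")
    return monthly_fee / hours_used


# Lower and upper end of the quoted EUR 600-1500 monthly range.
for fee in (600.0, 1500.0):
    for used in (720, 360, 100):
        rate = effective_hourly(fee, used)
        print(f"EUR {fee:>6.0f}/month, {used:>3} h used -> EUR {rate:.2f}/h")
```

At full utilisation the EUR 600 server works out to about EUR 0.83 per hour; at 100 hours per month the same server costs EUR 6.00 per hour of actual use.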

Availability: RunPod delivers the highest consistency. Anyone wanting to reserve an A100 for next week gets one in most regions in May 2026 without delay. Vast.ai depends on market conditions - in high-demand periods the best prices vanish, some hosts go offline, the promised RAM is sometimes less than stated. Hetzner needs lead time: GPU servers are not always immediately available, typical wait one to four days.

EU data protection: Hetzner is the clean choice here. Servers in Falkenstein, Nuremberg or Helsinki, German provider, contractually manageable under GDPR/DSG. RunPod has EU regions, but as a US corporation it remains exposed to the CLOUD Act - a transfer impact assessment (TIA) must document that risk. Vast.ai offers no firm footing for data protection: the contract runs with Vast.ai (US), while the physical hardware sits with a third-party host somewhere in the world. For inference on sensitive client data: unsuitable. For training on anonymised or synthetic data: usable.

The three providers in detail

RunPod (US, San Francisco): on-demand cloud with two tiers. Community Cloud is cheaper and bundles capacity from partner-owned data centres - typically USD 0.29-1.39 per hour for RTX 4090 through A100 80 GB. Secure Cloud is the premium tier in RunPod's own data centres, with a higher availability guarantee and prices about 30-40 percent higher. Pods boot from a Docker image: pick the GPU with a click, and SSH or Jupyter access is available immediately. Serverless endpoints handle auto-scaling inference. Available GPUs in May 2026: RTX 4090, RTX 6000 Ada, L40S, A40, A100 40/80 GB, H100 80 GB, H200, AMD MI300X. Data centres in the US, Sweden, Netherlands, Singapore, India. Per-second billing, network traffic included in the price in most regions.

Vast.ai (US, San Francisco): marketplace model, no data centres of its own. Anyone with a spare GPU lists it on Vast.ai at their own price. Anyone needing a GPU searches the listings by GPU type, RAM, CPU, network bandwidth and host reliability score. Prices fluctuate strongly: RTX 4090 USD 0.31-0.50 per hour, A100 80 GB USD 0.67-1.20 per hour depending on market state. Advantages: significantly cheaper than RunPod on A100-class cards, large capacity with flexible search. Disadvantages: host quality varies (slow disks, congested networks), no guaranteed SLA, some hosts kill pods during energy-price peaks. Best practice: pick hosts with reliability scores above 99 percent, push your own snapshots to S3, keep no long-term state on the box.
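The "reliability above 99 percent, then cheapest" rule is easy to express as a filter. The sketch below works on plain in-memory records. The `Offer` class, its field names and the sample listings are all hypothetical and do not reflect Vast.ai's actual API or data schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One marketplace listing. Field names are hypothetical, not Vast.ai's schema."""
    host_id: str
    gpu: str
    usd_per_hour: float
    reliability: float  # fraction between 0 and 1


def shortlist(offers: list[Offer], gpu: str, min_reliability: float = 0.99) -> list[Offer]:
    """Keep offers for `gpu` above the reliability floor, cheapest first."""
    eligible = [o for o in offers if o.gpu == gpu and o.reliability > min_reliability]
    return sorted(eligible, key=lambda o: o.usd_per_hour)


# Made-up sample listings, for illustration only.
sample = [
    Offer("host-a", "A100 80 GB", 0.67, 0.985),
    Offer("host-b", "A100 80 GB", 0.82, 0.996),
    Offer("host-c", "A100 80 GB", 1.10, 0.999),
    Offer("host-d", "RTX 4090", 0.31, 0.997),
]
for offer in shortlist(sample, "A100 80 GB"):
    print(offer.host_id, offer.usd_per_hour, offer.reliability)
```

In this made-up sample the cheapest A100 listing drops out because its host misses the reliability floor, which is the trade-off the best practice accepts: a somewhat higher price in exchange for fewer interrupted runs.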

Hetzner Dedicated GPU (Germany, Nuremberg/Helsinki/Falkenstein): reserved hardware, not a cloud service. You rent a physical server with a fixed GPU for at least one month. Product line as of May 2026: GEX44 with RTX 6000 Ada (48 GB VRAM) from EUR 599/month, EX130-S with RTX 4090 from EUR 380/month (high demand, wait list), larger models with L40S or multi-GPU on request. Setup fee typically EUR 0-149 one-time. Contractually GDPR-compliant, German law, no third-country risk. Practically no GPU choice - the models are fixed, no hot-swap, no elastic scaling. If you need more GPU capacity: order a second server.

GPU cloud selection in 6 steps

01 Define the load profile: sporadic (training burst) = Vast.ai; spiky (auto-scaling inference) = RunPod Serverless; constant 24/7 = Hetzner.
02 Check data sensitivity: client data = Hetzner mandatory; anonymised/synthetic = all three; public data = cheapest provider.
03 Quantify GPU need: 24 GB VRAM (RTX 4090) is enough for 7B models in FP16 and 13B models in 8-bit; 48 GB (RTX 6000 Ada / L40S) for 70B in 4-bit; 80 GB (A100/H100) for 70B in 8-bit with little headroom, or for several smaller models. Unquantised 70B in FP16 needs about 140 GB for the weights alone, so at least two 80 GB cards.
04 Estimate cost: compare the Hetzner monthly price with the RunPod hourly price times expected usage hours (or divide the monthly price by the usage hours to get an effective hourly rate). Break-even at about three to five weeks of continuous load.
05 PoC on RunPod Community Cloud: test for two days, validate the workload, then switch to the production provider.
06 Pick the production provider: Hetzner for constant load and EU obligation; RunPod Secure Cloud for elastic production; Vast.ai only for non-sensitive training use.
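Step 03 can be sanity-checked with back-of-the-envelope arithmetic: the weights take roughly parameters times bits per weight divided by eight. The sketch below does that. The function names are hypothetical, and the 1.2 overhead factor for KV cache and runtime buffers is an assumption for illustration, since real usage depends on context length and batch size.

```python
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, vram_gb: float,
         overhead: float = 1.2) -> bool:
    """True if the weights times an assumed overhead factor fit in `vram_gb`.

    The default overhead of 1.2 is a rough allowance, not a measured value.
    """
    return weight_gb(params_billion, bits_per_weight) * overhead <= vram_gb


cases = [(7, 16, 24), (13, 16, 24), (13, 8, 24), (70, 4, 48), (70, 16, 80)]
for params, bits, vram in cases:
    need = weight_gb(params, bits)
    verdict = "fits" if fits(params, bits, vram) else "does not fit"
    print(f"{params}B at {bits}-bit: {need:.0f} GB of weights, {verdict} in {vram} GB")
```

The quantisation level decides which tier a model lands in: 13B moves from "does not fit" to "fits" on a 24 GB card when going from 16-bit to 8-bit, and 70B at 16-bit exceeds a single 80 GB card on weights alone.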

Recommendation by scenario

Sporadic training or fine-tuning, 10-50 hours per month: Vast.ai. At USD 0.67 per hour for an A100 80 GB, a 30-hour training run costs around USD 20 - the lowest figure among the three providers. The data must be anonymised or synthetic; for client data the unpredictable host location rules Vast.ai out.

Production inference with auto-scaling, sometimes 0 and sometimes 50 parallel requests: RunPod Serverless. It scales up and down on demand and bills per second of worker time, so capacity scaled down to zero costs nothing. Latency is slightly higher than on a dedicated GPU (cold start 5-20 seconds), but on a spiky load profile the cost is significantly lower than for 24/7 reserved hardware.

24/7 inference with constant load, fiduciary or law-office application, EU/DSG-compliant: Hetzner GEX44 or EX130-S. EUR 380-599 per month for a card with 24-48 GB VRAM, permanently available, hosted in Germany, German law. From roughly three weeks of continuous operation per month, Hetzner is cheaper than RunPod Secure Cloud.

LLM training on own client data (rare, but it happens): Hetzner. Data does not leave the EU, and the contract with the provider is GDPR-compliant. Training takes GPU hours - for a 7B model on 10 GB of training data, two to five days on a single card - which fits comfortably inside a one-month rental.

PoC with unclear duration, "test three days, then see": RunPod Community Cloud. Quickly spun up, quickly torn down, pay per second. Per-second billing forgives aborted attempts better than a Hetzner monthly contract.

Multi-GPU training, 8x H100 or more: RunPod Secure Cloud or hyperscaler-class providers. Hetzner has no 8x-H100 boxes in its standard programme as of May 2026; Vast.ai hosts with 8 GPUs on one node are rare and expensive.
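The single-GPU scenarios above reduce to two questions: is client data involved, and what does the load look like? The sketch below encodes that mapping as a lookup. The enum, the function name and the decision to check client data first are illustrative assumptions, and the multi-GPU case is deliberately left out.

```python
from enum import Enum


class Load(Enum):
    SPORADIC = "sporadic training or fine-tuning"
    SPIKY = "auto-scaling inference"
    CONSTANT = "24/7 inference"
    POC = "proof of concept, duration unclear"


def recommend(load: Load, client_data: bool) -> str:
    """Return the provider this comparison suggests for a single-GPU scenario."""
    if client_data:
        # Client data makes Hetzner mandatory, regardless of the load profile.
        return "Hetzner"
    return {
        Load.SPORADIC: "Vast.ai",
        Load.SPIKY: "RunPod Serverless",
        Load.CONSTANT: "Hetzner",
        Load.POC: "RunPod Community Cloud",
    }[load]


print(recommend(Load.SPORADIC, client_data=False))
print(recommend(Load.SPORADIC, client_data=True))
```

Checking data sensitivity before the load profile mirrors the priority given to data protection throughout this comparison: a cheap provider is not an option if the data may not go there.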

When no GPU cloud fits

If load stays low (below 5 million tokens per month, occasional requests) and no DSG argument speaks against cloud APIs, an API from OpenAI, Anthropic or Mistral is simply cheaper than your own GPU. An A100 80 GB on RunPod Secure Cloud at 50 percent utilisation costs around USD 680 per month - that buys roughly 50-150 million tokens on the current top GPT model.
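The USD 680 figure and the token range can be tied together with two small functions. This is a rough sketch with hypothetical function names. It does not quote any real API price list; it only derives the blended per-million-token price that the 50-150 million token range implies.

```python
def gpu_monthly_cost(usd_per_hour: float, utilisation: float, hours: int = 720) -> float:
    """Cost of an on-demand GPU that is billed only for the hours it runs."""
    return usd_per_hour * hours * utilisation


def implied_price_per_million(budget_usd: float, million_tokens: float) -> float:
    """API price per million tokens at which `budget_usd` buys `million_tokens`."""
    return budget_usd / million_tokens


budget = gpu_monthly_cost(1.89, 0.5)  # A100 80 GB, Secure Cloud, 50 percent utilisation
print(f"GPU budget: USD {budget:.0f} per month")
for volume in (50, 150):
    price = implied_price_per_million(budget, volume)
    print(f"{volume} M tokens for that budget implies USD {price:.2f} per million tokens")
```

At a workload of 5 million tokens per month the API bill stays a small fraction of the GPU budget under either implied price, which is why the GPU does not pay off at low volume.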

If you need a GPU only for vector embedding (RAG pipeline without a local LLM), a CPU box suffices. OpenAI text-embedding-3-small costs about USD 0.02 per million tokens - cheaper than any GPU hour for the same volume. GPU pays off only when the language model itself is to run locally.

If you have no Linux sysadmin on the team and no external DevOps partner: Vast.ai and Hetzner require sysadmin work; RunPod is slightly friendlier thanks to prebuilt Docker templates. In that case a managed LLM cloud (Anthropic, OpenAI, Mistral La Plateforme) is more comfortable than running your own GPUs.

If compliance audits require every storage location to be documented, Vast.ai is problematic - the physical host can change, and each change is potentially a new third-country transfer. Hetzner is clearly documentable here; RunPod is acceptable with a fixed region.

Trade-offs

STRENGTHS

RunPod: curated on-demand cloud, per-second billing, serverless endpoints, broad GPU range from RTX 4090 to H200
Vast.ai: cheapest market, RTX 4090 from USD 0.31/h, A100 80 GB from USD 0.67/h, huge capacity when reliability score >99
Hetzner: EU GDPR-compliant, German law, permanent availability, cheaper than RunPod Secure from about 3 weeks of 24/7 operation
All three: no hyperscaler lock-in, OSS-friendly, no minimum contracts on RunPod/Vast.ai

WEAKNESSES

RunPod: US corporation, Cloud Act risk on strict DSG interpretation, EU regions exist but not legally decoupled
Vast.ai: host geography uncertain, host quality varies, some hosts kill pods under market pressure, unsuitable for client data
Hetzner: no on-demand, minimum 1 month, GPU range limited (no H100 in standard programme), availability not always immediate
All three: GPU ops require Linux sysadmin work, no "click solution" like a managed LLM API

FAQ

How much does 24/7 Llama 3.3 70B inference cost as of May 2026?

On an A100 80 GB at RunPod Secure Cloud (USD 1.89/h x 720 h) about USD 1360 per month. On RunPod Community Cloud RTX 4090s (USD 0.29-0.40/h, about USD 210-290 per card and month) the model needs two cards even in 4-bit quantisation, because roughly 35 GB of weights do not fit into 24 GB of VRAM - so about USD 420-580 per month, with availability risk. On a Hetzner GEX44 (RTX 6000 Ada, 48 GB) a fixed EUR 599/month - roughly USD 670, with full EU GDPR compliance. Rule of thumb: Hetzner is cheaper than RunPod Secure from about three weeks of continuous operation per month.

Is Vast.ai suitable for client data?

No, not without anonymisation. Vast.ai is a marketplace with hosts worldwide - the concrete location of a specific server is not reliably predictable. Unsuitable for DSG-compliant inference on client data. Usable for training on anonymised or synthetic data, with pseudonymisation and host-reputation checks.

What differs between RunPod Community and Secure Cloud?

Community Cloud bundles capacity from partner-owned data centres - cheaper but with larger variation in availability and performance. Secure Cloud is RunPod-owned data centres with higher SLAs - typically 30-40 percent more expensive. For serious production: Secure Cloud. For PoCs, research, training bursts: Community Cloud is enough.

What H100 options exist as of May 2026?

RunPod Secure Cloud offers H100 80 GB from USD 2.69 per hour; H200 is available in selected regions. Vast.ai hosts with H100 are available, mostly around USD 2.00-3.50 per hour - no dramatic price edge over RunPod, and with a weaker SLA. Hetzner does not list the H100 in its standard programme as of May 2026; it is available on request with a few weeks of lead time, priced individually.

Sources

RunPod - GPU pricing page · 2026-05
Vast.ai - live GPU marketplace pricing · 2026-05
Hetzner - dedicated GPU server lineup · 2026-05
Spheron - GPU cloud pricing comparison 2026 · 2026-04

