GOOGLE GEMINI · LLM PROVIDER

Google Gemini in Swiss practice: Vertex AI, Zurich region and data flow

Gemini 2.5 Pro/Flash/Flash-Lite via Vertex AI in europe-west6 (Zurich) or europe-west3 (Frankfurt). What that means for revDSG, price and model choice.

Researched & fact-checked by: DuneDive LLC · As of: 2026-05

What is Google Gemini?

Gemini is Google's LLM family. In May 2026 three production generations are available: Gemini 2.5 Pro (the flagship, 1M-token context, multimodal), Gemini 2.5 Flash (the workhorse with strong price-performance) and Gemini 2.5 Flash-Lite (the cheap high-volume variant). The newer Gemini 3.x generation is not yet unlocked in EU regions in May 2026, so 2.5 is the practical baseline for Swiss compliance.

For B2B usage there are two paths: first, the Gemini Developer API at ai.google.dev (consumer-style endpoint, quick to set up, but non-EU data flow by default); second, Vertex AI on Google Cloud Platform (enterprise path, with chosen region, IAM, audit logs, no training on customer data). For fiduciary and law firms in Switzerland only the Vertex AI path is realistically discussable.

Vertex AI offers two EU regions relevant to Swiss customers: europe-west3 (Frankfurt) and europe-west6 (Zurich). The latter is the only Google Cloud region in Switzerland and the natural choice when data must not leave the country. But not every Gemini model is available in every region. Gemini 2.5 Pro and 2.5 Flash run in europe-west3 and europe-west6, Flash-Lite as well. Pre-GA models and multimodal-live variants are often restricted to US regions or global-multi-region.

Why it matters

Three reasons why Gemini deserves a place in a Swiss stack: the Zurich region, the 1M context window and Flash-Lite pricing.

The europe-west6 region is the only argument Google has that the other US providers do not. An OpenAI or Anthropic call – even via EU datacentres – ends in a US-controlled legal space. Vertex AI in europe-west6 keeps the data physically in Switzerland; Google Switzerland GmbH is the Swiss contracting party; a DPA with Standard Contractual Clauses plus Swiss addendum is available. Not a free pass, but the clean variant for revDSG-sensitive cases.

The 1M-token context is the second point. A complete client file (contracts, correspondence, accounting) plus a year of transaction data fits in one Gemini 2.5 Pro prompt. Where you would otherwise build RAG or chop documents, Gemini can process the raw file. Not always cheap, but it saves setup work.

Third point: Flash-Lite is the cheapest premium model on the market. At USD 0.10 per 1M input and USD 0.40 per 1M output tokens (May 2026, ai.google.dev) it enables applications an OpenAI call would not justify: mass classification, lead scoring, stage-1 triage in client FAQ. Batch API drops that to USD 0.05/0.20.

How it works

The Vertex AI path looks like this: create a GCP project, enable the Vertex AI API, create a service account with roles/aiplatform.user, pin the region to europe-west6 or europe-west3. Calls go to europe-west6-aiplatform.googleapis.com (or via a LiteLLM gateway, which we recommend so model failover and logging land centrally).

Model call: Vertex AI takes the model as a fully qualified name, e.g. publishers/google/models/gemini-2.5-pro. Pricing is billed against the endpoint datacentre, not the caller origin. Gemini 2.5 Pro input per 1M tokens runs USD 1.25 (up to 200k) or USD 2.50 (above), output USD 10.00 (or USD 15.00). Gemini 2.5 Flash: USD 0.30 input / USD 2.50 output. Flash-Lite: USD 0.10 / USD 0.40. Batch API yields a 50% discount for async workloads – good for nightly document recognition or client reports.

Data retention: Vertex AI logs 30 days by default for abuse detection. On request (form, per project or per billing ID) Zero Data Retention can be enabled – then prompts and responses are not stored beyond response lifetime. Caching can be disabled separately. Training on customer data is contractually excluded, even before 2026. That clause only affected Google AI Studio / Gemini consumer apps, not Vertex AI.

Multimodal inputs (PDF, image, audio, video) go in directly. Gemini 2.5 Pro can read a 100-page PDF and extract specific tables. For document recognition that is often faster than a dedicated OCR stack.

Gemini onboarding for a Swiss SME

01Data classification: which data will flow through Gemini? Client name / receipt text / content? Map to confidentiality tiers.
02Open a GCP account through Google Switzerland GmbH, with a Swiss billing address and Swiss contracting party.
03Create a project with region lock on europe-west6. Organisation policy: resource.locations = europe-west6, europe-west3.
04Sign the DPA + Standard Contractual Clauses + Swiss addendum. For sensitive data: request the Zero Data Retention form.
05Minimal IAM: one service account per app, roles/aiplatform.user, no owner. Keys in Secret Manager.
06Model choice at the call layer: Gemini 2.5 Flash-Lite as default, Gemini 2.5 Pro for complex cases, routing via LiteLLM.
07Enable audit logging to Cloud Logging, Sentry integration for errors, Loki for latency tracking.

When to use Gemini

Gemini is the right choice when (a) data flow must stay in Switzerland or the EU, (b) you want to process very long documents in one call, or (c) you need high-volume classification on a tight budget.

Typical uses: client onboarding with full-file reading (2.5 Pro, 1M context, europe-west6), batch document recognition (Flash-Lite, USD 0.10 input, batch API), support-ticket triage (Flash-Lite with RAG), image/PDF analysis for tax assessments. Multimodal voice agents also work via Gemini-Live in Frankfurt – where the live API is EU-available.

Compared with the GPT/Claude competition: Gemini clearly wins on price and on context. Claude Opus is better at legal reasoning, the current top GPT model at creative writing. For 80% of daily fiduciary work Gemini 2.5 Flash is enough – and costs a fraction.

When not to use

Gemini is the wrong choice when you need a pure Swiss-sovereignty solution with no US provider in the chain. Google remains a US parent company, subject to the CLOUD Act. Vertex AI in europe-west6 is very good but not a Swiss host. If a mandate excludes that, the solution belongs on Mistral (EU provider), Swisscom Sovereign Cloud or a self-hosted Ollama instance.

Further cases: workloads needing creative writing or legally precise reasoning are better served by the current top Claude model. Tool-use / function-calling with complex logic is the current top GPT model territory. For specialised tasks (code review, math proofs) open-source models (Llama 4, the current DeepSeek-V generation) are often better.

Watch out for Gemini 3.x models: typically launched in US regions only. If you need 3.x you must wait or negotiate an EU-DPA exception – not a practical route for a 5-person fiduciary.

Gemini Developer API (ai.google.dev) without Vertex AI is not recommended for Swiss B2B: standard terms allow broader data use and the region is not selectable.

Trade-offs

STRENGTHS

Only LLM family with a Swiss cloud region (europe-west6, Zurich)
1M-token context: process full client files without RAG
Flash-Lite at USD 0.10/0.40 per 1M tokens – cheaper than most open-source hosters
Vertex AI: no training on customer data, Zero Data Retention on request
Multimodal native: PDF, image, audio directly in the call

WEAKNESSES

US parent: CLOUD Act remains a theoretical residual risk despite EU region
Gemini 3.x models not available in EU in May 2026 – version-jump risk
Reasoning quality lags Claude Opus for complex legal logic
Vertex AI onboarding is administratively heavier than an OpenAI key
Multi-region failover must be configured explicitly, otherwise single-region risk

FAQ

Does data really stay in Switzerland if I use europe-west6?

Inference processing and the default 30-day logging happen in the region, i.e. Zurich. What does not strictly stay in the region: aggregated telemetry, billing data, account metadata. Google is a US parent subject to the CLOUD Act – a US authority can theoretically demand disclosure. For revDSG purposes europe-west6 is very good; for absolute sovereignty you need Sovereign Cloud models or self-hosting.

Which Gemini model should I route as default?

Gemini 2.5 Flash. Pricing is USD 0.30 input / USD 2.50 output per 1M tokens, with the same 1M context window as Pro, and it covers 80% of daily office work. Pro is called only for complex reasoning or long legal analysis. Flash-Lite is for high-volume classification. LiteLLM rule: default Flash, escalate-on-low-confidence to Pro.

Can I use Gemini 3.x in Switzerland?

In May 2026, no. Google has announced a newer Gemini generation, but EU regions – including europe-west6 – are not served at launch. Pre-GA models run in US regions or global-multi-region. Anyone needing to stay in Switzerland sticks with Gemini 2.5 Pro/Flash/Flash-Lite for now. All three are production-ready and not pre-GA as of May 2026.

How does Gemini relate to the EU AI Act?

Gemini 2.5 Pro is classified as a general-purpose AI model with systemic risk (>10^25 FLOP training compute). Google delivers the required model cards, risk assessments and training-data summary called for in Art. 53 AI Act. For the deployer (= the Swiss SME using Vertex AI) that means: document in which process the model runs, which risk class the application has (usually limited risk) and which transparency notices clients must see.

Sources

Google Cloud Vertex AI – Generative AI Pricing (Gemini 2.5 Pro/Flash/Flash-Lite) · 2026-05
Vertex AI – Data Residency and Locations (europe-west6 Zurich, europe-west3 Frankfurt) · 2026-04
Gemini API – Pricing reference (Developer endpoint) · 2026-05
Vertex AI – Zero Data Retention Setup · 2026-03
GCP Model Availability – europe-west6 (Zurich) catalogue · 2026-05

FITS YOUR STACK?

What this looks like in your business – a 30-minute intro call.

Book a call