fairlane.systems

APERTUS · COMPLIANCE

Apertus: the open Swiss AI model from ETH Zurich, EPFL and CSCS – status May 2026

Apertus 8B and 70B, Apache 2.0, from ETH/EPFL/CSCS. Released 2 Sep 2025, 15T tokens, 1000+ languages including Swiss German and Romansh. As of May 2026: production experience and Apertus 2 outlook.

As of: 2026-05

What is Apertus?

Apertus is the first fully open large language model from Switzerland. It was jointly developed by ETH Zurich and EPFL, with compute infrastructure from the Centro Svizzero di Calcolo Scientifico CSCS in Lugano. The name "Apertus" is Latin for "open" and reflects the publication philosophy: model weights, training data description, training code and evaluation data are released under Apache 2.0.

The first release wave took place on 2 September 2025 with two variants: Apertus-8B (8 billion parameters, for edge and CPU inference, around 16 GB VRAM in fp16) and Apertus-70B (70 billion parameters, for cloud inference, around 140 GB VRAM in fp16). The models were trained on around 15 trillion tokens with a substantial share of European languages (German, French, Italian, English) plus 1,000+ further languages. Swiss German (Bernese, Zurich, Wallis, Grisons dialects) and Romansh (Sursilvan, Surmiran, Puter, Vallader, Rumantsch Grischun) are explicitly in the training corpus.
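The VRAM figures above follow from simple arithmetic: fp16 stores each parameter in 2 bytes. A minimal sketch of that estimate, assuming 1 GB = 1e9 bytes and counting weights only (KV cache and activations need extra memory on top); the function name is illustrative:

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param defaults to 2.0 (fp16). KV cache and activations
    are deliberately not included in this estimate.
    """
    return params_billion * bytes_per_param


for name, params in [("Apertus-8B", 8), ("Apertus-70B", 70)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB for weights in fp16")
```

The same function can be called with a different `bytes_per_param` to explore other precisions; real deployments should leave headroom beyond the weight size.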

Apertus is an "open-weight" model: the weights are freely downloadable from HuggingFace (swiss-ai/Apertus-70B and swiss-ai/Apertus-8B). In addition, Swisscom operates an official API hosting service for commercial users that runs on CSCS compute capacity. The ETH AI Center and the EPFL AI Center are the academic anchors as of May 2026; CSCS provides the training compute (Alps supercomputer with over 10,000 NVIDIA H100/H200/B200 GPUs).

Why Apertus is strategically relevant for Switzerland

Four reasons make Apertus more than just another open-weight model.

First: legal sovereignty. Training data selection and model behaviour are subject to Swiss and EU law, in particular the FADP, the EU AI Act, and ETH-internal data protection standards. Unlike with US models (GPT, Claude) or Chinese models (DeepSeek, Qwen), foreign authorities have no extraterritorial access claim to the model or training data. For clients with high sovereignty needs (professional secrecy, FINMA exposure), this is a substantial argument.

Second: Swiss language coverage. As detailed in the article "AI and Swiss multilingualism", Apertus-70B is, as of May 2026, the only frontier model with production-ready Romansh capability and with reliable recognition of Swiss Standard German usage (helvetisms, cantonal proper nouns) when correcting text. Mistral Large 2 is on par for French and German, but Romansh is Apertus territory.

Third: openness of training data and reproducibility. Unlike the providers of most open-weight models (Llama, Mistral, DeepSeek), ETH/EPFL fully publish the training data sources and the training setup. This is relevant for FINMA SN 08/2024 Pillar 3 (robustness, model validation) and for EU AI Act Art. 53 (documentation and transparency obligations for providers of general-purpose AI models). As of May 2026 Apertus counts as the world's best-documented frontier model, which also makes it attractive for US and EU audits.

Fourth: hosting options in Switzerland. Apertus runs via the Swisscom API with Swiss data residency, via Infomaniak GPU instances, or fully on premises on your own GPU hardware (2x H100 80GB minimum for 70B inference). For clients with strict data classification, this setup is fully sovereign, which is not possible with US models.

Apertus in practice May 2026

Model architecture. Apertus is a decoder-only transformer very similar to Llama 3, with grouped-query attention and rotary position embeddings. The context window is 128k tokens in both variants. The vocabulary has around 256k tokens and is optimised for European languages plus Romansh. Training pipeline: pretraining on 15T tokens in the first stage, followed by supervised fine-tuning (SFT) and direct preference optimisation (DPO) for chat capability. The RLHF components draw on ETH-internal annotator teams.

Benchmarks May 2026. On MMLU (English, for comparability), Apertus-70B achieves around 78-82 points (per the official model card, Sep 2025), comparable to Llama-3 70B and below the current top GPT and Claude models. On MGSM (multilingual math), Apertus is strong in German, French and Italian and leads in Romansh. On SwissLegalBench (Swiss law), Apertus-70B sits ahead of Llama-3 70B and near Claude 3.5 Sonnet (status 2025) but below the current top Claude model (status May 2026).

Hosting May 2026. Three production paths.

*Path 1: Swisscom API.* Commercial API with Swiss data residency. Pricing is not public as of May 2026; pilot clients report a corridor of CHF 0.40-1.50 per 1M tokens for 70B. The service runs on CSCS compute and offers scalable token quotas. Integration via LiteLLM or direct HTTP API.

*Path 2: Infomaniak GPU instances with Apertus container.* Self-hosting on rented H100/L40S instances. Medium effort: container provisioning, vLLM or TGI as inference server, monitoring. Cost roughly CHF 6,000-12,000/month for a 70B setup with 2x H100.

*Path 3: on-premises on own GPU.* Full sovereignty. Hardware investment CHF 80,000-150,000 for 2x H100 80GB plus server, with additional power, cooling and maintenance costs. Pays off from a workload profile of around 50M+ tokens/month or with highly sensitive clients (criminal defence, hospital, wealth advisory).
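The rent-versus-buy question between Path 2 and Path 3 can be framed as a break-even calculation. The sketch below uses only the cost ranges quoted above; the function name is illustrative, and the running costs of owned hardware (power, cooling, maintenance) are not quantified in this article, so they appear as a parameter that defaults to zero purely as a placeholder assumption.

```python
def breakeven_months(hardware_chf: float,
                     rent_chf_per_month: float,
                     own_running_chf_per_month: float = 0.0) -> float:
    """Months after which buying hardware is cheaper than renting.

    own_running_chf_per_month covers power, cooling and maintenance of
    owned hardware; the default of 0 is a placeholder, not a real figure.
    """
    saving_per_month = rent_chf_per_month - own_running_chf_per_month
    if saving_per_month <= 0:
        raise ValueError("owning never breaks even at these running costs")
    return hardware_chf / saving_per_month


# Most favourable case for buying: cheap hardware vs expensive rental.
best = breakeven_months(80_000, 12_000)
# Least favourable case: expensive hardware vs cheap rental.
worst = breakeven_months(150_000, 6_000)
print(f"Break-even between {best:.1f} and {worst:.1f} months (running costs ignored)")
```

Plugging in a realistic monthly figure for power, cooling and maintenance pushes the break-even point further out, so the range printed here is a lower bound.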

Apertus 2 (in development as of May 2026). ETH and EPFL communicated an Apertus 2 roadmap in February 2026. Expected: expanded multimodality (vision/audio), a dedicated Apertus Voice variant with CH-DE dialect capability, an Apertus Code variant for programming, and possibly an Apertus MoE (mixture of experts) as a 200B/active-30B model. Official release dates are not known as of May 2026; speculation ranges from Q4 2026 to Q2 2027. The Swisscom API service on CSCS compute stays in production and will expand with the Apertus 2 release.

Deploy Apertus in a Swiss strategy – 6 steps

  1. Data classification: what share of requests require sovereignty (client data, professional secrecy, FINMA exposure)?
  2. Volume estimation: how many tokens per month? Below 5M, the Swisscom API suffices; above 50M, on-premises pays off.
  3. Choose hosting path: Swisscom API (fast), Infomaniak (medium), on-premises (sovereign, costly).
  4. Routing setup: LiteLLM or Portkey with rules "sensitive → Apertus, FR → Mistral, hard reasoning → Claude".
  5. Benchmark against the use case: run 50-100 real client requests through each of Apertus, Claude and Mistral and measure the hit rate.
  6. Periodic re-test every 6 months: check the Apertus 2 roadmap and adjust routing rules.
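The benchmark in step 5 boils down to counting hits per model. A minimal sketch, assuming each answer has already been graded as hit or miss (for example by a human reviewer); the function name and the sample gradings are hypothetical and only illustrate the bookkeeping:

```python
from collections import defaultdict


def hit_rates(results):
    """Compute the hit rate per model.

    results: iterable of (model_name, is_hit) pairs, one per graded answer.
    Returns a dict mapping model_name to a hit rate between 0.0 and 1.0.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for model, is_hit in results:
        totals[model] += 1
        hits[model] += 1 if is_hit else 0
    return {model: hits[model] / totals[model] for model in totals}


# Hypothetical gradings, not real benchmark data.
graded = [
    ("apertus", True), ("apertus", False),
    ("claude", True), ("claude", True),
    ("mistral", True), ("mistral", False),
]
for model, rate in hit_rates(graded).items():
    print(f"{model}: {rate:.0%}")
```

Keeping the raw gradings (rather than only the rates) makes the 6-month re-test in step 6 comparable, because the same requests can be re-run and re-graded.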

When Apertus is the right choice

Four configurations clearly favour Apertus.

First: clients under professional secrecy (law firms, doctors, fiduciary offices with wealth mandates, notaries). Full data-location control in Switzerland and avoidance of extraterritorial access risks are mandatory. Apertus on-premises or Swisscom API with Swiss data residency fulfils that.

Second: Swiss language mix with RM or CH-DE component. Fiduciary offices in Grisons, advisory in the Engadin region, Ticino mandates with local dialect contact, Bernese client triage with Schwizerdütsch emails. In production use, Apertus is stronger than the US competition here.

Third: compliance-driven setups with audit requirements. FINMA SN 08/2024 Pillar 3 (model validation) and EU AI Act Art. 53 (transparency for GPAI) require model documentation. Apertus delivers it out of the box: model card, training data description and evaluation suite are freely available.

Fourth: consolidated LLM strategy with primary CH/EU hosting. A multi-provider strategy (Apertus + Mistral + Claude fallback) is the most frequent configuration at Swiss consulting boutiques in May 2026: Apertus covers the sovereign core, Mistral covers top-tier French and EU hosting, and Claude remains the top frontier fallback for difficult cases.

When other models are better

Three patterns argue against Apertus as the first choice.

Complex reasoning cases needing top frontier. For very hard reasoning (math olympiad level, complex legal argument, code-generation-heavy tasks), Apertus-70B in May 2026 noticeably trails the current top Claude and GPT models. If such cases dominate, one of those models should be the primary model.

Maximum efficiency on a small budget. Apertus self-hosting with GPU pays off only at significant volume. Below 5M tokens/month, a small OpenAI model (e.g. GPT-mini class) or Claude Haiku is markedly cheaper than a 2x H100 setup. The sovereignty question must be weighed here.

Latency-critical production applications. Top frontier cloud models (Claude, GPT, Gemini) have optimised inference pipelines with TTFT < 200ms. Apertus on 2x H100 typically reaches 300-600ms TTFT, depending on setup. For real-time chat UX the difference can be noticeable.
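Time to first token (TTFT) is easy to measure on your own setup before relying on published figures. The sketch below times any iterable that yields text chunks; `measure_ttft` and `fake_stream` are illustrative names, and the fake stream only stands in for a real streaming client, which is an assumption here.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first chunk arrived, full concatenated text)."""
    start = time.perf_counter()
    ttft = None
    parts = []
    for chunk in stream:
        if ttft is None:
            # First chunk received: record the elapsed time once.
            ttft = time.perf_counter() - start
        parts.append(chunk)
    if ttft is None:
        raise ValueError("stream produced no chunks")
    return ttft, "".join(parts)


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Stand-in for a streaming model response (hypothetical)."""
    time.sleep(delay_s)  # simulated wait before the first token
    yield "Hello"
    yield " world"


ttft, text = measure_ttft(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms, text: {text!r}")
```

For a fair comparison, the stream should be created lazily so that the timer starts before the request is sent, and the measurement should be repeated over many requests rather than read from a single run.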

Multi-provider as best practice. The choice is not "either Apertus or a cloud model": Apertus is one part of a multi-provider strategy. A routing layer (LiteLLM, OpenRouter, Portkey) dispatches by request type: highly sensitive client data to Apertus, FR-specific requests to Mistral, top reasoning to Claude, generic requests to cheaper models.
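The dispatch rules described above can be sketched as a small rule function. This is plain Python for illustration, not the configuration format of LiteLLM, OpenRouter or Portkey; the `Request` fields, the model labels and the rule order (sensitivity checked first, so sensitive data never leaves the sovereign path) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Request:
    text: str
    sensitive: bool = False       # client data, professional secrecy, FINMA exposure
    language: str = "de"          # ISO 639-1 code of the request language
    hard_reasoning: bool = False  # flagged as needing top frontier reasoning


def route(req: Request) -> str:
    """Return a model label for the request; the first matching rule wins."""
    if req.sensitive:
        return "apertus"        # sovereign path takes priority over all other rules
    if req.language == "fr":
        return "mistral"
    if req.hard_reasoning:
        return "claude"
    return "generic-cheap"      # placeholder label for a cheaper generic model


print(route(Request("Mandate letter with client data", sensitive=True, language="fr")))
print(route(Request("Résumé d'un article public", language="fr")))
```

Putting the sensitivity check first is the important design choice: a request that is both sensitive and in French still goes to Apertus.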

Trade-offs

STRENGTHS

  • Full Swiss sovereignty via on-premises or Swisscom API with Swiss data residency
  • Best RM and CH-DE dialect capability of all frontier models in May 2026
  • Apache 2.0 allows fine-tuning, modification, commercial use
  • Transparent training data eases FINMA SN 08/2024 and EU AI Act audits

WEAKNESSES

  • Reasoning quality on MMLU/hard math trails the current top Claude model and the current top GPT model
  • On-premises hosting requires 2x H100 80GB – hardware investment CHF 80-150k
  • Apertus Voice for CH-DE audio not yet in production (expected Q4 2026 - Q1 2027)
  • Latency on 2x H100 typically 300-600ms TTFT – often weaker than cloud models for real-time chat

FAQ

What does Apertus cost via the Swisscom API?

As of May 2026, prices are not published on a standard pricing page. Pilot clients report a corridor of CHF 0.40-1.50 per 1M tokens for Apertus-70B, depending on volume commitment and SLA. Apertus-8B sits well below that (typically CHF 0.05-0.20 per 1M tokens). For comparison, May 2026: the current top GPT model via Azure Switzerland North costs roughly CHF 4-12 per 1M tokens, the current top Claude model roughly CHF 5-25, Mistral Large 2 roughly CHF 2-8. Apertus is thus the cheapest frontier path with full Swiss sovereignty.
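To turn these corridors into a monthly budget, multiply the monthly token volume (in millions) by the low and high end of each corridor. A minimal sketch using only the reported figures above, which are pilot-client reports and rough estimates rather than official price lists; the volume of 10M tokens is an arbitrary example:

```python
# Reported price corridors in CHF per 1M tokens (low, high), as quoted in the text.
PRICE_CHF_PER_1M = {
    "Apertus-70B (Swisscom API)": (0.40, 1.50),
    "Apertus-8B (Swisscom API)": (0.05, 0.20),
    "Top GPT model (Azure Switzerland North)": (4.0, 12.0),
    "Top Claude model": (5.0, 25.0),
    "Mistral Large 2": (2.0, 8.0),
}


def monthly_cost_range(tokens_millions: float, corridor):
    """Return (low, high) monthly cost in CHF for a given token volume."""
    low, high = corridor
    return tokens_millions * low, tokens_millions * high


volume = 10  # example volume: 10M tokens per month
for name, corridor in PRICE_CHF_PER_1M.items():
    low, high = monthly_cost_range(volume, corridor)
    print(f"{name}: CHF {low:,.2f} to {high:,.2f} per month")
```

The calculation treats all tokens alike; providers that price input and output tokens differently would need two corridors per model.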

Can I fine-tune Apertus?

Yes. The Apache 2.0 licence permits modification and redistribution. Fine-tuning typically uses LoRA or QLoRA on 1-4x H100 80GB. Training data must be prepared in a privacy-compliant way (FADP Art. 6 purpose limitation: client data without clear consent is problematic). Practice in May 2026: boutiques such as LatticeFlow, Inspire AI Schweiz and ETH spin-offs offer fine-tuning services for Swiss SMEs with a clear data-protection pipeline.
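LoRA fits on a few GPUs because it trains far fewer parameters than full fine-tuning: for a frozen weight matrix of shape d_out × d_in, it adds two small matrices of rank r, giving r × (d_in + d_out) trainable parameters. A minimal sketch of that arithmetic; the 8192 × 8192 layer size and rank 16 are hypothetical illustration values, not Apertus's actual dimensions.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns B (d_out x rank) and A (rank x d_in) while the
    original matrix stays frozen.
    """
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters of the original weight matrix (what full fine-tuning would train)."""
    return d_in * d_out


# Hypothetical layer: 8192 x 8192 projection, LoRA rank 16.
d, r = 8192, 16
extra = lora_params(d, d, r)
full = full_params(d, d)
print(f"LoRA: {extra:,} trainable vs {full:,} frozen ({extra / full:.2%})")
```

QLoRA applies the same adapter idea on top of a quantised base model, which reduces the memory for the frozen weights further.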

When does Apertus 2 arrive?

No official release date has been communicated as of May 2026. ETH/EPFL published a roadmap in February 2026 with the components Apertus Voice (CH-DE dialect capability), Apertus Code (programming), Apertus Vision (multimodal) and possibly Apertus MoE (200B/active-30B). Reasonable speculation: Voice and Code first (Q4 2026 - Q1 2027), Vision and MoE later (Q2-Q4 2027). Until then, Apertus-8B and 70B are the production versions.

Is Apertus GDPR-compliant?

Apertus was built with GDPR in mind: training data selection and model behaviour take EU law into account. That does not make every Apertus deployment automatically compliant; lawful processing of personal data depends on the user's setup (legal basis, DPA with the hosting provider, deletion duties, access duties). One advantage: the training data are transparently documented, which makes access requests under FADP Art. 25 / GDPR Art. 15 easier to handle than with closed-weight models.

Related topics

  • AI and Swiss multilingualism: LLMs for German, French, Italian and Romansh (CH MULTILINGUALISM · COMPLIANCE)
  • Sovereign Swiss cloud hosting: Infomaniak, Exoscale, Swisscom, Safe Swiss Cloud, Hostpoint, Cloudsigma compared (SWISS CLOUD · COMPLIANCE)
  • Open-weight models compared: Llama 3.3/4, Mistral, DeepSeek, Qwen, Gemma, Phi-4, Command R, Falcon, GLM, Apertus (OPEN-WEIGHT MODELS · COMPARISON)
  • Mistral AI from a Swiss fiduciary perspective: EU residency, pricing, sovereignty (MISTRAL · LLM PROVIDER)
  • Multi-LLM routing: which model when, for how much (ROUTING · AI CONCEPT)

Sources

  1. Apertus model card, ETH Zurich/EPFL/CSCS (HuggingFace) · 2025-09
  2. ETH AI Center – Apertus press release · 2025-09
  3. CSCS – Centro Svizzero di Calcolo Scientifico, Alps supercomputer · 2026-05
  4. Swisscom – Apertus API for Business · 2026-05
  5. Public AI Network – Apertus Public Compute · 2026-05
  6. EPFL AI Center – Apertus Research Page · 2026-05

FITS YOUR STACK?

This is not legal advice. We evaluate Apertus for your client workloads, deploy on-premises or Swisscom API hosting and build multi-provider routing across Apertus, Mistral and Claude. Initial call free of charge.