fairlane.systems

ENERGY & CO2 · TREND 2026

AI energy and CO2 trend 2026: what a query actually consumes and where Switzerland stands

May 2026: 0.3 Wh per GPT-4 query, 500t CO2 for GPT-4 training, data centres up 35% YoY. Swiss advantage: hydropower and waste-heat use.

As of: 2026-05

What does AI energy use mean in May 2026?

The discussion of AI energy use separates two phases that draw power differently.

Training: a one-time process over weeks or months. GPT-4 training (2022-2023) is estimated at about 50 GWh and roughly 500 tonnes of CO2-equivalent (source: Hugging Face Sustainable AI Lab, standard US grid mix assumption). The two figures should be read as separate estimates: 50 GWh at a US grid factor of 0.4 kg CO2/kWh would correspond to about 20000 tonnes. Llama 4 Maverick (April 2025, over 30 trillion training tokens) sits at about 1500 tonnes CO2-equivalent per Meta's own disclosure, offset via Meta's own renewable PPAs. No official figures are published for Gemini 2.5 or the current top Claude model; estimates land at 200-800 tonnes CO2.

Inference: ongoing per-request consumption. Studies in May 2026 (Sustainable AI, EPFL May 2025; Cottier et al., arXiv April 2026) converge on:

  • GPT-4 class per standard query: 0.2-0.4 Wh (median 0.3 Wh)
  • GPT-4 class per reasoning query (o3, Extended Thinking): 1.5-5 Wh
  • GPT-4o-mini class: 0.05-0.15 Wh per query
  • Local 8B model on Apple M-chip: 0.005-0.02 Wh per query

Comparison with Google search: 0.0003 kWh, i.e. 0.3 Wh, per search query (Google self-disclosure 2009, not updated by Google as of 2024). A median GPT-4-class query thus draws about as much as a classical search did by that figure, while a reasoning query at 1.5-5 Wh draws roughly 5 to 17 times as much. What matters in practice is not the per-unit numbers but the aggregate, and the data-centre boom of 2024-2026 has visibly shifted that.

Why it matters in 2026

Three developments raise the pressure.

First, the data-centre boom: per the IEA "Electricity 2026" report (January 2026), global data-centre electricity use is projected to grow from about 460 TWh (2022) to over 1000 TWh by 2026, with AI as the primary driver. Consumption in the US and Asia-Pacific is up 35% YoY in 2025-2026. In Ireland data centres already accounted for 21% of national electricity in 2024; in Virginia (US) the share exceeds 25% on peak days. McKinsey (March 2026) estimates the global investment need for AI data centres at USD 5.2 trillion by 2030.

Second, water and cooling cost: data centres use water for evaporative cooling. Microsoft's 2024 Sustainability Report shows a 24% water-use increase driven by AI workloads. In water-stressed regions (Spain, Arizona, Middle East) this turns into conflict. Cooling and other overhead account for roughly a third to just under half of total electricity use in data centres with an average PUE of 1.5-1.8; efficient cooling (PUE below 1.2) cuts total draw by roughly 20-33%.

Third, regulatory reporting: the EU CSRD (Corporate Sustainability Reporting Directive, from 2025 for large firms, from 2027 for SMEs) requires Scope 3 CO2 reporting, and AI cloud use falls under it. Swiss firms with EU subsidiaries report too. As of May 2026 Anthropic, Microsoft Azure and Google Cloud provide per-user carbon reports, so firms required to report can pull the data from the respective dashboard.

How it works

AI energy use can be measured on three levels.

Per query (inference): tokens × model size × hardware efficiency. A GPT-4o query with 500 output tokens on an Nvidia H100 (700 W at full load) physically draws about 0.15-0.30 Wh; the exact number depends on batching, quantisation and cooling. In April 2026 Anthropic published a pilot study with EPFL in Lausanne: Claude Sonnet averages 0.25 Wh per query without Extended Thinking and 1.8 Wh with it.
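The per-query estimate can be sketched in a few lines of Python. This is a minimal illustration, not a measurement method: it folds model size and hardware efficiency into one assumed number, the GPU's aggregate decoding throughput across all batched requests. The function name and the throughput values (325 and 650 tokens per second) are hypothetical and were chosen only because they reproduce the 0.15-0.30 Wh range quoted above.

```python
def query_energy_wh(output_tokens, tokens_per_second, gpu_power_w=700.0, pue=1.0):
    """Rough energy of one query in Wh.

    tokens_per_second: assumed aggregate GPU throughput over all batched
    requests (hypothetical value, not taken from any vendor disclosure).
    gpu_power_w: GPU draw at full load (700 W for an H100, as in the text).
    pue: data-centre overhead factor; 1.0 means GPU energy only.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    gpu_seconds = output_tokens / tokens_per_second  # GPU time attributable to this query
    joules = gpu_seconds * gpu_power_w * pue
    return joules / 3600.0  # 1 Wh = 3600 J


# Two assumed throughput levels: good batching vs. poor batching.
low = query_energy_wh(500, tokens_per_second=650)
high = query_energy_wh(500, tokens_per_second=325)
print(f"{low:.2f} Wh to {high:.2f} Wh per 500-token query")
```

Halving the throughput doubles the energy per query, which is why batching and quantisation matter as much as the model itself.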

Per training run: GPU hours × power × cooling overhead. Per Meta, Llama 4 was trained on 32000 H100 GPUs over about 25 days = roughly 19 million GPU-hours × 0.7 kW × 1.2 PUE ≈ 16 GWh. At the US grid factor of 0.4 kg CO2/kWh: about 6400 tonnes CO2eq. Meta's self-reported 1500 tonnes assumes a 75-80% renewable share.
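The training arithmetic above can be reproduced as a short Python sketch. All inputs are the figures quoted in the text (32000 GPUs, 25 days, 0.7 kW, PUE 1.2, 0.4 kg CO2/kWh); the function name is hypothetical and the result is an estimate, not Meta's own accounting.

```python
def training_footprint(gpus, days, gpu_power_kw, pue, grid_kg_per_kwh):
    """Return (GPU-hours, energy in kWh, CO2 in tonnes) for one training run."""
    gpu_hours = gpus * days * 24
    energy_kwh = gpu_hours * gpu_power_kw * pue          # IT load plus cooling overhead
    co2_tonnes = energy_kwh * grid_kg_per_kwh / 1000.0   # kg -> tonnes
    return gpu_hours, energy_kwh, co2_tonnes


gpu_hours, energy_kwh, co2_t = training_footprint(32000, 25, 0.7, 1.2, 0.4)

# Share of carbon-free power implied by the self-reported 1500 tonnes.
implied_renewable_share = 1 - 1500 / co2_t

print(f"{gpu_hours / 1e6:.1f} million GPU-hours")
print(f"{energy_kwh / 1e6:.1f} GWh")
print(f"{co2_t:.0f} t CO2eq on US grid mix")
print(f"{implied_renewable_share:.0%} implied renewable share")
```

The implied share of about 77% sits inside the 75-80% range stated above.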

Per data centre: PUE (Power Usage Effectiveness) = total power / IT power. The best data centres reach PUE 1.1 (10% overhead, mostly cooling); the average is 1.5-1.8. It follows that the footprint is determined not by the model alone but also by where it runs. With the grid factors used in this article, a model in a PUE-1.1 data centre on the Swiss mix (0.04 kg CO2/kWh) produces roughly 16x less CO2 than the same model in a PUE-1.8 data centre on the US average mix (0.4 kg CO2/kWh); against pure coal power the gap is wider still.

Swiss advantages:

  • Swiss grid mix: about 60% hydropower, 30% nuclear, 10% PV/wind. CO2 factor: 0.04 kg/kWh, 10x below the US average (0.4 kg/kWh) and 5x below DE (0.2 kg/kWh after the Energiewende).
  • Climate: cooler climate cuts cooling needs. Swiss data centres run free-cooling 8-10 months per year.
  • Heat reuse: Infomaniak (CH) has been heating 6000 flats since 2022 via waste heat from its Geneva data centre. Green Datacenter (Lupfig) feeds the Zurich district heating network.
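How PUE and grid mix combine can be shown with a small Python sketch. It uses only the factors quoted in this article (PUE 1.1 and 1.8, 0.04 and 0.4 kg CO2/kWh); the function name is hypothetical and the comparison is illustrative, since real data centres rarely match a national average mix exactly.

```python
def co2_per_it_kwh(pue, grid_kg_per_kwh):
    """kg CO2 emitted per kWh of IT load, including data-centre overhead."""
    return pue * grid_kg_per_kwh


swiss_efficient = co2_per_it_kwh(pue=1.1, grid_kg_per_kwh=0.04)  # efficient site, Swiss mix
us_average = co2_per_it_kwh(pue=1.8, grid_kg_per_kwh=0.4)        # average site, US mix

ratio = us_average / swiss_efficient
print(f"Swiss site: {swiss_efficient:.3f} kg CO2 per IT kWh")
print(f"US site:    {us_average:.3f} kg CO2 per IT kWh")
print(f"Ratio:      {ratio:.1f}x")
```

The grid factor contributes a factor of 10 and the PUE difference a further factor of about 1.6, so location choice outweighs cooling efficiency in this comparison.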

How to track and adopt this trend in 5 steps

  1. Market watch: review the sustainability reports from OpenAI, Anthropic, Google Cloud and Microsoft Azure annually. Track IEA reports on data-centre growth and check Swiss statistics (BFE, Asut) once a year.
  2. AI usage inventory: for each AI tool, record the estimated monthly request count and the model class. Use median values (standard 0.3 Wh, reasoning 2 Wh, edge 0.01 Wh) to project the CO2 footprint.
  3. Pilot optimisation: review the model class per use case and identify where GPT-4o-mini or Claude Haiku is enough instead of the big models. Check whether the batch API fits asynchronous workloads.
  4. Location strategy: for self-hosting or reseller APIs explicitly pick an EU or CH region (Anthropic via Vertex Zurich, Hetzner Falkenstein DE, Infomaniak Geneva). Document the choice.
  5. Reporting: add a CO2 line for AI use to the annual report or to each client report, stating the source of the median values and the chosen grid mix factor. This is professional practice even without a CSRD obligation.
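The usage inventory from step 2 can be turned into a yearly projection with a short Python sketch. The median values per model class and the grid factors are the ones quoted in this article; the tool names and request counts in the example inventory are hypothetical placeholders.

```python
# Median energy per query in Wh, by model class (values from step 2).
MEDIAN_WH = {"standard": 0.3, "reasoning": 2.0, "edge": 0.01}


def yearly_footprint(inventory, grid_kg_per_kwh):
    """Project yearly energy (kWh) and CO2 (kg) from monthly request counts.

    inventory: list of (tool name, model class, requests per month).
    """
    monthly_wh = sum(requests * MEDIAN_WH[model_class]
                     for _, model_class, requests in inventory)
    yearly_kwh = monthly_wh * 12 / 1000.0
    return yearly_kwh, yearly_kwh * grid_kg_per_kwh


# Hypothetical inventory of a small firm.
inventory = [
    ("chat assistant", "standard", 10000),
    ("contract analysis", "reasoning", 500),
    ("on-device summaries", "edge", 2000),
]

kwh, kg_ch = yearly_footprint(inventory, grid_kg_per_kwh=0.04)  # Swiss mix
_, kg_us = yearly_footprint(inventory, grid_kg_per_kwh=0.4)     # US mix
print(f"{kwh:.2f} kWh per year, {kg_ch:.2f} kg CO2 (CH) vs {kg_us:.2f} kg CO2 (US)")
```

For the reporting step, the same output can be quoted together with the source of the median values and the grid factor that was used.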

When CO2 optimisation pays off

CO2 optimisation pays off in four configurations.

First, at high request volume: from about 10000 queries per month the difference between carbon-light and carbon-heavy inference becomes noticeable. Example: 10000 GPT-4o queries per month × 0.3 Wh = 3 kWh of electricity per month. On US mix: 1.2 kg CO2. On Swiss mix: 0.12 kg CO2. Small, but over a year that is 14.4 kg vs 1.4 kg.

Second, in self-hosting: when you host Llama 4 or Mistral yourself, you control the grid mix directly. Hetzner Falkenstein DE runs on wind+PV PPA, Infomaniak Switzerland on hydropower. Cloud API providers are less transparent – OpenAI offers no location choice.

Third, under CSRD reporting: from 2027 for SMEs above EUR 50 million revenue or 250 staff in the EU. Swiss SMEs with EU subsidiaries report too. CO2 from AI use must then be documented under Scope 3.

Fourth, as a sales argument: customers in banking, insurance and pharma increasingly demand CO2 evidence from suppliers. Documented Swiss hydropower hosting wins in the pitch.

Concrete optimisations in May 2026:

  • Pick a smaller model when possible (GPT-4o-mini vs 4o, Claude Haiku vs Sonnet) – factor 3-5 energy saving on many tasks.
  • Use caching (Anthropic Prompt Caching, OpenAI Cached Input) – repeated prompts cost less power.
  • Use batch APIs (OpenAI Batch, Anthropic Batch): bundle work into 24-hour windows, hardware runs more efficiently (50% price discount, 30-50% less energy).
  • Choose location deliberately: Anthropic via Google Cloud europe-west6 (Zurich) instead of us-central1.
  • Use reasoning sparingly: o3 only where it truly adds value.
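The effect of routing part of the traffic to a smaller model can be estimated with a short Python sketch. It assumes 0.3 Wh for the large model and 0.1 Wh for the small one, the latter being the midpoint of the 0.05-0.15 Wh range quoted earlier; the function name and the 60% routing share are hypothetical.

```python
def blended_wh_per_query(share_small, wh_large=0.3, wh_small=0.1):
    """Average energy per query when a share of traffic goes to the small model."""
    if not 0.0 <= share_small <= 1.0:
        raise ValueError("share_small must be between 0 and 1")
    return share_small * wh_small + (1.0 - share_small) * wh_large


baseline = blended_wh_per_query(0.0)  # everything on the large model
routed = blended_wh_per_query(0.6)    # assumed: 60% of queries fit the small model
saving = 1 - routed / baseline
print(f"{routed:.2f} Wh per query, {saving:.0%} less than baseline")
```

The saving scales linearly with the routed share, so the use-case review in the pilot step determines most of the result.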

When CO2 optimisation is not a priority

CO2 optimisation should not come at the cost of other goals.

Low request volume: below 1000 queries per month the difference between grid mix options is at most about 1.3 kg CO2 per year. Time is better spent on quality and compliance.

Quality over energy: if the cheaper small model produces more errors that force a human to redo work, the CO2 benefit is quickly eaten up. A human workday on an office PC and heating consumes 5-10 kWh, equivalent to roughly 17000-33000 GPT-4o queries at 0.3 Wh each.
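The break-even between model savings and human rework can be sketched in Python. The sketch assumes an 8-hour workday at 7.5 kWh (the midpoint of the 5-10 kWh range above) and a small model at 0.1 Wh per query; both values and the function name are assumptions for illustration.

```python
def breakeven_queries_per_rework_hour(wh_large=0.3, wh_small=0.1,
                                      workday_kwh=7.5, workday_hours=8.0):
    """Number of downgraded queries needed to offset one hour of human rework."""
    saving_wh = wh_large - wh_small  # energy saved per query by using the small model
    if saving_wh <= 0:
        raise ValueError("the small model must use less energy than the large one")
    rework_wh_per_hour = workday_kwh * 1000.0 / workday_hours
    return rework_wh_per_hour / saving_wh


breakeven = breakeven_queries_per_rework_hour()
print(f"about {breakeven:.0f} queries per hour of rework")
```

Under these assumptions a single hour of rework cancels the energy saved on several thousand queries, which supports choosing the model on quality first.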

Greenwashing trap: some vendors market "100% renewable" via Renewable Energy Certificates (RECs) without actual temporal or local coverage. Princeton research (February 2026) shows that only 24/7 matched PPAs (Power Purchase Agreements) actually bring CO2 to zero. Microsoft and Google have 24/7 match targets for 2030. OpenAI publishes no such data.

False comparisons: "a GPT-4 query equals 30 smartphone charges" is a headline based on top-end estimates. Realistically a query equals 2-3% of one smartphone charge (median 0.3 Wh, smartphone battery 10-15 Wh). Stakeholder communication should use solid median numbers, not worst-case estimates.

Avoid marketing pitfalls: no "climate-neutral AI" claim without a clear source. Third-party carbon offsets remain contested in May 2026 – inference covered by a renewable PPA is the more honest baseline.

Trade-offs

STRENGTHS

  • Swiss grid mix (60% hydropower, CO2 factor 0.04 kg/kWh) provides a clear location advantage
  • Batch APIs and prompt caching lower both power and token use in parallel
  • Edge models (Apple Intelligence, Phi-4-mini) cut power need by factor 15-30
  • CSRD reporting from 2027 motivates documented optimisation

WEAKNESSES

  • Frontier models and reasoning mode draw 5-10x more energy
  • Vendors disclose carbon data with very different depth
  • Carbon offsets remain contested – high greenwashing risk
  • Cooling water becomes a conflict point in dry regions

FAQ

How much CO2 does my fiduciary AI use produce per year?

Rule of thumb: 1000 GPT-4 queries per month on Swiss grid mix = 0.15 kg CO2 per year. 10000/month = 1.5 kg per year. Even at 100000/month (very large firm): 15 kg CO2 per year. For comparison: a Zurich-Berlin flight is about 200 kg CO2 per person. AI inference is a negligible line for an SME compared to office heating, commuting and business travel.

What about training CO2?

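Training is a one-time investment by the model provider, not part of the user's footprint. The training share per query depends on the lifetime query volume: with about 16 GWh of training energy (the Llama 4 estimate above) and a median 0.3 Wh per query, the share falls below 5% of inference draw only at around one trillion queries. At hundreds of millions of queries training would still dominate, so the share is minimal only for heavily used models. Llama 4 users formally carry no training CO2, since Meta already paid that. Users of the current top Claude model contribute pro rata through their subscription, but the share is very small.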
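How the training share depends on query volume can be checked with a short Python sketch. It uses the 16 GWh training estimate and the 0.3 Wh median per query from this article; the function names and the example query volumes are hypothetical.

```python
def training_share(training_gwh, lifetime_queries, inference_wh=0.3):
    """Training energy per query, expressed as a fraction of inference energy."""
    training_wh_per_query = training_gwh * 1e9 / lifetime_queries  # 1 GWh = 1e9 Wh
    return training_wh_per_query / inference_wh


def queries_for_share(training_gwh, max_share, inference_wh=0.3):
    """Lifetime queries needed before the training share drops to max_share."""
    return training_gwh * 1e9 / (max_share * inference_wh)


needed = queries_for_share(16, max_share=0.05)
share_at_500m = training_share(16, lifetime_queries=500e6)
print(f"{needed:.2e} queries needed for a 5% training share")
print(f"at 500 million queries training is {share_at_500m:.0f}x the inference draw")
```

The result depends heavily on the assumed lifetime volume, which providers do not publish per model.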

Does an edge model lower the CO2 balance?

Yes, clearly. Apple Intelligence on an M-Mac uses about 0.01-0.02 Wh per query, a factor of 15-30 below GPT-4o at 0.3 Wh. But this is only meaningful if the local model is good enough. If reduced quality forces rework, the benefit disappears.

Should I buy carbon offsets for my AI use?

In May 2026 most sustainability advisors discourage it. Carbon offsets have come under heavy criticism in the past two years (Verra scandal 2023, Berliner Tagesspiegel investigation 2025). Recommendation: first reduce (smaller model, CH/EU region, batch API), then pick providers with real 24/7 matched PPAs (Microsoft, Google), and only offset as a last step with high-quality direct-air-capture certificates (Climeworks, Heirloom).


Sources

  1. IEA – Electricity 2026 report · 2026-01
  2. Cottier et al. – The energy footprint of generative AI inference (arXiv) · 2026-04
  3. Anthropic / EPFL – Pilot study on inference energy of the current top Claude model · 2026-04
  4. Infomaniak – Datacenter waste heat reuse for district heating · 2025-09
  5. Microsoft – Environmental Sustainability Report 2024 · 2024-08
