fairlane.systems

DISASTER RECOVERY · SECURITY

Disaster recovery, RTO and RPO: what an SME fiduciary really must keep ready

How long may the outage last, how much data may be lost? Four DR strategies with May-2026 pricing and tooling.

Researched & fact-checked by: · As of: 2026-05

What are RTO and RPO?

RTO and RPO are the two metrics with which every disaster-recovery plan starts. The Recovery Time Objective describes the maximum accepted outage duration from incident to functional recovery. The Recovery Point Objective describes the maximum accepted data loss, measured as the time gap between the last usable backup and the moment of incident.

An example makes it tangible. RTO 4 hours, RPO 1 hour means: if the fiduciary system fails at 09:15, it must be running again no later than 13:15, and the restored data may at most reflect the state of 08:15 – anything booked between 08:15 and 09:15 may be lost. These values are set not by gut feel but by business impact: how much revenue or payroll cost is lost per hour of outage? How much manual rework does one hour of lost movement data cost?
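The arithmetic of the example can be written down in a few lines. This is a minimal sketch; the function name `recovery_window` and the calendar date are placeholders chosen for illustration, only the times and the RTO and RPO values come from the example.

```python
from datetime import datetime, timedelta


def recovery_window(incident: datetime, rto: timedelta, rpo: timedelta):
    """Return (latest acceptable restart, oldest acceptable data state)."""
    restart_deadline = incident + rto   # system must be running again by then
    oldest_data_state = incident - rpo  # restored data may not be older than this
    return restart_deadline, oldest_data_state


# Values from the example above; the calendar date is a placeholder.
incident = datetime(2026, 5, 4, 9, 15)
deadline, data_state = recovery_window(
    incident, rto=timedelta(hours=4), rpo=timedelta(hours=1)
)
print(deadline.strftime("%H:%M"), data_state.strftime("%H:%M"))  # 13:15 08:15
```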

The values must be written down in the IT emergency plan. Without documented RTO and RPO a DR strategy cannot be verified, and if a claim arises management is left to explain itself, because without metrics it cannot show that it fulfilled its duty of care.

Why it matters

Two incidents from 2024 show why DR planning is not abstract. The SwissSign CA outage in 2024 triggered a chain of follow-on disruptions because many Swiss SMEs sourced certificates via SwissSign and had no failover CA: business mail, login portals and VPN access went down at the same time. The CrowdStrike Falcon incident in July 2024 sent thousands of Windows servers worldwide into boot loops simultaneously. Neither case was a cyber attack; both were configuration or update errors at large vendors, and both showed that the question is not "if" but "when", and how fast service is restored.

For a fiduciary office the deadline angle comes on top. Dunning deadlines, the VAT quarter and the AHV payroll filing are fixed dates. Anyone waking up on 30 April after the VAT quarter with a corrupt system faces a sanction risk with the federal tax authority that has nothing to do with IT, only with a missed deadline. An RPO of 1 hour alone is not enough here; the plan must secure the deadline itself.

EU AI Act angle: anyone running a high-risk AI system must maintain a documented risk management system under Art. 9, which Art. 17 ties into the provider's quality management system. Disaster recovery is a mandatory component. Anyone lacking it cannot put the system into production.

Four DR strategies with cost and recovery time

There are four tiered strategies, sorted by speed and cost.

Backup restore (4 to 24 hours): the simplest variant. After an outage a new server is provisioned (Hetzner Cloud in about 5 minutes, Hetzner Dedicated in 4 to 24 hours), the last backup is restored and operations resume. Cost: 1x the production server, because no standby runs. Realistic RTO 6 to 12 hours including provisioning, RPO 1 to 24 hours depending on backup frequency. Fits small SMEs for whom one day of outage costs less than a few thousand francs.

Pilot light (1 to 4 hours): a powered-off or minimal standby server stands ready with installed software and current configuration. On incident it is started, the last backup is restored and DNS is switched. Cost: 1.5x production cost, because the standby holds at least storage and an IP address. RTO 1 to 4 hours, RPO depends on backup frequency.

Warm standby (15 minutes to 1 hour): a full second server runs, with asynchronous replication of the database and file volumes. DNS failover via Cloudflare or Route 53 makes the switch in minutes. Cost: 2.5x production cost. RTO 15 to 60 minutes, RPO 1 to 15 minutes. Fits fiduciary and law firms that cannot accept a day of outage.

Hot standby (seconds to minutes): synchronous multi-master or active-active setup with a load balancer in front. On node failure the second node takes over without data loss. Cost: 3x to 5x production cost plus ongoing operational complexity. RTO seconds, RPO zero. Worth it only where every minute of outage causes four-digit damage, which typically means banks, high-volume web shops and critical infrastructure rather than the 5-person SME.

Important: the cost factors are not just server rentals. Hot standby needs personnel experienced in synchronous replication, split-brain detection and quorum setup, which typically costs CHF 30,000 to CHF 80,000 per year on top. An SME moving from backup restore to warm standby gains more than one moving from warm standby to hot standby.
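The four tiers can be captured as a small lookup table, which makes the choice per system mechanical. The sketch below is illustrative only: the class and function names are made up, cost factors use the lower bound of each quoted range, RTO and RPO use the upper bound, and two values are explicit assumptions (daily backups for pilot light, a 5-minute cap for hot standby's "seconds to minutes").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    cost_factor: float  # multiple of production server cost (lower bound of range)
    rto_minutes: int    # upper end of the RTO range quoted above
    rpo_minutes: int    # upper end of the RPO range quoted above


STRATEGIES = [
    Strategy("backup restore", 1.0, 12 * 60, 24 * 60),
    Strategy("pilot light", 1.5, 4 * 60, 24 * 60),  # assumption: daily backups
    Strategy("warm standby", 2.5, 60, 15),
    Strategy("hot standby", 3.0, 5, 0),             # assumption: capped at 5 minutes
]


def cheapest_strategy(rto_minutes: int, rpo_minutes: int) -> Optional[Strategy]:
    """Cheapest tier whose worst-case RTO and RPO stay within the targets."""
    fits = [
        s for s in STRATEGIES
        if s.rto_minutes <= rto_minutes and s.rpo_minutes <= rpo_minutes
    ]
    return min(fits, key=lambda s: s.cost_factor) if fits else None


print(cheapest_strategy(60, 15).name)            # warm standby
print(cheapest_strategy(12 * 60, 24 * 60).name)  # backup restore
```

Server cost alone drives the ranking here; the personnel cost of hot standby mentioned above would have to be added separately before comparing tiers in francs.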

Tools, May 2026: Hetzner Snapshots are usable for pilot light. Veeam and Acronis are the standard packages for Windows-heavy SMEs that want GUI operation. rsync or Borg replication to a second physical server in another data centre is the lean Linux variant. For databases: PostgreSQL streaming replication or MariaDB Galera Cluster, with ProxySQL as a failover routing layer in front of MySQL.
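Whichever tool is used, the RPO only holds if the newest restorable state is fresh enough. A minimal monitoring sketch, assuming the timestamp of the last good backup or replica state has already been obtained from the tool in question (how to obtain it is tool-specific and left out); the function name `rpo_breached` and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rpo_breached(last_good_state: datetime, rpo: timedelta,
                 now: Optional[datetime] = None) -> bool:
    """True if the newest restorable state is older than the RPO allows."""
    now = now or datetime.now(timezone.utc)
    return now - last_good_state > rpo


# Hypothetical timestamps: last good state at 08:40 UTC, check at 09:00 UTC.
last_state = datetime(2026, 5, 4, 8, 40, tzinfo=timezone.utc)
check_time = datetime(2026, 5, 4, 9, 0, tzinfo=timezone.utc)
print(rpo_breached(last_state, timedelta(minutes=15), now=check_time))  # True
print(rpo_breached(last_state, timedelta(hours=1), now=check_time))     # False
```

Run from a scheduler or monitoring system, a check like this turns the RPO from a number on paper into an alert.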

DR planning in 5 steps

  1. Business impact analysis: which systems are critical, which deadlines apply, what does an hour of outage cost? Quantify a CHF-per-hour value per system.
  2. Set RTO and RPO: per system write two numbers into the IT emergency document. Values must be compatible with the hosting setup.
  3. Choose DR strategy: backup-restore, pilot light, warm standby or hot standby by cost-benefit. One choice per system, not one for all.
  4. Implementation: install tooling (Hetzner Snapshots, Veeam, PostgreSQL streaming replication, Borg), configure DNS failover, write runbook.
  5. Annual exercise: once a year a full restore into an isolated environment, measure RTO and RPO, fold deviations back into the concept.

Which strategy for which business

For the typical 3 to 10-person fiduciary or law-firm SME, warm standby with RTO 1 hour and RPO 15 minutes is the economically best choice. The extra cost over pure backup-restore is about CHF 100 to CHF 250 per month (second Hetzner server in another data centre plus replication software). The business value is high: on VAT deadlines and payroll days a one-hour outage can cost several thousand francs of rework.
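The cost-benefit claim can be checked with simple arithmetic. The sketch below assumes an outage cost of CHF 500 per hour and a drop in recovery time from 12 hours (backup restore) to 1 hour (warm standby); the hourly figure and the function names are illustrative assumptions, only the monthly extra cost range comes from the text.

```python
def annual_extra_cost(monthly_chf: float) -> float:
    """Yearly surcharge of warm standby over backup restore."""
    return monthly_chf * 12


def avoided_loss_per_incident(chf_per_hour: float,
                              rto_before_h: float, rto_after_h: float) -> float:
    """Outage cost saved in one incident by the shorter recovery time."""
    return chf_per_hour * (rto_before_h - rto_after_h)


def break_even_incidents_per_year(monthly_chf: float, chf_per_hour: float,
                                  rto_before_h: float = 12.0,
                                  rto_after_h: float = 1.0) -> float:
    """Incidents per year at which the surcharge pays for itself."""
    saved = avoided_loss_per_incident(chf_per_hour, rto_before_h, rto_after_h)
    return annual_extra_cost(monthly_chf) / saved


CHF_PER_HOUR = 500.0  # assumption, to be replaced by the business impact analysis
for monthly in (100.0, 250.0):
    rate = break_even_incidents_per_year(monthly, CHF_PER_HOUR)
    print(f"CHF {monthly:.0f}/month pays off from {rate:.2f} incidents per year")
```

Under these assumptions the surcharge is recovered if a serious outage happens roughly once every two to five years; a higher hourly cost on deadline days shifts the result further in favour of warm standby.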

For pure marketing websites without booking logic, for learning platforms without deadlines, for internal wiki systems, backup-restore with RTO 12 hours, RPO 24 hours is enough. Here simplicity wins and one day of outage is not business-critical.

For banks, high-frequency e-commerce shops and critical infrastructure, hot standby is mandatory. But that is not the typical customer of an SME service provider and not a setup that runs on the side – it is its own professional field with its own job description.

The most important recommendation: one full-restore exercise per year. Not "it is in the plan" but "we did it on 15 May 2026, here is the log, the restore took 3 hours 12 minutes and stayed within the RTO". Experience shows that such an exercise uncovers five to ten unknown stumbling blocks that would cost hours in a real incident.

Where effort and benefit do not match

Hot standby for a 3-person fiduciary office is over-investment. Annual extra costs of CHF 30,000 to CHF 80,000 are out of proportion to the avoided outage damage. Anyone selling that sells insurance the client does not need, and that is not good business because the client eventually notices the overpayment.

Also problematic: DR plans that are never exercised. A 40-page IT emergency document untouched for three years is worthless in a real incident. People have changed, tools have changed, passwords are no longer current. Better a 5-page document that everyone has understood and that is rehearsed once a year.

Third pitfall: "DR equals backup". Backup is the data component, DR is the recovery component. A backup without a DR plan is a pile of bits nobody can restore. A DR plan without backup is a play without a script. Both are needed.

Fourth pitfall: values that are too tight. Promising an RTO of 5 minutes without hot standby writes a known falsehood into the plan. Anyone writing that has either not calculated the plan or, worse, knowingly accepts an outage in which the stated values cannot be met.
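The fourth pitfall lends itself to an automated plausibility check when the emergency document is reviewed. A minimal sketch: the table uses the fastest recovery time each tier is credited with in the strategy overview, "seconds" for hot standby is treated as zero, and the function name `implausible_rto` is a made-up helper.

```python
from typing import Optional

# Fastest RTO each tier is credited with in the strategy overview, in minutes.
BEST_CASE_RTO_MIN = {
    "backup restore": 4 * 60,
    "pilot light": 60,
    "warm standby": 15,
    "hot standby": 0,  # "seconds", treated as zero for this check
}


def implausible_rto(strategy: str, rto_minutes: float) -> Optional[str]:
    """Return a warning if the stated RTO is tighter than the tier can deliver."""
    floor = BEST_CASE_RTO_MIN[strategy]
    if rto_minutes < floor:
        return (f"RTO {rto_minutes} min is below the {floor} min "
                f"best case for {strategy}")
    return None


print(implausible_rto("warm standby", 5))
print(implausible_rto("hot standby", 5))  # None
```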

Trade-offs

STRENGTHS

  • Clear expectations toward clients and insurers
  • Annual exercise uncovers hidden stumbling blocks before the real incident
  • Warm standby at CHF 200 per month is worth it against one day of fiduciary outage
  • EU AI Act Art. 9 requires a documented risk management system for high-risk AI, and DR is part of it
  • Duty-of-care record for management if a claim arises

WEAKNESSES

  • DR concept documents go stale fast – ongoing maintenance needed
  • Annual exercise costs 1 to 3 days of personnel effort
  • Hot-standby effort is often underestimated: split-brain, quorum, synchronous replication
  • Warm standby roughly doubles server rental and adds software licence costs

FAQ

What does a DR exercise really bring?

Figures from experience: in a first exercise, 40 to 60 percent of restore steps fail on the first attempt. Typical causes are missing passwords, wrong IPs in the runbook, uninstalled dependencies and broken backups. Only after three to four annual exercises does the success rate exceed 90 percent. Anyone who never exercises should expect a success rate of about 30 percent in a real incident.

How does DR differ from BCM?

Business Continuity Management (BCM) is the umbrella discipline: how does the operation stay running during power outage, pandemic, or key-staff absence? Disaster recovery is the IT layer inside BCM. For an SME a lean BCM frame plus a detailed DR concept for IT is usually enough.

What was the lesson from the SwissSign outage in 2024?

Single-vendor risk is real. Anyone holding all certificates, DNS, SMTP, CDN with one vendor falls with that vendor. For critical components a second vendor is worth it, even at double the effort. Specifically: SwissSign plus Let's Encrypt in parallel, Cloudflare plus Quad9 DNS, Brevo plus a second SMTP provider.

How do I document RTO and RPO audit-ready?

Two documents: an IT emergency plan with the strategy and the values, plus a log for each exercise with date, participants, measured times and deviations. Both documents are kept for 10 years, in line with the retention duty for business records. Auditors and insurers accept this documentation as a duty-of-care record.
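The per-exercise record can be kept as structured data so that it is easy to archive and compare from year to year. The sketch below is one possible shape, not a prescribed format: the field names, the system name, the participants and the RPO figures are hypothetical, while the date and the measured 3 hours 12 minutes echo the example given earlier in the article.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ExerciseRecord:
    date: str                # ISO date of the exercise
    system: str
    participants: list
    rto_target_min: int
    rto_measured_min: int
    rpo_target_min: int
    rpo_measured_min: int
    deviations: list = field(default_factory=list)

    def targets_met(self) -> bool:
        """True if both measured values stayed within their targets."""
        return (self.rto_measured_min <= self.rto_target_min
                and self.rpo_measured_min <= self.rpo_target_min)


record = ExerciseRecord(
    date="2026-05-15",
    system="accounting-db",                    # hypothetical system name
    participants=["A. Example", "B. Example"],  # placeholders
    rto_target_min=240,
    rto_measured_min=192,                      # 3 hours 12 minutes
    rpo_target_min=60,                         # hypothetical
    rpo_measured_min=45,                       # hypothetical
    deviations=["runbook listed an outdated IP address"],
)
print(record.targets_met())  # True
print(json.dumps(asdict(record), indent=2))
```

Stored as JSON next to the emergency plan, one such file per exercise gives auditors the date, participants, measured times and deviations in a form that can also be checked by script.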
