AI FAQ for SMEs: 35 frequent questions on practice, law and cost

Answers to the most frequent AI questions from Swiss SMEs: cost, data protection, tools, hallucinations, FINMA, EU AI Act. As of May 2026.

Researched & fact-checked by: DuneDive LLC · As of: 2026-05

What this FAQ covers

This FAQ collects the 35 questions that Swiss fiduciary, legal and SME clients asked us most often between early 2025 and May 2026. Answers are 2-4 sentences long, reflect the state of May 2026 and, where useful, link to deeper knowledge pages.

The topic mix covers six areas. Cost and economics (what does an AI solution cost for a 5-person office, how fast does a pilot pay back, what does an own server cost). Data protection and law (client data on ChatGPT, FADP compliance, DPIA obligation, FINMA, EU AI Act, liability). Tool choice (GPT-4o vs Claude, Apertus, multimodal, open-weight vs closed). Security and operations (ChatGPT outage, hallucinations, prompt injection, AI incident, shadow AI, audit trail). Use cases (Bexio integration, M365, tax return, dunning, mail triage). Implementation (self-hosting, fine-tuning, staff training).

Why these questions matter

Swiss SMEs are buying AI at scale for the first time in May 2026. Consulting vendors are growing fast, and many deliver only half-knowledge. Anyone who does not know what to ask buys the wrong thing or pays too much.

The 35 questions here are exactly those we receive in first conversations before a project starts. Answering them is the cheapest risk reduction. A wrongly answered FADP-compliance question can trigger a supervisory notification; one on hallucinations can destroy client trust.

The FAQ also saves time. Instead of buying an hour of consulting for each question, you find the answers here in one document. For deeper questions we point to the dedicated knowledge page.

How to use the FAQ

The FAQ is organised by topic blocks (cost, law, tools, security, use cases, implementation). Read the blocks relevant to you in their given order – answers build partly on each other (pilot cost → payback → self-hosting).

Three recommendations.

First: work through the FAQ jointly with your leadership team. Which questions have you already resolved, and which not? Which answer fits your context, and which needs further review? Plan about 90 minutes.

Second: check vendor proposals against the FAQ answers. If a vendor promises "FADP-compliant" but runs no DPIA, the DPIA answer here gives you the basis to ask for written clarification.

Third: keep your own internal FAQ version and update it every six months. The AI market moves fast; May 2026 is not November 2026.

Use the FAQ in 5 steps

01 Pick a topic block (cost, law, tools, security, use cases, implementation) and skim the related questions.
02 For each relevant question, read the 2-4 sentence answer and follow the link to the deep page if you need more.
03 Check vendor proposals against the FAQ answers – where a vendor deviates, request written clarification.
04 Walk through answers with the leadership team (90 minutes), collect open points for follow-up consulting.
05 Create your own internal FAQ version with company-specific answers; update it every six months.

When to use the FAQ

The FAQ is useful at four moments.

Before a first meeting with an AI vendor. Read the relevant answers, formulate follow-up questions. You save a consulting fee on standard topics.

Before a management decision. When the board decides on budget or tool choice, all participants need the same knowledge baseline. The FAQ is 60 minutes of reading.

Before staff communication. Anyone informing staff about new AI tools should know the FAQ answers. Staff ask the same questions.

After an incident. When something went wrong (hallucination in a client letter, data leak via ChatGPT), the FAQ provides first response steps and points to the incident playbook.

When the FAQ is not enough

The FAQ is a first orientation tool. Three limits.

For the concrete OpenAI or Anthropic contract, the FAQ answer on "data at OpenAI" is not enough. The DPA must be read and matched against your client contracts – a job for a data-protection officer or lawyer.

For a FINMA-licensed firm, the FINMA answer here is not enough. Banks and insurers need licence-compliant architecture documentation. The FAQ flags the obligation; the implementation is a project.

For a concrete price quote the "what does X cost" answer here is not enough. Answers give ranges (CHF 1500-6000, CHF 12000-25000). A concrete quote needs a 60-minute scoping session with your systems and volumes.

If your question is not among the 35 here, send it to us – we typically reply within one business day and add standard questions to the next FAQ version.

Trade-offs

STRENGTHS

Quick answers to the 35 most frequent entry-level questions
Current with May 2026 prices, models and regulation
Each answer links to a deeper knowledge page
Saves consulting fees on standard topics

WEAKNESSES

Answers are generic – no substitute for project-specific consulting
Price ranges are wide; a concrete quote needs scoping
May 2026 state; half-yearly refresh recommended
Industry specifics (FINMA, lawyers, medical) require deeper review

FAQ

What does AI cost for a 5-person fiduciary office?

Realistically, entry sits at CHF 1500-3500 per month: ChatGPT Business or Claude Team seats (around CHF 30/person), a RAG pilot for client documents (one-time setup CHF 8000-15000), and CHF 200-500/month running cloud cost. Without RAG, entry is possible below CHF 500/month. See was-kostet-ki-automation-kmu.

May I send client data to ChatGPT?

Not casually. You need a data-processing agreement with OpenAI (available in the Business and Enterprise plans), a documented data classification and, under professional secrecy (Art. 321 SCC), additionally client consent or pseudonymisation. In the Free and Plus tiers it is clearly not admissible for personal client data. See berufsgeheimnis-stgb-321-ki and ndsg-revfadp-ki.

Is Apertus a real alternative to Claude?

Apertus is the Swiss open-weight model from ETH/EPFL (released 2025, refreshed 2026). It is strong on German, French and Italian but below the current top Claude model and GPT-4o on top-end reasoning. For sovereignty-driven use cases (FINMA, government) it is a serious option; for pure quality leadership it is not. See apertus-swiss-ai-modell.

How fast does a RAG pilot pay back?

Typical RAG pilots pay back in 4-9 months if the use case is right (client FAQ, internal knowledge search, dunning). A CHF 12000 pilot saving one hour of clerk time per day (rate CHF 80) pays back in 7 working months. Use-case selection drives this – a wrong use case never pays back. See was-kostet-rag-pilot.
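
The payback arithmetic in this answer can be reproduced in a few lines. This is a minimal sketch: the function name and the figure of 21 working days per month are assumptions for illustration, not measured values.

```python
def payback_working_months(pilot_cost_chf: float,
                           hours_saved_per_day: float,
                           hourly_rate_chf: float,
                           working_days_per_month: float = 21.0) -> float:
    """Working months until cumulative savings equal the one-time pilot cost.

    working_days_per_month=21 is an assumed average, not a figure from the FAQ.
    """
    if hours_saved_per_day <= 0 or hourly_rate_chf <= 0:
        raise ValueError("savings must be positive, otherwise the pilot never pays back")
    saving_per_day = hours_saved_per_day * hourly_rate_chf
    payback_days = pilot_cost_chf / saving_per_day
    return payback_days / working_days_per_month


# Example from the answer: CHF 12000 pilot, 1 hour per day at CHF 80.
months = payback_working_months(12000, 1, 80)
```

With these inputs the pilot needs 150 working days, which is roughly seven working months. Halving the hours saved doubles the payback time, which is why use-case selection dominates the result.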

What happens during a ChatGPT outage?

As of May 2026, OpenAI outages typically last 15-90 minutes and occur 2-4 times per quarter (see status.openai.com history). Business-critical use cases need a fallback via an LLM gateway that switches automatically to Claude or Mistral when OpenAI is down. See was-ist-llm-gateway and multi-llm-routing-strategien.
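
The fallback idea can be sketched without any vendor SDK. The code below is an illustration under assumptions: the provider functions are hypothetical stand-ins for real API clients, and a production gateway would add timeouts, retries and logging.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when no provider in the chain returned an answer."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, answer) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # outage, timeout, rate limit, ...
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real vendor clients.
def primary_down(prompt: str) -> str:
    raise ConnectionError("service unavailable")


def secondary_ok(prompt: str) -> str:
    return f"draft reply for: {prompt}"


used, answer = complete_with_fallback(
    "Summarise the client's question.",
    [("primary", primary_down), ("secondary", secondary_ok)],
)
```

Returning the provider name alongside the answer lets the audit trail record which model actually produced the output.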

May I use AI in a FINMA-supervised business?

Yes, but under the FINMA "operational risks" circular plus the AI-specific guidance (2025). Mandatory elements include documented risk assessment, model governance, audit trail and outsourcing review for cloud vendors. See finma-ki-rundschreiben and finma-awareness.

How much time does AI mail triage save per day?

Field measurements at fiduciary and law offices show 30 to 90 minutes per clerk per day. Prerequisite: a RAG index over the last 24-36 months of client correspondence plus classification into categories (deadline-relevant, standard query, fee discussion). At 8 clerks that is 4-12 hours per day or 80-240 hours per month. See email-triage-automation.

How do I prevent AI hallucinations?

Reduce, not eliminate. Three layers: (1) RAG with mandatory citations in the system prompt, (2) post-answer citation check (are the sources actually in the retrieval result?), (3) human-in-the-loop for any output sent to clients. The third layer is non-negotiable for professional-secrecy industries. See halluzinationen-begrenzen.
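
Layer 2, the post-answer citation check, might look like the following. This is a sketch under assumptions: it presumes the system prompt forces citations in the form [doc-id], and both the pattern and the function name are illustrative.

```python
import re

# Assumed citation format: square brackets around a document id, e.g. [doc-12].
CITATION_PATTERN = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def check_citations(answer: str, retrieved_ids: set[str]) -> dict:
    """Verify that every cited id was actually part of the retrieval result."""
    cited = set(CITATION_PATTERN.findall(answer))
    unknown = cited - retrieved_ids
    return {
        "cited": sorted(cited),
        "unknown": sorted(unknown),
        # An answer without any citation fails as well.
        "passes": bool(cited) and not unknown,
    }


retrieved = {"doc-12", "doc-31"}
ok = check_citations("The deadline is 30 days [doc-12].", retrieved)
bad = check_citations("The fee is CHF 500 [doc-99].", retrieved)
uncited = check_citations("The fee is CHF 500.", retrieved)
```

The check only confirms that a cited source exists in the retrieval result, not that the source supports the claim. That gap is why the human review in layer 3 remains necessary.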

Do I need a DPIA for AI?

Under revFADP Art. 22 a data-protection impact assessment (DPIA) is mandatory when the processing carries a high risk for affected persons. An LLM application with personal client data typically meets that threshold. Effort: 1-3 days for a clean first draft, followed by annual update. See dpia-für-ki-systeme.

Which languages does AI understand well?

GPT-4o, the current top Claude model and Gemini 2 are essentially native-level in German, French, Italian, English. Romansh is weaker but increasingly usable. Swiss German dialect remains challenging in May 2026 – Whisper plus a language model is the standard setup for Swiss voice. See ki-und-schweizer-mehrsprachigkeit-de-fr-it.

How safe is my cloud data in Switzerland?

With Swiss vendors (Swisscom, Infomaniak, Exoscale, Safe Swiss Cloud) data is processed under Swiss law and by Swiss personnel. With US vendors (AWS, Azure, GCP) in EU or CH region, the US Cloud Act may apply, which FINMA and the FDPIC consider a risk. See swiss-cloud-souverän-hosten and drittlandtransfer-tia.

What distinguishes GPT-4o from Claude?

GPT-4o is stronger on multimodality (image, audio, live video) and general tool use; the current top Claude model is stronger on long context, code and German writing quality. As of May 2026 many Swiss SMEs use both in parallel and route tasks via LLM gateway. See openai-vs-anthropic-vs-mistral.

Which GPU do I need for Llama 4?

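Llama 4 70B quantized to 4 bits needs about 35 GB for the weights alone, so it does not fit completely into a single RTX 4090 (24 GB VRAM) or RTX 5090 (32 GB). On those cards it runs only with part of the model offloaded to system RAM, which makes the 8-15 tokens per second range optimistic. For multi-user workloads, plan two H100 (80 GB) or one H200 (141 GB). In the cloud: Runpod or Lambda Labs from CHF 1.50/h. See gpu-kosten-rechner.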
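
A rough sizing check helps before buying hardware: weight memory is approximately parameters times bits per weight divided by 8. The sketch below is illustrative; the 1.2 overhead factor for KV cache and runtime buffers is an assumption, and real requirements vary with context length and inference engine.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the model weights only, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: int,
                 vram_gb: float, overhead_factor: float = 1.2) -> bool:
    """True if weights plus an assumed overhead fit into the given VRAM.

    overhead_factor is a rule-of-thumb assumption, not a vendor figure.
    """
    return weight_memory_gb(params_billion, bits_per_weight) * overhead_factor <= vram_gb


weights_70b_4bit = weight_memory_gb(70, 4)
fits_24gb = fits_in_vram(70, 4, 24)
fits_80gb = fits_in_vram(70, 4, 80)
```

A model that fails this check can still run with layers offloaded to system RAM or split across several cards, at the price of lower throughput.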

What happens to my data at OpenAI?

In the Business and Enterprise plans, OpenAI by default does not use your API data for model improvement. Data is retained for 30 days for abuse review; with Zero-Data-Retention in Enterprise, retention drops to 0 days. In Free and Plus the default is different: you have to opt out. The storage location can be set to the EU or US region. See dsgvo-und-llms.

How do I measure AI quality?

With an eval suite: 50-300 test cases from your business, each with an expected answer. On every model update you automatically measure accuracy, faithfulness (RAG) and latency. Standard tools in May 2026 are Langfuse, Helicone or Promptfoo. See eval-frameworks-für-llms and ki-qualität-kpis-rag.
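
The core of such an eval suite is small. The sketch below is a simplified illustration, not how the tools named above work internally: the model is any callable from question to answer, and "accuracy" here is a plain substring match, which is an assumption that real suites usually replace with stricter graders. Faithfulness scoring is left out.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    expected: str  # text that a correct answer must contain


def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> dict:
    """Run all cases and report accuracy plus the slowest response time."""
    if not cases:
        raise ValueError("eval suite is empty")
    correct = 0
    latencies = []
    for case in cases:
        start = time.perf_counter()
        answer = model(case.question)
        latencies.append(time.perf_counter() - start)
        if case.expected.lower() in answer.lower():
            correct += 1
    return {
        "n": len(cases),
        "accuracy": correct / len(cases),
        "max_latency_s": max(latencies),
    }


# Hypothetical stand-in for a real model call.
def fake_model(question: str) -> str:
    canned = {"VAT rate?": "The standard VAT rate is 8.1%."}
    return canned.get(question, "I do not know.")


report = run_eval(fake_model, [
    EvalCase("VAT rate?", "8.1%"),
    EvalCase("Filing deadline?", "31 March"),
])
```

Running the same cases against every new model version turns "the answers feel worse" into a number that can be compared over time.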

Which AI tools are FADP-compliant?

There is no official certification list. A tool becomes FADP-compliant through your configuration: DPA in place, EU or CH region selected, zero-data-retention enabled, audit trail set up, RBAC defined. Mistral La Plateforme (EU), Anthropic with EU workspace, OpenAI Enterprise with ZDR and EU region, plus Swiss hosting vendors are common building blocks. See ndsg-revfadp-ki.

What does an own AI server cost?

An SME-grade GPU server (1x RTX 5090 or 2x RTX 4090) costs CHF 9000-18000 acquisition plus CHF 200-500/month power at full utilisation. Cloud comparison: equivalent cloud inference around CHF 1500-3500/month. Break-even after 6-12 months of full utilisation. See cloud-api-vs-selfhost-break-even and was-kostet-eigenes-llm.
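
The break-even logic behind these figures is acquisition cost divided by the monthly saving against cloud. The sketch below uses the midpoints of the ranges above as an illustrative input; it ignores admin time, depreciation and partial utilisation, so treat the result as an approximation.

```python
from typing import Optional


def break_even_months(acquisition_chf: float,
                      power_chf_per_month: float,
                      cloud_chf_per_month: float) -> Optional[float]:
    """Months until the server purchase is offset by avoided cloud cost.

    Returns None when the server never pays back (cloud is not more expensive).
    """
    monthly_saving = cloud_chf_per_month - power_chf_per_month
    if monthly_saving <= 0:
        return None
    return acquisition_chf / monthly_saving


# Midpoints of the ranges in the answer (illustrative, not a quote).
midpoint = break_even_months(13500, 350, 2500)
never = break_even_months(9000, 500, 400)
```

At the midpoints the server pays back after a little over six months of full utilisation. At lower utilisation the cloud bill shrinks while the purchase price does not, and break-even moves out accordingly.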

How do I integrate AI with Bexio?

Via the Bexio REST API plus an MCP server or an n8n workflow. Typical use cases: pull client data from Bexio into the RAG index, capture receipts via OCR and import them pre-booked into Bexio, draft dunning letters automatically. SME setup effort: 5-15 days. See integration-bexio-api.

What does the EU AI Act mean for me as a Swiss firm?

If you offer AI products or services on the EU market, or if the output of your AI system is used in the EU, the EU AI Act applies to you directly. Its obligations phase in between 2025 and 2027: general-purpose AI obligations have applied since August 2025, and the Act's general application date is August 2026. Even without EU market exposure we recommend compliance, since Switzerland is likely to adopt EU rules with a lag. See eu-ai-act-2026 and eu-ai-act-kmu-fristen-2026.

When do I need fine-tuning?

Rarely. For 90% of SME use cases a good base model plus RAG delivers better results than fine-tuning. Fine-tuning pays off when (a) domain vocabulary is weak in the base model, (b) a consistent output format cannot be stabilised by prompt, (c) the task is clearly measurable with a large training set. See was-ist-fine-tuning-vs-rag.

Can AI create tax returns?

Prepare, not finalise. AI can read receipts, pre-book entries, run plausibility checks and draft enclosures. The final tax return responsibility lies with the fiduciary; four-eyes review remains mandatory. See ai-steueroptimierung-entwurf.

How do you train staff?

In three tiers. Tier 1 (60 minutes, all staff): basic terms, data-protection rules, what is allowed in which tool. Tier 2 (4 hours, users): prompt engineering, tool operation, typical errors. Tier 3 (2 days, champions): use-case building, eval maintenance, shadow-AI detection. Repeat every six months given model updates.

What is shadow AI?

Staff using AI tools on their own initiative – usually free ChatGPT in the browser with client data in the prompt. It is a data-protection and professional-secrecy risk, although the staff involved are often productive pioneers. The response is not a ban but a sanctioned alternative plus a clear rule. See schatten-ki-im-unternehmen.

How do I respond to an AI incident?

Four steps in the first 24 hours: (1) stop the tool and pull client data out of the processing path, (2) preserve the audit trail and log the incident, (3) if personal data has leaked, assess the duty to notify the FDPIC (under FADP Art. 24 as soon as possible; the 72-hour deadline is the GDPR rule and applies if you are also subject to the GDPR), (4) prepare client communication when professional secrecy is concerned. For the detailed playbook see incident-response-playbook.

What is the difference between closed and open weight?

Closed-weight: model weights are not public (GPT-4, Claude, Gemini). You use the models only via the vendor API. Open-weight: weights are downloadable (Llama, Mistral, Apertus). You can run the model locally or at any cloud vendor yourself. The typical trade-off is sovereignty vs top-end quality. See trend-open-weight-vs-closed.

Do I need multimodal AI?

If you process receipts via OCR, parse contracts with tables, or build voice telephony – yes. If you only process text (dunning, correspondence, note search), you do not need multimodal. GPT-4o and the current top Claude model are multimodal by default in May 2026; the premium is small. See was-ist-multimodal-ki.

When is your own LLM worth it?

When utilisation exceeds 8 GPU-hours per day, when sovereignty is mandatory (FINMA-licensed processing, sensitive professional-secrecy data) or when use cases are extremely latency-critical. Below 8 GPU-h/day cloud is cheaper. See self-hosted-vs-cloud-llm.

How do I prevent prompt injection?

Three lines of defence. First: clearly separate the system prompt, and do not let user input be interpreted as instructions. Second: input filters (OWASP LLM Top 10) against known patterns. Third: no tool use with sensitive actions (sending money, deleting client data) without human confirmation. See red-teaming-für-ki.
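
The third line, human confirmation for sensitive actions, can be enforced in the code that executes tool calls rather than in the prompt. The following is a minimal sketch; the action names and the confirm callback are hypothetical.

```python
from typing import Callable

# Hypothetical names for actions that must never run unattended.
SENSITIVE_ACTIONS = {"send_payment", "delete_client_data"}


def execute_tool_call(action: str,
                      args: dict,
                      tools: dict[str, Callable[..., str]],
                      confirm: Callable[[str, dict], bool]) -> str:
    """Run a model-requested tool call, gating sensitive actions on a human."""
    if action not in tools:
        return f"rejected: unknown action {action!r}"
    if action in SENSITIVE_ACTIONS and not confirm(action, args):
        return f"blocked: {action!r} needs human confirmation"
    return tools[action](**args)


# Illustrative tool implementations.
tools = {
    "search_notes": lambda query: f"3 notes found for {query!r}",
    "send_payment": lambda amount_chf, iban: f"paid CHF {amount_chf} to {iban}",
}

harmless = execute_tool_call("search_notes", {"query": "dunning"}, tools,
                             confirm=lambda a, k: False)
blocked = execute_tool_call("send_payment", {"amount_chf": 500, "iban": "CH00"}, tools,
                            confirm=lambda a, k: False)
approved = execute_tool_call("send_payment", {"amount_chf": 500, "iban": "CH00"}, tools,
                             confirm=lambda a, k: True)
```

Because the gate sits outside the model, an injected instruction can at most request a sensitive action; it cannot approve one.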

What is MCP?

MCP (Model Context Protocol) is the open standard published by Anthropic in November 2024 for how LLMs access tools and data sources. As of May 2026 it has been adopted by OpenAI, Google and Microsoft, and over 500 public MCP servers exist. For SMEs it means: one tool, many models – no lock-in. See was-ist-mcp.

How important is the audit trail?

Mandatory in regulated industries (fiduciary under Art. 957a CO, banks under FINMA, lawyers under bar rules). Strongly recommended elsewhere – without an audit trail, model errors cannot be reconstructed. Log for every AI request: user, model, version, prompt, source, answer, timestamp. See ai-audit-trail-design and art-957a-or-audit-trail.
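
The seven fields map naturally onto one log record per request. The sketch below is an illustration, not a compliance-ready design: it reads "source" as the retrieved document the answer relied on (an assumption), and it leaves out tamper protection and retention rules.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    user: str
    model: str
    version: str
    prompt: str
    source: str      # assumed meaning: id of the retrieved document(s) used
    answer: str
    timestamp: str   # ISO 8601, UTC


def make_record(user: str, model: str, version: str,
                prompt: str, source: str, answer: str) -> AuditRecord:
    """Create an immutable record stamped with the current UTC time."""
    return AuditRecord(user, model, version, prompt, source, answer,
                       timestamp=datetime.now(timezone.utc).isoformat())


def to_json_line(record: AuditRecord) -> str:
    """Serialise one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), ensure_ascii=False)


record = make_record("clerk-07", "example-model", "2026-05",
                     "Summarise the client letter.", "doc-12", "Summary text")
line = to_json_line(record)
```

One JSON object per line keeps the log appendable and easy to search when an answer has to be reconstructed later.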

Who is liable for an AI error?

You as the user. The vendor is liable only for technical contractual duties (availability, data security). For a content error (hallucination in a client letter) you are liable to the client. Four-eyes review and an eval suite are your liability reduction. See wer-haftet-bei-ki-fehlern.

Which AI helps in dunning?

A RAG system over the last 24-36 months of client correspondence plus a workflow builder (n8n, Make) drafts dunning letters automatically: appropriate dunning level, context-aware text, attached receipts. Expected time saving: 60-80% of clerk time. See ai-mahnwesen-automation.

How do I integrate AI with M365?

Via the Microsoft Graph API plus, if you already license Copilot, the native Copilot integrations in Outlook, Word and SharePoint. For custom use cases, an MCP server against the Graph API; this gives you access to mail, calendar, SharePoint and Teams from Claude or GPT. See integration-microsoft-365-graph-api.
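
For orientation, this is roughly what a Graph API call for recent mail looks like. The sketch only builds the request and does not send it; obtaining the access token (OAuth via your Microsoft tenant, with mail read permission) is outside its scope, and the function name is illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_recent_mail_request(access_token: str, top: int = 10) -> Request:
    """Build (not send) a GET request for the signed-in user's latest messages."""
    query = urlencode(
        {"$top": top, "$select": "subject,from,receivedDateTime"},
        safe="$,",  # keep OData parameter names and commas readable
    )
    return Request(
        f"{GRAPH_BASE}/me/messages?{query}",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


request = build_recent_mail_request("TOKEN_PLACEHOLDER", top=5)
```

Selecting only the fields you need keeps mail bodies out of the response, and therefore out of the prompt, until a use case actually requires them.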

Which LLM is cheapest?

Per token, the May 2026 ranking from cheapest upward (input/output per 1M tokens): Gemini Flash around CHF 0.10/0.40, GPT-4o-mini around CHF 0.13/0.60, DeepSeek V3 around CHF 0.20/0.90, Claude Haiku around CHF 0.70/3.50, premium tier (GPT-4o, the current top Claude model) around CHF 2.50-15. For highly sensitive data "cheap" matters less than data sovereignty. See token-kosten-erklärt.
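
To turn per-token prices into a monthly bill, multiply by your volumes. The sketch below uses the approximate prices from this answer; the monthly volume of 20M input and 5M output tokens is a made-up example, not a typical SME figure.

```python
# Approximate prices from the answer, in CHF per 1M tokens: (input, output).
PRICES_CHF_PER_1M = {
    "Gemini Flash": (0.10, 0.40),
    "GPT-4o-mini": (0.13, 0.60),
    "DeepSeek V3": (0.20, 0.90),
    "Claude Haiku": (0.70, 3.50),
}


def monthly_cost_chf(model: str, input_tokens: int, output_tokens: int) -> float:
    """Token cost for one month at the listed prices."""
    price_in, price_out = PRICES_CHF_PER_1M[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical monthly volume: 20M input tokens, 5M output tokens.
VOLUME = (20_000_000, 5_000_000)
costs = {model: monthly_cost_chf(model, *VOLUME) for model in PRICES_CHF_PER_1M}
cheapest_first = sorted(costs, key=costs.get)
```

At this volume all four budget models stay in the range of a few francs to a few tens of francs per month, which is why seat licences and setup work usually dominate an SME budget rather than token cost.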

Is self-hosting feasible for SMEs?

Yes, but realistically only from 15-25 active AI users onward or under a hard sovereignty requirement. Stack: Ollama or vLLM on an RTX-5090 or H100 server, plus LiteLLM gateway, plus Qdrant for RAG. SME setup effort: 5-10 days; ongoing about 2-4 hours per month. Below 15 users cloud is usually cheaper and simpler. See self-hosted-vs-cloud-llm.

Sources

EDÖB – Stellungnahme zu künstlicher Intelligenz und Personendaten · 2026-05
FINMA – Aufsichtsmitteilung "Operationelle Risiken und KI" · 2026-04
EU AI Act – Stand der Umsetzung 2026 · 2026-05
OpenAI – Enterprise Privacy & Data Processing Addendum · 2026-05
Anthropic – Commercial Terms and Data Processing · 2026-05
