QUOTES · USE CASE
AI-assisted quote generation: 2 to 4 hours of work in 20 minutes
From inquiry -> RAG over past quotes (price ladders, standard clauses) -> clean draft in Bexio/Klara format. Case handler reviews and sends manually.
Researched & fact-checked by: DuneDive LLC · As of: 2026-05
What this is about
In many Swiss SMEs, quote creation is the big hidden time sink. A serious quote for a tradesperson mandate, an IT service, or a consulting project takes 2 to 4 hours: read the inquiry, find similar past mandates, calculate prices, gather standard clauses, pour it into the corporate layout, cross-check, send. A 6-person office doing this twice a week loses 16 to 32 hours per month – time that could flow into acquisition or delivery.
AI-assisted quote generation means: an inbound inquiry (mail, form, or client portal) is read, matched against past quotes and calculation templates, and a draft emerges in Bexio, Klara, or your own format. Prices come from the stored price ladder, standard clauses from the firm library. The case handler reviews, adds individual aspects, hits send.
Important: this is no autopilot. The human makes the commercial decision (discount, special request, relationship factors). The machine delivers the craftwork. Under this split the effort per quote drops from 2-4 hours to 15-30 minutes, a saving of 1.5 to 3.75 hours per quote. For a 6-person IT service firm with 8 quotes per month that is roughly 12 to 30 hours saved per month.
Frequent in trades, IT services, consulting, fiduciary mandate offers, advertising agencies. Less so in classical product retail where prices live in lists and no individual calculation is needed.
Why it matters
Three points make quote automation in 2026 one of the most lucrative SME use cases.
First: direct revenue lever. Unlike email triage or document capture, the freed time here is immediately sellable: a second quote per week means more customers in the pipeline. If 10 quotes typically yield 4 wins (40% rate), then at an average order value of CHF 8,000 every 10 additional quotes bring about CHF 32,000 in revenue. One additional quote per week, about 13 per quarter, therefore adds roughly CHF 41,600 in quarterly revenue.
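The arithmetic behind this lever can be checked in a few lines. This is a minimal sketch using the win rate and order value from the paragraph above; the 13 weeks per quarter and the function name `extra_revenue` are assumptions made for illustration.

```python
from decimal import Decimal

def extra_revenue(extra_quotes: int, win_rate: Decimal, avg_order_chf: Decimal) -> Decimal:
    """Expected additional revenue from a number of additional quotes."""
    return extra_quotes * win_rate * avg_order_chf

WIN_RATE = Decimal("0.40")       # 4 wins per 10 quotes (from the text)
AVG_ORDER_CHF = Decimal("8000")  # average order value (from the text)
WEEKS_PER_QUARTER = 13           # assumption: 52 weeks / 4

# Revenue from 10 additional quotes
per_ten_quotes = extra_revenue(10, WIN_RATE, AVG_ORDER_CHF)
# Revenue from one additional quote per week over a full quarter
per_quarter = extra_revenue(WEEKS_PER_QUARTER, WIN_RATE, AVG_ORDER_CHF)

print(per_ten_quotes, per_quarter)
```

Ten additional quotes correspond to CHF 32,000; one additional quote every week of a 13-week quarter corresponds to CHF 41,600. Both figures assume the win rate stays at 40% for the extra quotes.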
Second: response speed. In many industries response time decides who wins the mandate. A Tuesday inquiry with a Wednesday-morning quote beats a Tuesday inquiry with a Friday-afternoon quote. With AI preparation the handler can respond within 2 hours instead of 2 days – the difference between "fast" and "slow" in the inquirer's eyes.
Third: price-formation consistency. AI pulls from the price history, so quotes written by two case handlers working independently become comparable. This reduces the phenomenon where two similar mandates at the same firm are quoted at CHF 6,500 and CHF 8,200 depending on who is assigned. Consistent prices mean consistent margin.
The risk is the flip side of the lever: too-fast standard answers can damage relationship quality. An inquiry that says a lot between the lines ("we are dissatisfied with our current provider because…") deserves an individually written reply, not a rehashed Bexio template. The handler must spot these – that is precisely the human value-add.
How the pipeline works
The pipeline runs in five layers.
Layer 1 – intake. The inquiry arrives via mail, web form, or client portal. n8n triggers the workflow, assigns a case ID, stores the original inquiry and attachments in the audit log.
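In the setup described here n8n does this work. The following plain-Python sketch only illustrates what an intake record might contain; the field names and the JSON-lines audit file are assumptions, not part of the described workflow.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

def make_intake_record(channel: str, raw_inquiry: str) -> dict:
    """Build an audit record for one inbound inquiry (hypothetical field names)."""
    return {
        "case_id": uuid.uuid4().hex,  # random, unique case ID
        "received_at": datetime.now(timezone.utc).isoformat(),
        "channel": channel,           # e.g. "mail", "web_form", "portal"
        # The hash lets a later check detect changes to the stored original.
        "inquiry_sha256": hashlib.sha256(raw_inquiry.encode("utf-8")).hexdigest(),
        "inquiry": raw_inquiry,
    }

def append_audit_line(path: str, record: dict) -> None:
    """Append the record as one JSON line to an append-only audit file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")

record = make_intake_record("mail", "Please quote maintenance for 12 workstations.")
print(record["case_id"], record["channel"])
```

Storing a hash next to the original text is one simple way to make the audit log tamper-evident; attachments could be hashed the same way.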
Layer 2 – inquiry understanding. A language model (Mistral Large for EU hosting, alternatively Claude Sonnet) reads the inquiry and extracts structured fields: service type, volume, the inquirer's industry, urgency, special requests, budget hints. The output is a JSON object that follows a fixed schema and parameterises the next layers.
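Model output should be validated before it drives the later layers. The sketch below shows one possible shape for the extracted fields and a strict parser; the field names, the urgency values, and the function `parse_llm_output` are hypothetical and would follow the firm's own schema.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_URGENCY = {"low", "normal", "high"}  # hypothetical value set

@dataclass
class InquiryFields:
    service_type: str
    volume: float
    industry: str
    urgency: str
    special_requests: List[str] = field(default_factory=list)
    budget_hint_chf: Optional[float] = None

def parse_llm_output(raw: str) -> InquiryFields:
    """Parse and validate the model's JSON; reject anything off-schema."""
    data = json.loads(raw)          # malformed JSON raises ValueError
    fields = InquiryFields(**data)  # missing or unexpected keys raise TypeError
    if fields.urgency not in ALLOWED_URGENCY:
        raise ValueError(f"unknown urgency: {fields.urgency!r}")
    if not isinstance(fields.volume, (int, float)) or fields.volume <= 0:
        raise ValueError("volume must be a positive number")
    return fields

sample = (
    '{"service_type": "it_maintenance", "volume": 12, "industry": "fiduciary", '
    '"urgency": "high", "special_requests": ["on-site kickoff"]}'
)
parsed = parse_llm_output(sample)
print(parsed)
```

Rejecting off-schema output outright, rather than repairing it, keeps errors visible: a failed parse can simply route the case to the handler.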
Layer 3 – RAG over past quotes. A Qdrant lookup searches the firm's quote library (typically 200 to 2,000 past quotes, plus the stored price-ladder table, plus the standard-clause collection). The retriever returns the 8 most similar past quotes and the clauses relevant to the service type. Important: only own quotes, nothing scraped from the web. The firm's pricing logic stays the firm's secret.
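What the retriever does can be illustrated without a vector database. The sketch below ranks a toy library by cosine similarity and returns the top k; in production Qdrant performs this search over real embeddings, and the quote IDs and three-dimensional vectors here are invented for illustration.

```python
import math
from typing import Dict, List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query: List[float], library: Dict[str, List[float]], k: int = 8) -> List[Tuple[float, str]]:
    """Return the k most similar quotes as (score, quote_id), best first."""
    scored = [(cosine(query, vec), quote_id) for quote_id, vec in library.items()]
    scored.sort(reverse=True)
    return scored[:k]

# Toy embeddings; real ones come from an embedding model.
library = {
    "Q-2024-017": [1.0, 0.0, 0.0],
    "Q-2024-033": [0.0, 1.0, 0.0],
    "Q-2025-004": [0.9, 0.1, 0.0],
}
hits = top_k([1.0, 0.0, 0.0], library, k=2)
print(hits)
```

With k=8, as in the text, the same function would return the eight best matches from a library of a few hundred to a few thousand quotes.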
Layer 4 – draft generation. The language model assembles a quote draft from three inputs: the extracted inquiry, the similar quotes it found, and the stored price ladder. Structured output in Bexio quote format (REST API, document type "offer"), Klara format, or a firm-specific Word/PDF template. Required fields are filled, prices computed, standard clauses inserted. A note block at the end lists places where the AI was uncertain or recommends a human decision.
Layer 5 – human review and dispatch. The draft appears in the Bexio documents area as status "draft" (Bexio API: POST /2.0/kb_offer). The handler opens, reviews, adjusts, accepts or rejects. On acceptance the quote goes via Bexio directly to the inquirer – with tracking (opened/accepted/declined). The audit log records every change between AI draft and final version, important for later learning.
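The hand-over to Bexio can be sketched as a plain HTTP request. Only the endpoint POST /2.0/kb_offer comes from the text; the base URL, the payload field names (`title`, `contact_id`, `user_id`, `positions`, `type`, `amount`, `unit_price`, `text`), and the example values are assumptions that should be checked against the Bexio developer documentation. The request is built but deliberately not sent.

```python
import json
import urllib.request
from typing import List

BEXIO_BASE_URL = "https://api.bexio.com"  # assumption: verify in the Bexio docs

def build_offer_request(access_token: str, contact_id: int, user_id: int,
                        title: str, positions: List[dict]) -> urllib.request.Request:
    """Build (but do not send) the request that would create an offer."""
    payload = {
        "title": title,
        "contact_id": contact_id,  # assumed field name
        "user_id": user_id,        # assumed field name
        "positions": positions,    # assumed field name
    }
    return urllib.request.Request(
        url=BEXIO_BASE_URL + "/2.0/kb_offer",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )

# Position fields are assumptions; prices come from the deterministic calculation.
positions = [{"type": "KbPositionCustom", "amount": "40",
              "unit_price": "160.00", "text": "Implementation, engineer hours"}]
req = build_offer_request("ACCESS_TOKEN_PLACEHOLDER", 14, 1, "IT maintenance 2026", positions)
print(req.get_method(), req.full_url)
```

Sending would be a call to `urllib.request.urlopen(req)`. Keeping construction and sending separate makes it easy to log the exact payload in the audit trail before anything reaches Bexio.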
Learning loop. Accepted quotes flow back into the Qdrant index together with the human corrections. Over 3 to 6 months the pipeline learns the handler's typical adjustments and first-draft quality rises. Adjustments are anonymised (no client names) before appearing as retrieval examples.
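The anonymisation step can be as simple as replacing known client names before a correction is indexed. This is a minimal sketch, assuming the client names are available from the contact database; the function name and placeholder are hypothetical, and real data would also need handling for addresses, contact persons, and project names.

```python
import re
from typing import List

def anonymise(text: str, client_names: List[str], placeholder: str = "[CLIENT]") -> str:
    """Replace every known client name with a neutral placeholder."""
    # Longest names first, so "Muster AG Zürich" is replaced before "Muster AG".
    for name in sorted(client_names, key=len, reverse=True):
        text = re.sub(re.escape(name), lambda _m: placeholder, text, flags=re.IGNORECASE)
    return text

note = "Discount for Muster AG raised to 5%; Muster AG is a long-term client."
print(anonymise(note, ["Muster AG"]))
```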
Pipeline in 7 steps
- 01 Intake: n8n captures inquiry from mail/web form/portal, assigns case ID, audit log.
- 02 Structuring: Mistral Large extracts fields (service, volume, industry, urgency, special requests) as JSON.
- 03 RAG lookup: Qdrant returns top-8 similar past quotes plus matching standard clauses.
- 04 Price calculation: the stored price ladder + volume discounts applied numerically (not LLM-estimated).
- 05 Draft: LLM assembles quote body, clause block, price table. Output as Bexio document via POST /2.0/kb_offer or Word/PDF template.
- 06 Review: handler opens draft in Bexio, checks prices and clauses, adds personal elements.
- 07 Dispatch and learning: on approval, send with Bexio tracking. Accepted quotes plus corrections move anonymised into Qdrant.
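Step 04 is the one part of the pipeline that must be fully deterministic. The sketch below applies a price ladder and volume discounts with exact decimal arithmetic; the roles, hourly rates, and discount thresholds are invented example values, not recommendations.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict

# Hypothetical price ladder: hourly rates in CHF per role.
HOURLY_RATES_CHF = {"engineer": Decimal("160"), "project_lead": Decimal("190")}
# Hypothetical volume discounts as (minimum total hours, discount), highest tier first.
VOLUME_DISCOUNTS = [(Decimal("80"), Decimal("0.10")), (Decimal("40"), Decimal("0.05"))]

def discount_for(total_hours: Decimal) -> Decimal:
    """Return the discount of the highest tier the volume reaches."""
    for min_hours, discount in VOLUME_DISCOUNTS:
        if total_hours >= min_hours:
            return discount
    return Decimal("0")

def price_quote(hours_by_role: Dict[str, float]) -> Dict[str, Decimal]:
    """Compute gross, discount, and net (before VAT) from the stored ladder."""
    gross = sum((HOURLY_RATES_CHF[role] * Decimal(str(hours))
                 for role, hours in hours_by_role.items()), Decimal("0"))
    total_hours = sum((Decimal(str(h)) for h in hours_by_role.values()), Decimal("0"))
    discount = discount_for(total_hours)
    net = (gross * (1 - discount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {"gross": gross, "discount": discount, "net": net}

result = price_quote({"engineer": 40, "project_lead": 10})
print(result)
```

For 40 engineer hours and 10 project-lead hours the gross is CHF 8,300, the 50 total hours trigger the 5% tier, and the net is CHF 7,885.00. An unknown role raises a `KeyError` instead of guessing a rate, which is the desired behaviour here.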
When to use
Quote automation pays off from about 4 to 6 quotes per month. Below that the setup overhead is not worth it. From 15 quotes per month the project typically amortises in 4 to 7 months – and the lever is revenue-affecting, not just cost-cutting.
Typical setups: IT services with 8 to 20 quotes per month for implementation and maintenance projects; trades with canton-specific tariffs and many renovation/conversion quotes; consultants and coaches with individual mandate offers; marketing and advertising agencies with project quotes; fiduciary offices preparing new-client offers; office-service providers with recurring service packages.
Well suited: offices on Bexio, Klara, or Run-my-Accounts (REST API standard), offices with at least 100 past quotes as a RAG base, offices with clear price ladders (hourly rates, module prices, volume discounts). Difficulty arises with highly individual projects without recurring patterns – there AI delivers less leverage.
An often-overlooked additional application: standardisation of contract templates. If a handler regularly offers "maintenance contract Gold/Silver/Bronze", AI can ensure all three tiers are textually consistent and price steps do not blur through copy-paste errors.
When not to use
Do not deploy if the quote library is too thin. Below 50 past quotes the RAG retrieval drops in quality and first drafts become unreliable. Here it makes more sense to spend six months digitising and tagging past quotes – then start the pilot.
Do not deploy if the office cultivates a very personal client relationship and standardisation would be perceived as a relationship downgrade. In some fiduciary mandates the handwritten quote with a handwritten personal note is part of the value-add. AI generation would be counterproductive here.
Do not deploy if the price ladder lives only in the owner's head, undocumented. AI cannot calculate from gut feeling. The first step is to document the pricing logic, which is useful anyway for succession planning and tax clarity.
Do not deploy without a mandatory approval step. Even when the draft looks visually perfect: an automatically dispatched quote without human review is a margin and reputation risk. Swiss SMEs ultimately get paid through trust, not speed alone.
Trade-offs
STRENGTHS
- Quote creation from 2-4h to 15-30 min – direct lever on revenue pipeline
- Faster response to inquirers, a competitive advantage in time-sensitive industries
- Consistent pricing across handlers, less margin variance
- Learning loop makes the pipeline noticeably better over 6 months
WEAKNESSES
- Setup requires a sorted quote library (at least 50 past quotes, ideally tagged)
- Price logic must be documented – head knowledge is not enough
- In very personal client relationships standardisation can damage rapport
- The approval step must be disciplined, otherwise margin and reputation risks emerge
FAQ
How do we ensure AI does not invent prices?
Prices are not generated by the LLM. The pipeline strictly separates: LLM delivers the structured inquiry analysis, a deterministic module (Python, JavaScript) applies the stored price ladder. The LLM may then embed prices in the draft – but never calculate them freely. Pre-go-live tests with synthetic inquiries verify that calculated prices match hand calculations.
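One way to enforce this separation is a guard that scans the finished draft and flags every CHF amount the deterministic module did not produce. This is a minimal sketch; the regular expression, the function names, and the example figures are assumptions.

```python
import re
from decimal import Decimal
from typing import Iterable, Set

# Matches amounts such as "CHF 7,885.00", "CHF 7'885.00", or "CHF 7885".
AMOUNT_PATTERN = re.compile(r"CHF\s*([0-9][0-9',]*(?:\.[0-9]{2})?)")

def amounts_in_draft(draft_text: str) -> Set[Decimal]:
    """Extract all CHF amounts from the draft as exact decimals."""
    amounts = set()
    for match in AMOUNT_PATTERN.finditer(draft_text):
        cleaned = match.group(1).replace("'", "").replace(",", "")
        amounts.add(Decimal(cleaned))
    return amounts

def unexpected_amounts(draft_text: str, computed: Iterable[Decimal]) -> Set[Decimal]:
    """Return amounts in the draft that the price module did not compute."""
    return amounts_in_draft(draft_text) - set(computed)

computed = {Decimal("6400.00"), Decimal("1900.00"), Decimal("7885.00")}
draft = (
    "Implementation: CHF 6,400.00\n"
    "Project lead: CHF 1,900.00\n"
    "Total after discount: CHF 7,900.00"
)
print(unexpected_amounts(draft, computed))
```

In this example the draft total of CHF 7,900.00 is flagged because the price module computed CHF 7,885.00. A non-empty result would block the draft and add an entry to the note block for the handler.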
What if the inquirer is in an industry or wants a service we have never quoted?
The pipeline detects this through low RAG similarity (cosine < 0.55). In this case no draft is produced. A structured note goes to the handler: "New constellation. No sufficiently similar past quote. Please quote manually and feed the result into the learning loop so subsequent inquiries are better served." This way the system learns actively from new cases.
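The gate itself is a one-line decision. The sketch below uses the 0.55 threshold from the answer above and applies it to the best retrieval score; whether the best score, the average, or another statistic is the right measure is an assumption to be tuned on the firm's own data.

```python
from typing import List

NOVELTY_THRESHOLD = 0.55  # cosine threshold from the text

def route_inquiry(similarities: List[float], threshold: float = NOVELTY_THRESHOLD) -> str:
    """Return "draft" if a sufficiently similar past quote exists, else "manual"."""
    best = max(similarities, default=0.0)  # an empty result counts as no match
    return "draft" if best >= threshold else "manual"

print(route_inquiry([0.81, 0.42]))  # a close match exists
print(route_inquiry([0.50, 0.31]))  # nothing similar enough
```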
How does the pipeline integrate with Bexio?
Via the official Bexio REST API. OAuth2 authorization-code flow yields an access token used to call POST /2.0/kb_offer. Pro accounts get unlimited API calls; Standard gets 100/day (as of 2026). The n8n Bexio node covers this. Klara, AbaConnect (Abacus) and Run-my-Accounts offer comparable REST APIs.
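On a plan with a daily call limit, the workflow should stop gracefully before the quota is exhausted. This is a minimal in-memory sketch of such a budget; the class `DailyQuota` is hypothetical, and a real deployment would persist the counter and rely on the limits Bexio actually reports.

```python
from datetime import date
from typing import Optional

class DailyQuota:
    """Counts API calls per calendar day and refuses calls beyond the limit."""

    def __init__(self, limit: int = 100) -> None:
        self.limit = limit
        self._day: Optional[date] = None
        self._used = 0

    def try_consume(self, today: Optional[date] = None) -> bool:
        """Reserve one call; return False if today's budget is used up."""
        today = today or date.today()
        if today != self._day:  # first call of a new day resets the counter
            self._day = today
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True

quota = DailyQuota(limit=100)
if quota.try_consume():
    print("call allowed")
```

When `try_consume` returns False, the draft can stay queued until the next day instead of failing halfway through a Bexio call.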
How long does roll-out take?
6 to 8 weeks. Weeks 1-2: review and tagging of past quotes (the costliest step, often overlooked). Week 3: documentation of the price ladder. Week 4: Bexio/Klara integration and n8n workflow. Weeks 5-6: first drafts in shadow mode. Weeks 7-8: active use with high handler scrutiny. Stable operation typically from month 3.
Sources
- Bexio Developer Docs – REST API (kb_offer, OAuth2) · 2026-04
- Bexio – API-Request und Pricing (Pro vs. Standard) · 2026-03
- n8n.io – Bexio integration node · 2026-02
- Mistral AI – La Plateforme pricing May 2026 (Large 3 USD 2/6) · 2026-05
- TYTOS Schweiz – AI-Agents Praxis-Leitfaden 2026 (Offerten als Use-Case) · 2026-04