fairlane.systems

Twilio: the global telephony standard for programmable voice and SMS applications

Twilio offers programmable telephony and SMS as an API. US headquarters with EU region (Ireland, Frankfurt) for data residency. CH landline minute USD 0.0085. Standard for voice agents.

As of: 2026-05

What is Twilio?

Twilio is a communications-platform-as-a-service (CPaaS) company founded in San Francisco in 2008 and the global de facto standard for programmable telephony. It has been listed on the NYSE (TWLO) since 2016 and, as of May 2026, has over USD 4B in annual revenue and more than 300,000 active customer accounts. The product is proprietary, with extensive SDKs for Python, Node.js, Java, C#, PHP, Ruby, and Go.

The core offering spans six product lines. Voice (incoming and outgoing calls, phone numbers in 100+ countries, Media Streams for real-time audio). Messaging (SMS, MMS, WhatsApp Business API, Verify for 2FA). Video (WebRTC conferencing). Email via SendGrid (acquisition 2019). Conversations (multi-channel inbox). Studio (no-code visual workflow builder).

For voice agents and AI telephony, three components are central. Twilio Voice receives and initiates calls via TwiML (XML-based instructions) or the Voice SDK. Twilio Media Streams sends call audio in real time over a WebSocket to your own server – the foundation for STT+LLM+TTS chains with Whisper and ElevenLabs. Twilio Programmable Voice with Speech Recognition offers built-in STT (Google-based), but it does not cover Swiss German, so for Swiss applications Media Streams plus your own pipeline is the standard.

Pricing is usage-based per minute. Swiss landline inbound costs USD 0.0085/min, Swiss mobile inbound USD 0.0150/min. Outbound calls cost USD 0.0210/min to Swiss landlines and USD 0.0640/min to Swiss mobiles. Renting a Swiss phone number costs about USD 6/month. An SMS within Switzerland costs USD 0.075. Media Streams adds an audio stream surcharge of USD 0.004/min.

Important for Swiss applications: Twilio has EU regions (Ireland and Frankfurt) for data storage. A DPA (Data Protection Addendum) is standard in the Master Service Agreement, and EU data residency is activated in the console.

Why it matters

A voice agent or SMS automation stands or falls with its telephony connection. There are roughly three ways into the Swiss phone network: via a classic carrier (Swisscom, Sunrise) with a physical PRI line, via a SIP trunk provider (Voxbone, JustVoIP, some CH-specific providers), or via Twilio. The first two are established but complex to integrate and not programmable. Twilio is API-first – a phone number is a REST call, a call is a TwiML response.

For Swiss fiduciaries and SMEs, Twilio in May 2026 is the most established telephony platform that goes live within hours via code. An inbound number costs about USD 6/month, and a PoC voice agent runs in 2-3 days. With SIP trunks this takes weeks and requires specialised telecom consultants.

Media Streams is the real game-changer. Until 2021 voice AI ran through Twilio built-in STT (mediocre, no Swiss German) or through Recording (record call, transcribe offline – no real time). Media Streams sends audio live as a WebSocket stream to your own endpoint. An Express server can forward it to Whisper, Deepgram, or a custom STT model and generate replies in sub-second latency. That is what makes voice agents with Swiss German recognition possible in the first place.

The EU region (Ireland, Frankfurt) is decisive for GDPR/revDSG compliance. Twilio EU means audio recordings, call metadata, and the phone-number database stay in EU data centres. A DPA with Twilio is standard. For professional-secrecy data (lawyer, fiduciary, doctor) Twilio EU is therefore an acceptable choice – revDSG requirements can be met.

The cost structure is transparent. A typical voice agent for a Swiss fiduciary with 100 inbound calls per month at 3 minutes each (300 minutes) costs about USD 4 for inbound minutes (USD 2.55 at the pure landline rate, USD 4.50 at the pure mobile rate), plus USD 6 for the number and USD 1.20 for Media Streams: about USD 11.20/month, before LLM and TTS costs. A custom PBX setup would cost three to four figures per month.
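The arithmetic behind this estimate can be reproduced with a few lines of Python. This is a minimal sketch using the per-minute rates quoted in this article; the function name and the `mobile_share` parameter (the fraction of inbound minutes billed at the mobile rate) are illustrative assumptions, not part of any Twilio tooling.

```python
# Rates quoted in this article (USD); check the current Twilio price list.
INBOUND_LANDLINE_PER_MIN = 0.0085
INBOUND_MOBILE_PER_MIN = 0.0150
MEDIA_STREAMS_PER_MIN = 0.004
NUMBER_RENT_PER_MONTH = 6.00


def monthly_telephony_cost(calls: int, minutes_per_call: float,
                           mobile_share: float = 0.0) -> float:
    """Estimate monthly telephony cost for an inbound voice agent.

    mobile_share is a hypothetical knob: the fraction of inbound minutes
    billed at the mobile rate instead of the landline rate.
    """
    minutes = calls * minutes_per_call
    inbound_rate = ((1 - mobile_share) * INBOUND_LANDLINE_PER_MIN
                    + mobile_share * INBOUND_MOBILE_PER_MIN)
    inbound = minutes * inbound_rate
    streams = minutes * MEDIA_STREAMS_PER_MIN
    return round(NUMBER_RENT_PER_MONTH + inbound + streams, 2)


landline_only = monthly_telephony_cost(100, 3)                  # 9.75
mobile_only = monthly_telephony_cost(100, 3, mobile_share=1.0)  # 11.70
```

For 100 calls of 3 minutes, the sketch gives USD 9.75 at the pure landline rate and USD 11.70 at the pure mobile rate. The figure of about USD 4 for inbound minutes used above sits between the two, which corresponds to a mix weighted towards the mobile rate.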

How it works

The inbound call flow is conceptually simple. A caller dials the Twilio number. Twilio sends a webhook POST to the configured Voice URL of a server. The server replies with TwiML – an XML document with instructions like <Say>, <Play>, <Gather>, <Connect>.
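A webhook handler of this kind can be sketched with the Python standard library alone. The example below is a minimal sketch, not a production server: it assumes Twilio posts form-encoded fields such as From, To, and CallSid, and it answers with a <Connect><Stream> document. The stream URL and the function names are placeholders chosen for illustration.

```python
from urllib.parse import parse_qs
from xml.etree import ElementTree as ET

# Hypothetical WebSocket endpoint; replace with your own Media Streams server.
STREAM_URL = "wss://my-app.example.com/voice-stream"


def build_twiml(caller: str) -> bytes:
    """Build a <Response><Connect><Stream> document for one call."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    stream = ET.SubElement(connect, "Stream", url=STREAM_URL)
    # Pass the caller's number along as a custom stream parameter.
    ET.SubElement(stream, "Parameter", name="caller", value=caller)
    return ET.tostring(response, encoding="utf-8", xml_declaration=True)


def voice_webhook(environ, start_response):
    """WSGI app: read the form-encoded webhook POST and answer with TwiML."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    form = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
    caller = form.get("From", ["unknown"])[0]
    body = build_twiml(caller)
    start_response("200 OK", [("Content-Type", "text/xml"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

For a local test, `wsgiref.simple_server.make_server("", 8000, voice_webhook)` can serve the app. A real deployment would sit behind HTTPS and should also validate that the request really comes from Twilio.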

Example TwiML for a live voice agent with Media Streams:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <Stream url="wss://my-app.example.com/voice-stream">
      <Parameter name="caller" value="{{From}}"/>
    </Stream>
  </Connect>
</Response>

This makes Twilio open a WebSocket stream to wss://my-app.example.com/voice-stream and send audio frames (u-law PCM, 8 kHz) and event frames. An Express or FastAPI server takes the stream, forwards it to Whisper, builds the LLM reply, generates TTS with ElevenLabs, and sends the resulting audio back as stream frames to Twilio. Twilio plays the audio reply to the caller.
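The frame handling on the server side can be sketched as follows. This is a minimal sketch that leaves out the WebSocket server itself and the STT, LLM, and TTS calls. It assumes the JSON message layout of Media Streams as commonly documented (events `start`, `media`, `stop`, with base64-encoded mu-law audio in `media.payload`); verify the field names against the current Twilio documentation. The function names and the `state` dictionary are illustrative.

```python
import base64
import json


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def handle_message(raw: str, state: dict) -> list[int]:
    """Process one JSON text frame; return decoded PCM samples, if any."""
    msg = json.loads(raw)
    event = msg.get("event")
    if event == "start":
        # Remember the stream id; it is needed to send audio back.
        state["stream_sid"] = msg["start"]["streamSid"]
        state["params"] = msg["start"].get("customParameters", {})
    elif event == "media":
        payload = base64.b64decode(msg["media"]["payload"])
        return [ulaw_to_pcm16(b) for b in payload]
    elif event == "stop":
        state["stopped"] = True
    return []


def outbound_media(stream_sid: str, ulaw_audio: bytes) -> str:
    """Wrap mu-law audio (e.g. converted TTS output) as a media frame."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(ulaw_audio).decode("ascii")},
    })
```

The decoded 16-bit samples are what most STT engines expect after resampling from 8 kHz. In the other direction, TTS output has to be converted to 8 kHz mu-law before it is wrapped with `outbound_media`.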

For outbound calls (cold outreach, reminders): POST /Calls with From, To, and Url. Twilio dials the target number, fetches the Url once the call connects, and follows the returned TwiML – the same mechanism as inbound.
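As an illustration, the request for an outbound call can be assembled with the standard library. This is a sketch under stated assumptions: it uses the classic REST endpoint `https://api.twilio.com/2010-04-01/Accounts/{AccountSid}/Calls.json` with HTTP Basic authentication, and it only builds the request object without sending it. The function name and all credentials and numbers are placeholders.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.twilio.com/2010-04-01"


def build_call_request(account_sid: str, auth_token: str,
                       from_number: str, to_number: str,
                       twiml_url: str) -> Request:
    """Build (but do not send) the POST that starts an outbound call."""
    form = urlencode({"From": from_number, "To": to_number,
                      "Url": twiml_url}).encode("ascii")
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")).decode("ascii")
    return Request(
        f"{API_BASE}/Accounts/{account_sid}/Calls.json",
        data=form,
        method="POST",
        headers={"Authorization": f"Basic {credentials}",
                 "Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the result to `urllib.request.urlopen` would place the call. In practice the official Twilio helper library wraps the same endpoint, so this sketch mainly shows what goes over the wire.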

SMS send: POST /Messages with From, To, Body. Replies arrive as a webhook on the Messaging URL, carrying the incoming text in plaintext plus the sender's phone number. Multi-turn conversations need their own session logic (e.g. Redis for state per phone number).
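The session logic for multi-turn SMS can be sketched like this. It is a minimal sketch: a plain dictionary stands in for Redis, the webhook body is assumed to be form-encoded with From and Body fields, and the three-step appointment dialogue is invented purely for illustration.

```python
from urllib.parse import parse_qs

# In-memory stand-in for Redis: conversation state keyed by sender number.
sessions: dict[str, dict] = {}


def handle_inbound_sms(form_body: str) -> str:
    """Advance the conversation for one inbound SMS and return the reply."""
    form = parse_qs(form_body)
    sender = form["From"][0]
    text = form.get("Body", [""])[0].strip()
    session = sessions.setdefault(sender, {"step": "new"})

    if session["step"] == "new":
        session["step"] = "awaiting_date"
        return "Which date suits you for the appointment?"
    if session["step"] == "awaiting_date":
        session["date"] = text
        session["step"] = "awaiting_confirmation"
        return f"Noted: {text}. Reply YES to confirm."
    if text.upper() == "YES":
        # Conversation finished; drop the state for this number.
        sessions.pop(sender, None)
        return "Confirmed. See you then."
    return "Please reply YES to confirm."
```

With Redis, the dictionary lookups would become reads and writes under a key per phone number, ideally with an expiry so abandoned conversations do not linger.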

WhatsApp Business API via Twilio: a similar interface to SMS, with a sandbox phase for tests, followed by Meta's approval of the phone number ID. Twilio submits the application to Meta. After approval, additional message types become available (template messages, buttons, lists).

The Studio visual builder lets you build voice and messaging flows via drag and drop – faster than code for simple IVR structures, inflexible for complex logic. In May 2026 professionals write code and use Studio for prototypes.

Monitoring and compliance: Twilio Insights offers real-time metrics (call volume, failure rate, latency). Twilio Recordings (call recording) is optional – for professional-secrecy data, carefully check whether recording is allowed. Twilio Trust Hub allows caller-ID verification and A2P-10DLC registration in the US.

Twilio setup in 5 steps

  1. Sign up at twilio.com, activate the EU region (Ireland or Frankfurt) in account settings, review the DPA from the Master Service Agreement, and set up Trust Hub if caller-ID verification is needed.
  2. Buy a Swiss number via console or REST API (POST /IncomingPhoneNumbers, about USD 6/month). Point the Voice and Messaging webhook URLs to your own endpoints.
  3. Write a TwiML server: the incoming webhook receives a POST with From/To/CallSid and responds with XML instructions. For a voice agent: <Connect><Stream url="wss://..."/></Connect>.
  4. Media Streams endpoint: a WebSocket server (Express + ws library) takes the audio, forwards it to Whisper/Deepgram, builds the LLM reply, and sends TTS back as stream frames. Latency target under 1.5 seconds.
  5. Monitoring and cost: Twilio Insights for real-time metrics, a Telegram alarm at a failure rate above 2 percent, and a budget cap in the account for cost control (e.g. a USD 100/day limit).
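The alerting rule from the last step can be sketched as a small check that runs over daily counters. This is a minimal sketch: the 2 percent threshold and the USD 100/day cap come from the list above, while the `DailyStats` structure, the minimum call count, and the message texts are assumptions. How the counters are collected and how the alert reaches Telegram is left out.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 0.02   # 2 percent, as in the setup steps
DAILY_BUDGET_USD = 100.0   # example cap, as in the setup steps
MIN_CALLS = 20             # assumption: below this the rate is too noisy


@dataclass
class DailyStats:
    total_calls: int
    failed_calls: int
    spend_usd: float


def alerts_for(stats: DailyStats) -> list[str]:
    """Return human-readable alerts for one day of call statistics."""
    alerts = []
    if stats.total_calls >= MIN_CALLS:
        rate = stats.failed_calls / stats.total_calls
        if rate > FAILURE_THRESHOLD:
            alerts.append(
                f"failure rate {rate:.1%} above {FAILURE_THRESHOLD:.0%}")
    if stats.spend_usd >= DAILY_BUDGET_USD:
        alerts.append(
            f"daily spend USD {stats.spend_usd:.2f} reached budget cap")
    return alerts
```

Each returned string would be forwarded to the alert channel. Requiring a minimum number of calls avoids paging someone because one of three calls failed on a quiet day.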

When to use Twilio

Twilio in May 2026 is the default choice for programmable telephony and messaging in the SME segment. Concrete cases: a voice agent for inbound client calls at a fiduciary with Swiss German recognition – Twilio Media Streams plus Whisper pipeline. An SMS reminder automation for appointment confirmations – Twilio Messaging with webhook integration to the practice software. An international sales team needs local numbers in 5 countries for outbound calls – Twilio Numbers in those countries, one codebase.

For WhatsApp business bots, Twilio is the second standard path alongside the direct Meta Cloud API. Twilio adds an abstraction layer and simplifies multi-channel setups (WhatsApp + SMS + Voice on one platform). The direct path via Meta is cheaper for pure WhatsApp; Twilio makes more sense for a multi-channel strategy.

For 2FA and identity verification, Twilio Verify is the easiest solution: POST with a phone number, and Twilio sends an OTP via SMS or voice and validates the code the user enters. It is also a proven building block for FINMA-supervised industries and AML onboarding.
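The two-step Verify flow can be sketched as a pair of request builders. This is a sketch under stated assumptions: it targets the Verify v2 endpoints `Verifications` and `VerificationCheck` under `https://verify.twilio.com/v2/Services/{ServiceSid}`, uses HTTP Basic authentication, and only constructs the requests without sending them. The helper names and all identifiers passed in are placeholders.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

VERIFY_BASE = "https://verify.twilio.com/v2"


def _verify_request(account_sid: str, auth_token: str, service_sid: str,
                    resource: str, fields: dict) -> Request:
    """Build a form-encoded POST against one Verify service resource."""
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")).decode("ascii")
    return Request(
        f"{VERIFY_BASE}/Services/{service_sid}/{resource}",
        data=urlencode(fields).encode("ascii"),
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


def start_verification(account_sid, auth_token, service_sid,
                       phone: str, channel: str = "sms") -> Request:
    """Step 1: ask Twilio to send an OTP to the phone number."""
    return _verify_request(account_sid, auth_token, service_sid,
                           "Verifications", {"To": phone, "Channel": channel})


def check_verification(account_sid, auth_token, service_sid,
                       phone: str, code: str) -> Request:
    """Step 2: submit the code the user typed in for validation."""
    return _verify_request(account_sid, auth_token, service_sid,
                           "VerificationCheck", {"To": phone, "Code": code})
```

The application never sees or stores the OTP itself; it only relays the user's input in the second step and reads the verification status from the response.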

When not to use

For pure Swiss domestic telephony at high volume (more than 10,000 minutes/month), local SIP trunk providers (Sipgate, JustVoIP CH) are cheaper – Twilio's USD 0.0085/min inbound is not the cheapest at volume. At very high volume a self-managed SIP trunk is also economical.

For very simple IVR applications without an AI component (classic "press 1 for ...") Twilio is overkill – a classic PBX or a simple SIP provider does the job. Twilio makes sense only when programmable logic and code control matter.

For strict on-premise requirements (defence, high security) Twilio is not suitable – cloud providers are off-limits there. Here you need Asterisk or FreePBX setups on your own hardware with your own SIP trunk.

For pure marketing email campaigns, Twilio is not first choice – SendGrid (Twilio sub-brand) is okay, but Brevo or Mailgun often have better CH/EU deliverability.

For industries under FINMA supervision with special recording duties, verify before using Twilio whether Twilio EU meets the specific audit requirements (e.g. audit-grade recording storage). Twilio Recordings are not WORM – anyone needing that must add external archival.

Trade-offs

STRENGTHS

  • API-first, voice agent live in 2-3 days instead of weeks with SIP trunks
  • EU region (Ireland, Frankfurt) with DPA for CH/EU data residency
  • Media Streams enables Swiss German voice agents with custom STT pipeline
  • Multi-channel platform (voice, SMS, WhatsApp, email) in one account

WEAKNESSES

  • Proprietary, no SME-affordable self-host
  • Per-minute price at high volume more expensive than local SIP trunks
  • Recordings not WORM – extra solution needed for FINMA audit requirements
  • Built-in STT does not recognise Swiss German – own pipeline required

FAQ

Is Twilio revDSG-compliant for CH fiduciaries?

With the EU region (Ireland or Frankfurt) and a DPA: yes, in most cases. Audio recordings and call metadata stay in EU data centres. For professional-secrecy data (Art. 321 StGB), obtaining client consent to electronic communications processing is recommended. Where FINMA recording duties apply, plan an additional audit-grade storage solution – Twilio Recordings are not WORM.

How much does a voice agent with Twilio cost per month?

Example CH fiduciary with 100 calls at 3 minutes each: USD 6 (number) + about USD 4 (300 inbound minutes at a mix of landline and mobile rates; USD 2.55 at the pure landline rate) + USD 1.20 (Media Streams) = about USD 11.20 for telephony. Plus LLM (USD 5-15) and TTS (USD 5-30 depending on provider). Total USD 20-60/month for 100 client calls.

Twilio Media Streams or Twilio Speech Recognition?

For CH applications with Swiss German always Media Streams plus own Whisper pipeline – Twilio built-in STT (Google-based) does not recognise Swiss German. Twilio Speech Recognition is enough only for English and High German without dialect.

WhatsApp via Twilio or directly from Meta?

Direct from Meta (Cloud API) for pure WhatsApp use, cheaper and with less latency. Via Twilio for multi-channel strategy (WhatsApp + SMS + Voice in one application) or when the team already uses Twilio. Both need Meta approval for the phone number ID.
