Sotto
voice AI that answers UK restaurant phones.

.NET 10. Groq Llama 4 Scout. Whisper Large v3 Turbo. Deepgram Aura 2. Engineered to a sub-500 millisecond time-to-first-audio budget, with the UK Big 14 allergens enforced as a mandatory conversation state.

.NET 10 projects: 29
Unit tests: 609
Voice latency budget: <500 ms
Regulatory focus: UK
Production-ready, in the UK restaurant pilot stage. Named-customer references published only with written permission.
What is the Sotto case study

A UK restaurant voice AI, documented stage by stage.

The Sotto case study shows how a phone call placed to a UK restaurant becomes a confirmed POS order without a human taking the call. The codebase is a 29-project .NET 10 monorepo that follows Clean Architecture, with 609 unit tests and a published sub-500-millisecond time-to-first-audio budget.

The vertical is UK restaurants specifically. That choice determines the regulatory shape (Big 14 allergens enforced as a conversation state, VAT as integer pence, GDPR per-tenant retention, HMRC 7-year financial retention) and the integrations (Square UK, Toast, Clover, Stripe UK, Uber Direct, Stuart).

The Challenge

Restaurants miss calls and lose orders.

Phone orders still drive a meaningful share of restaurant revenue. During peak hours those calls go unanswered. Staff are stretched thin, language barriers frustrate customers, and every missed call is a lost order.

Missed calls during peak hours

Staff cannot answer every call when the kitchen is slammed. Callers hang up and order elsewhere.

Allergen liability on a busy line

Since Natasha's Law, allergen diligence is existential for UK food businesses. A rushed human can skip the question; a state machine cannot.

Front-of-house labour scarcity

Hiring and retaining phone-capable staff is harder than ever. Wages rise while margins shrink.

Latency budget

Sub-500 milliseconds, stage by stage.

Voice AI has zero tolerance for delay. A one-second pause feels like an eternity on a phone call. Sotto's budget is broken down per stage rather than quoted as a round-number marketing figure. The sum across all ten stages is 470 milliseconds, which leaves 30 milliseconds under the 500-millisecond ceiling.

Stage                             Kind      ms
Twilio PSTN ingress               Carrier   50
WebSocket ingress (VoiceGateway)  Internal  5
VAD plus buffer drain             Internal  5
Mu-law to PCM (8 to 16 kHz)       Internal  5
Groq Whisper Large v3 Turbo       Model     150
gRPC to orchestrator              Internal  5
pgvector RAG search               Internal  15
Groq Llama 4 Scout 17B TTFT       Model     50
Deepgram Aura 2 TTFA              Model     130
WebSocket egress and PSTN         Carrier   55
Sum                                         470
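As a quick check on the arithmetic, the per-stage figures in the table above can be summed and grouped by kind. This is only a sketch of the budget as published: the numbers are copied from the table, while the names `BUDGET_MS`, `total_ms`, and `by_kind` are illustrative and not taken from the Sotto codebase.

```python
# Stage budgets in milliseconds, copied from the published table.
BUDGET_MS = [
    ("Twilio PSTN ingress", "Carrier", 50),
    ("WebSocket ingress (VoiceGateway)", "Internal", 5),
    ("VAD plus buffer drain", "Internal", 5),
    ("Mu-law to PCM (8 to 16 kHz)", "Internal", 5),
    ("Groq Whisper Large v3 Turbo", "Model", 150),
    ("gRPC to orchestrator", "Internal", 5),
    ("pgvector RAG search", "Internal", 15),
    ("Groq Llama 4 Scout 17B TTFT", "Model", 50),
    ("Deepgram Aura 2 TTFA", "Model", 130),
    ("WebSocket egress and PSTN", "Carrier", 55),
]

CEILING_MS = 500  # the sub-500 ms time-to-first-audio target


def total_ms(budget):
    """Sum the millisecond column across every stage."""
    return sum(ms for _, _, ms in budget)


def by_kind(budget):
    """Group the budget by the Kind column (Carrier, Internal, Model)."""
    totals = {}
    for _, kind, ms in budget:
        totals[kind] = totals.get(kind, 0) + ms
    return totals


if __name__ == "__main__":
    print(total_ms(BUDGET_MS), by_kind(BUDGET_MS))
```

Grouped this way, the three model stages account for 330 of the 470 milliseconds, the two carrier legs for 105, and the internal hops for 35.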

Integration surface

Self-contained by design, POS-native by default.

Sotto is the deliberately self-contained product in the KaritKarma catalog: it sells standalone to UK restaurants, so it integrates the market's native stack directly and runs its own auth. Domain code focuses on voice and ordering.

Twilio

Voice and SMS

Media Streams WebSocket ingress for live call audio, plus a 2-way SMS conversation pipeline and SMS payment links through the same conversation engine.

Groq + Deepgram

AI pipeline

Whisper Large v3 Turbo transcription and Llama 4 Scout reasoning on Groq; Deepgram Aura 2 speech with a streaming token bridge and barge-in support.
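The streaming token bridge is described elsewhere in this case study as batching LLM tokens at sentence boundaries before handing them to speech synthesis. A minimal sketch of that batching idea follows. It assumes a sentence ends at `.`, `!`, or `?` followed by whitespace, which is a simplification chosen for illustration; the function name `batch_sentences` is hypothetical, and the sketch makes no calls to Groq or Deepgram.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: terminal punctuation followed by whitespace.
# A decimal such as "12.50" does not match, because no whitespace follows the dot.
_SENTENCE_END = re.compile(r"[.!?]\s")


def batch_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as the token stream finishes them."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            end = match.end()
            yield buffer[:end].strip()
            buffer = buffer[end:]
    # Flush whatever is left once the stream ends.
    tail = buffer.strip()
    if tail:
        yield tail


if __name__ == "__main__":
    stream = ["Sure", ".", " One", " margherita", ".", " Any", " allergies", "?"]
    for sentence in batch_sentences(stream):
        print(sentence)
```

Because each sentence is yielded as soon as it completes, a downstream synthesiser could begin speaking the first sentence while later tokens are still arriving.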

Square, Toast, Clover

POS write-back

Square UK, Toast (REST v2.5), and Clover (v3) connectors behind one common interface. The confirmed order lands on the POS the kitchen already runs.
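One way to picture "connectors behind one common interface" is a small protocol that every POS connector satisfies, so the ordering code never branches on the vendor. The sketch below is an assumption about shape only: `PosConnector`, `OrderLine`, `InMemoryPos`, and `write_back` are hypothetical names, and the stand-in connector does not call any Square, Toast, or Clover API.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class OrderLine:
    name: str
    quantity: int
    unit_price_pence: int  # money kept as integer pence


class PosConnector(Protocol):
    """Hypothetical common interface: one method, one POS reference back."""

    def submit_order(self, lines: List[OrderLine]) -> str: ...


class InMemoryPos:
    """Stand-in connector for illustration; a real one would call a vendor API."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.orders: List[List[OrderLine]] = []

    def submit_order(self, lines: List[OrderLine]) -> str:
        self.orders.append(list(lines))
        return f"{self.prefix}-{len(self.orders)}"


def write_back(connector: PosConnector, lines: List[OrderLine]) -> str:
    """Send a confirmed order to whichever POS the tenant runs."""
    if not lines:
        raise ValueError("cannot write back an empty order")
    return connector.submit_order(lines)


if __name__ == "__main__":
    pos = InMemoryPos("square")
    print(write_back(pos, [OrderLine("Margherita", 2, 1150)]))
```

The calling code depends only on the protocol, so adding a further POS would mean adding one class rather than touching the order flow.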

Stripe UK

Payments

Checkout sessions, payment links, and SMS payment links in integer pence, with Uber Direct and Stuart dispatchers for delivery. Dashboard auth is Sotto's own NextAuth plus JWT.

UK regulatory shape

Big 14, integer pence, GDPR on a schedule.

UK rules shape the conversation machine, the money math, and the retention policy. None of this is bolted on; each one is a first-class concern in the codebase.

UK Big 14 allergens, enforced

AllergenCheck is a mandatory state in the conversation machine, not a checkbox. The AI enumerates allergens per item and asks about caller allergies before any order can confirm.
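A mandatory state can be enforced with a transition table in which the only edge into order review comes from the allergen check. The sketch below illustrates that idea under stated assumptions: the five states and their names are hypothetical (the real machine is described as having eight states, which are not listed here), and `Conversation` and `ALLOWED` are illustrative names.

```python
from enum import Enum, auto


class State(Enum):
    GREETING = auto()
    TAKING_ORDER = auto()
    ALLERGEN_CHECK = auto()
    ORDER_REVIEW = auto()
    CONFIRMED = auto()


# Assumed transition table. ORDER_REVIEW is reachable only from ALLERGEN_CHECK,
# and editing the order sends the caller back through the check.
ALLOWED = {
    State.GREETING: {State.TAKING_ORDER},
    State.TAKING_ORDER: {State.ALLERGEN_CHECK},
    State.ALLERGEN_CHECK: {State.TAKING_ORDER, State.ORDER_REVIEW},
    State.ORDER_REVIEW: {State.TAKING_ORDER, State.CONFIRMED},
    State.CONFIRMED: set(),
}


class Conversation:
    def __init__(self) -> None:
        self.state = State.GREETING
        self.history = [State.GREETING]

    def advance(self, target: State) -> None:
        """Move to target, or raise if the table does not allow it."""
        if target not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {target.name}"
            )
        self.state = target
        self.history.append(target)


if __name__ == "__main__":
    call = Conversation()
    call.advance(State.TAKING_ORDER)
    try:
        call.advance(State.ORDER_REVIEW)  # skipping the allergen check
    except ValueError as error:
        print(error)
```

With the rule expressed as data, a test can assert that no path reaches confirmation without passing through the allergen state.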

VAT in integer pence

Standard 20 percent, Zero 0 percent, Reduced 5 percent. Rates are stored as basis points and money values as integer pence, so VAT never drifts by a rounding penny.
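A worked example of the integer arithmetic: 20 percent is 2000 basis points, 5 percent is 500, and VAT on a line is the net amount in pence times the rate, divided by 10,000. The sketch below assumes prices are held net of VAT and that halves round up; both are illustrative choices, not statements about Sotto's rounding policy, and the function names are hypothetical.

```python
# The three rates from the text, expressed in basis points (1 bp = 0.01 percent).
VAT_RATES_BP = {"standard": 2000, "reduced": 500, "zero": 0}


def vat_pence(net_pence: int, rate_bp: int) -> int:
    """VAT in whole pence, rounding half up with integer arithmetic only."""
    return (net_pence * rate_bp + 5000) // 10000


def line_total(unit_net_pence: int, quantity: int, rate: str):
    """Return (net, vat, gross) in pence for one order line."""
    net = unit_net_pence * quantity
    vat = vat_pence(net, VAT_RATES_BP[rate])
    return net, vat, net + vat


if __name__ == "__main__":
    # 10.99 net at the standard rate: 1099 * 2000 / 10000 = 219.8, rounded to 220.
    print(line_total(1099, 1, "standard"))
```

No floating-point value appears anywhere in the calculation, so the same inputs always produce the same pence.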

GDPR on a daily schedule

Per-tenant retention (default 365 days). Daily 02:00 UTC purge. Right-to-erasure anonymises calls, customers, and transcripts in one transaction. HMRC 7-year financial retention is preserved.
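The scheduling and cutoff logic behind a daily purge can be sketched in a few lines. The example below assumes two record kinds, personal call data governed by the tenant's retention period and financial records held for at least seven years; treating seven years as 7 × 365 days is an approximation, and every name here (`next_purge_run`, `is_purgeable`, the `kind` strings) is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Approximation of the HMRC 7-year period; ignores leap days.
HMRC_RETENTION = timedelta(days=7 * 365)


def next_purge_run(now: datetime, hour_utc: int = 2) -> datetime:
    """Next occurrence of the purge hour (02:00 UTC by default) after now."""
    candidate = now.replace(hour=hour_utc, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def is_purgeable(
    kind: str,
    created_at: datetime,
    now: datetime,
    tenant_retention_days: int = 365,
) -> bool:
    """Decide whether a record has outlived its retention period."""
    age = now - created_at
    if kind == "financial":
        # Assumed rule: financial records are never purged before seven years.
        return age > HMRC_RETENTION
    return age > timedelta(days=tenant_retention_days)


if __name__ == "__main__":
    now = datetime(2025, 6, 1, 3, 0, tzinfo=timezone.utc)
    print(next_purge_run(now))
    print(is_purgeable("call", datetime(2024, 5, 1, tzinfo=timezone.utc), now))
```

Keeping the financial rule separate from the tenant setting means a short per-tenant retention period cannot shorten the period for order and payment records.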

Sotto vs the alternatives

Phone call. POS write. Done.

Versus a human server with a notepad, a touch-tone IVR, or a chatbot on the website, here is what the architecture does differently.

Capability: Sotto, Human staff, Touch-tone IVR, Web chatbot
Sub-500ms first-audio budget: Variable; Web only
Available 24/7
Natural phone conversation: Text
Big 14 allergen enforcement: Mandatory state; Hopefully
Writes to your POS: Square / Toast / Clover; Manual; Limited
Reads full menu accurately: pgvector RAG; Memory; Tone tree
GDPR retention per tenant: Varies

What ships today

Production-ready, in active UK pilots.

Answers every call, 24/7

The AI picks up around the clock, with caller spam protection (hourly and daily rate limits plus risk scoring) built in.

2-way SMS ordering

The same conversation engine answers text messages, and Stripe payment links are delivered by SMS.

Front-of-house freed for service

Staff focus on hospitality and table service instead of phones during peak.

Sub-500 millisecond voice budget

A published stage-by-stage engineering budget, made realistic by the streaming token bridge to Aura 2 with barge-in.

GDPR on a schedule

Per-tenant retention with a scheduled purge (02:00 UTC default, configurable) and a right-to-erasure anonymiser that keeps HMRC records intact.

Direct market-native integrations

Stripe UK, Square / Toast / Clover, Twilio, Groq, Deepgram, Uber Direct, Stuart. Auth is Sotto's own NextAuth plus JWT.

Frequently asked

Sotto, asked plainly.

What is the Sotto case study?
The Sotto case study documents how a UK-focused voice AI takes restaurant phone orders end-to-end. Sotto is built as a .NET 10 monorepo of 29 projects with Clean Architecture and 609 unit tests plus 29 Testcontainers integration tests. The voice path runs on Groq Llama 4 Scout 17B for reasoning, Whisper Large v3 Turbo for transcription, and Deepgram Aura 2 for speech, engineered to a sub-500 millisecond time-to-first-audio budget. The case study covers the latency budget, the eight-state conversation machine (with mandatory AllergenCheck for the UK Big 14), POS integrations (Square UK, Toast, Clover), payments (Stripe UK in integer pence), and per-tenant GDPR retention.
Is Sotto live in production?
Sotto is built and tested as a production-ready platform with 609 unit tests covering the voice pipeline, order flow, POS connectors, and payment links, all re-verified green at the date of this revision. It is in the UK restaurant pilot stage. We do not promote pilots to general-availability claims, so we label Sotto as production-ready in pilots rather than as a multi-customer SaaS roster. Named-customer references are added only with written permission.
How is the sub-500 millisecond voice budget engineered?
The figure is a published stage-by-stage engineering budget, not a measured production percentile, and we label it that way. The design that makes the budget realistic: a streaming token bridge batches LLM tokens at sentence boundaries and ships them to Deepgram Aura 2 TTS before the full Llama 4 Scout response is generated, hiding time-to-first-token behind time-to-first-audio, with TTS barge-in supported. Voice activity detection runs at a 50 RMS energy threshold with a 700 millisecond silence cutoff, and Whisper Large v3 Turbo on Groq is budgeted at 150 milliseconds for transcription, leaving headroom for the carrier legs.
Does Sotto use the KaritKarma platform services?
No, and we say so plainly: Sotto is the deliberately self-contained product in the catalog. As a UK restaurant platform sold standalone, it runs its own NextAuth plus JWT authentication in the merchant dashboard and integrates its market's native stack directly: Twilio Media Streams for voice and 2-way SMS, Groq for Whisper transcription and Llama 4 Scout reasoning, Deepgram Aura 2 for speech, Square UK, Toast, and Clover POS connectors behind a common connector interface, Stripe UK for checkout sessions, payment links, and SMS payment links, and Uber Direct plus Stuart for delivery dispatch. KaritKarma platform services are a fit where customers share our ecosystem; Sotto's UK buyers do not, so it does not pretend otherwise.
Is Sotto compliant with UK food and data regulations?
Yes. AllergenCheck is a mandatory state in the conversation machine, enforced in code so the only path to order review runs through it. VAT is calculated per item in basis points (Standard 20 percent, Zero 0 percent, Reduced 5 percent) and stored as integer pence, so totals never drift by a rounding penny. GDPR retention is per-tenant configurable (default 365 days, conversations purged after 90 days by default) with a scheduled purge at a configurable hour (02:00 UTC default) and a right-to-erasure anonymiser that preserves orders and payments for the HMRC 7-year retention rule.
Where does Sotto run and what is the deployment model?
Sotto deploys in a UK-region envelope for data-residency reasons. Telephony runs on Twilio Media Streams with webhook routing. The voice plane is five .NET 10 services in 4-layer clean architecture, with RabbitMQ 4 over MassTransit for messaging. The application database is PostgreSQL 18 with pgvector for menu RAG (bge-large-en-v1.5 embeddings over an HNSW index), Redis 8 handles ephemeral state, and the stack ships with OpenTelemetry, Jaeger, Prometheus, and Grafana wired in. Production routing is Traefik with Let's Encrypt TLS per host.
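The voice activity detection settings quoted in the latency answer above (a 50 RMS energy threshold and a 700 millisecond silence cutoff) can be illustrated with a small end-of-utterance detector. This is a sketch under assumptions the case study does not state: frames are 20 milliseconds of 16-bit little-endian PCM, the threshold is on the raw int16 scale, and a frame at or above the threshold counts as speech. The names `frame_rms` and `utterance_end_frame` are hypothetical.

```python
import math
import struct
from typing import Iterable, Optional

RMS_THRESHOLD = 50       # energy threshold from the text
SILENCE_CUTOFF_MS = 700  # silence cutoff from the text
FRAME_MS = 20            # assumed frame duration


def frame_rms(pcm: bytes) -> float:
    """Root-mean-square energy of a frame of 16-bit little-endian samples."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def utterance_end_frame(frames: Iterable[bytes]) -> Optional[int]:
    """Index of the frame where the silence cutoff is reached after speech."""
    heard_speech = False
    silent_ms = 0
    for index, frame in enumerate(frames):
        if frame_rms(frame) >= RMS_THRESHOLD:
            heard_speech = True
            silent_ms = 0
        elif heard_speech:
            silent_ms += FRAME_MS
            if silent_ms >= SILENCE_CUTOFF_MS:
                return index
    return None


if __name__ == "__main__":
    loud = struct.pack("<160h", *([1000] * 160))
    quiet = struct.pack("<160h", *([0] * 160))
    print(utterance_end_frame([quiet] * 5 + [loud] * 10 + [quiet] * 40))
```

Leading silence is ignored until speech is heard, so the cutoff only fires at the end of an utterance and not while the caller is still thinking before speaking.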

Explore Sotto

Voice AI that never misses a call.

See how Sotto answers, takes the order, enforces allergens, and writes to Square, Toast, or Clover before the caller hangs up.