Speechmatics

Name: Speechmatics
Rating: 8.1 (1 reviews)

🇬🇧

Cambridge speech AI with 55+ language transcription and voice agent API

8.1/10

European, GDPR, EU Data, Free Tier

Review by EuropeanStack Editorial. Updated May 2026. Verified May 2026.

Bottom Line

Speechmatics has spent two decades building what Deepgram and AssemblyAI built in four: an ASR system that handles human speech in its actual, messy, multilingual reality. The addition of Flow as a voice agent API in 2024 transformed the product from a transcription service into a platform. On-premise deployment, ISO 27001 certification, and genuine 55+ language accuracy make it the strongest European option for enterprise speech AI. The limitations are real — it is a developer product with no consumer interface, mid-range per-hour pricing for English-only workloads, and a Pro tier cap that requires Enterprise conversations sooner than some teams expect. For the use cases where Speechmatics is the right fit, the fit is very good.

Speechmatics is a Cambridge-based speech AI company providing automatic speech recognition, real-time transcription, and voice agent technology across 55+ languages and dialects. Its Flow API enables developers to build voice-interactive AI agents with sub-second latency and enterprise-grade accuracy.

Headquarters: Cambridge, United Kingdom
Founded: 2006
Pricing: Freemium
EU Data Hosting: Yes
Employees: 201-500

Tags: speech-to-text, voice-ai, transcription-api, real-time, multilingual, enterprise

Ratings

Ease of Use: 7.5
Feature Depth: 9.0
Value for Money: 7.5
EU Compliance: 8.5
Support Quality: 8.0
Integration Ecosystem: 7.5

Features

Core Features

✓Automatic speech recognition (ASR) via REST API
✓Real-time transcription via WebSocket
✓Batch transcription for pre-recorded audio
✓55+ languages and dialects
✓Speaker diarization (multi-speaker identification)
✓Custom vocabulary and dictionary support
✓Text-to-speech (TTS) API
✓Punctuation and formatting inference
✓Confidence scores per word
✓Profanity filtering and content redaction

Standout Features

★Flow voice agent API for real-time speech-to-speech AI interactions
★Enhanced and Standard model tiers for accuracy-vs-speed trade-off
★Medical speech-to-text with 93% accuracy (NVIDIA-powered, Sept 2025)
★On-premise and private cloud deployment options for data sovereignty
★LangGraph and MCP server integrations for AI agent workflows

Compliance

☖UK GDPR compliant
☖ISO 27001 certified
☖On-premise deployment available
☖Multi-region cloud hosting
☖Data processing agreements available

Pricing

Free

480 minutes/month speech-to-text
1M characters/month text-to-speech
2 concurrent real-time sessions
55+ languages
No credit card required

Pro

Pay-as-you-go

From $0.24/hour pay-as-you-go
480 free minutes/month included
1M free TTS characters/month
50 concurrent real-time sessions
Up to 6,000 hours/month
20% discount above 500 hours/month

Enterprise

Contact Sales

No rate limits or monthly caps
Custom volume pricing
On-premise and private cloud deployment
Custom model and voice development
Dedicated Customer Success Manager
Solutions Engineer support

Billing: pay-as-you-go, annual

Integrations & API

LangChain, LangGraph, Python SDK, JavaScript / React SDK, MCP servers, Microsoft Azure, REST API, WebSocket

API Available, Webhook Support

Support

Email, Dedicated CSM, Documentation (Docs: Excellent), Community Forum

Pros

✓55+ languages and dialects including regional accents with industry-leading accuracy (93% in medical speech)
✓Flow voice agent API enables full speech-to-speech interactions with sub-second latency
✓On-premise deployment available for air-gapped and sovereign cloud requirements
✓480 free minutes per month plus 1M free text-to-speech characters — generous free tier for developers
✓ISO 27001 certified with UK GDPR compliance and multi-region cloud deployment

Cons

✕Developer-only product with no consumer-facing interface; requires API integration work
✕Pro tier capped at 6,000 hours/month — large-scale broadcast or media customers need Enterprise
✕Pricing at ~$0.24/hour is mid-range; Deepgram can be cheaper for English-only high-volume workloads
✕Language support quality varies by language tier — some regional dialects have lower accuracy than core languages

What Is Speechmatics?

The automatic speech recognition market is dominated by American companies. Deepgram, AssemblyAI, and OpenAI's Whisper API have captured developer mindshare with aggressive pricing and English-first accuracy. Speechmatics, founded in Cambridge in 2006, has quietly built an alternative — one with deeper language coverage, a voice agent API with no US equivalent in capability terms, and deployment options that keep data entirely within sovereign infrastructure.

The company traces its roots to Dr. Tony Robinson's research at Cambridge, where recurrent neural networks were first applied to speech recognition in the late 1980s. That academic lineage has shaped the company's technical culture: Speechmatics has consistently prioritised accuracy and language breadth over marketing spend. It registered as Speechmatics Limited (Companies House 07037524) and raised $90.6 million in total, including a $62 million Series B in June 2022 led by Susquehanna Growth Equity.

The customer base reflects this positioning. Broadcasters, healthcare systems, and enterprise software platforms rely on Speechmatics for production transcription. In 2025, AI-Media processed over 79 million caption minutes through Speechmatics infrastructure. The medical speech-to-text model, launched in September 2025, reached 93% general accuracy on real-world clinical audio — a benchmark that matters when transcription errors affect patient care.

Key Features

ASR with 55+ Languages and Dialects

Most speech-to-text providers support transcription in a handful of dominant languages and treat dialects as an afterthought. Speechmatics takes a different view. The 55+ language portfolio includes regional dialect variants that broader tools miss — accents that cause transcription failures in competitor systems produce consistent results in Speechmatics. A broadcaster covering local news in multiple European markets, or a call centre handling multilingual customer interactions, encounters fewer failure cases as a result.

Two model tiers — Enhanced and Standard — let developers trade accuracy for speed and cost. Enhanced delivers best-in-class accuracy; Standard processes faster at lower cost. For batch transcription of archival media, Standard may be sufficient. For real-time captioning where accuracy is broadcast-critical, Enhanced is the appropriate choice.
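As a rough sketch of how the tier choice might look in an integration, the snippet below builds a batch job configuration for either tier. It assumes the tier is selected through an `operating_point` field inside `transcription_config`, which should be verified against the current Speechmatics API documentation. The helper name `build_job_config` is hypothetical and is not part of any SDK.

```python
import json

# Tier names come from the review. The "operating_point" field name is an
# assumption about the API and should be checked against the official docs.
VALID_TIERS = {"standard", "enhanced"}


def build_job_config(language: str, tier: str = "enhanced") -> dict:
    """Return a batch transcription config for the given language and tier."""
    tier = tier.lower()
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {sorted(VALID_TIERS)}")
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "operating_point": tier,
        },
    }


if __name__ == "__main__":
    # Archival batch work, where Standard may be sufficient.
    print(json.dumps(build_job_config("de", "standard"), indent=2))
    # Broadcast-critical captioning, where Enhanced is the safer choice.
    print(json.dumps(build_job_config("en"), indent=2))
```

Keeping the tier as a single parameter makes it easy to run the same audio through both tiers and compare accuracy against cost before committing to one.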

Flow Voice Agent API

Flow is the product that positions Speechmatics beyond the transcription category. The API combines real-time ASR, a large language model, and text-to-speech into a complete speech-to-speech pipeline. Developers build voice-interactive AI agents — customer service bots, clinical documentation assistants, in-car interfaces — without assembling separate STT, LLM, and TTS services and handling the latency of passing data between them.

Flow handles the conversational complexity developers would otherwise need to build themselves: interruptions, multiple simultaneous speakers, background noise suppression, speaker-aware responses (addressing people by name, ignoring background voices). Python, React, and JavaScript SDKs are available. The API connects to LangGraph agents and MCP servers for teams building more complex AI orchestration pipelines.

The 2024 launch of Flow represents Speechmatics shifting from an ASR data provider to a voice AI platform. That distinction matters for developers evaluating long-term infrastructure choices.

Medical and Specialised Speech

In September 2025, Speechmatics demonstrated 93% accuracy on general medical speech-to-text, powered by NVIDIA hardware. Alongside that result, healthcare customers reported returning 30 million minutes to the clinical workforce through automated documentation. For health IT teams, the returned-minutes figure is the one most likely to unlock procurement decisions: it is a measured workflow impact rather than a marketing claim.

Specialised vocabulary support (custom dictionary and formatting preferences) extends this principle across other domains. Legal transcription, financial call recording, and technical support conversations all have vocabulary that generic models mishandle. Custom dictionary configuration addresses this without requiring a full custom model training engagement.
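To illustrate the custom dictionary idea, here is a minimal sketch that attaches domain terms to a transcription config. It assumes the dictionary is passed as an `additional_vocab` list whose entries carry a `content` string and an optional `sounds_like` list. Those field names are assumptions to check against the official documentation, and `build_vocab` and `with_custom_vocab` are hypothetical helpers.

```python
from __future__ import annotations


def build_vocab(terms: dict[str, list[str]]) -> list[dict]:
    """Turn {term: [pronunciation hints]} into custom-dictionary entries."""
    entries = []
    for content, sounds_like in terms.items():
        entry: dict = {"content": content}
        if sounds_like:
            # Only include hints when the caller supplied some.
            entry["sounds_like"] = list(sounds_like)
        entries.append(entry)
    return entries


def with_custom_vocab(config: dict, terms: dict[str, list[str]]) -> dict:
    """Return a copy of config with the custom dictionary attached."""
    merged = {**config, "transcription_config": dict(config.get("transcription_config", {}))}
    # "additional_vocab" is an assumed field name, not confirmed by the review.
    merged["transcription_config"]["additional_vocab"] = build_vocab(terms)
    return merged


if __name__ == "__main__":
    base = {"type": "transcription", "transcription_config": {"language": "en"}}
    legal_terms = {"estoppel": [], "voir dire": ["vwar deer"]}
    print(with_custom_vocab(base, legal_terms))
```

Because the helper copies the config rather than mutating it, one base configuration can be reused with separate legal, financial, or support vocabularies.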

On-Premise and Sovereign Deployment

This is the feature that separates Speechmatics from Deepgram and AssemblyAI at the enterprise level. Enterprise customers can deploy Speechmatics entirely within their own infrastructure, including air-gapped environments where data never travels to an external network. Multi-region cloud deployment options satisfy data residency requirements for organisations operating under EU, UK, or sector-specific data sovereignty mandates.

Deepgram does not offer comparable on-premise deployment. AssemblyAI is US-hosted by default. For a defence contractor, a national health service, or a financial institution operating under strict data localisation policies, the on-premise option shifts Speechmatics from "comparable product" to "only viable option."

Developer Experience

Speechmatics' documentation is comprehensive, well-maintained, and covers both basic REST API integration and complex voice agent architectures. The free tier provides 480 minutes of speech-to-text per month plus 1 million text-to-speech characters — enough for meaningful development and testing work before a paid commitment. No credit card is required to start.

The SDK coverage (Python, React, JavaScript) targets the languages most developers actually use for AI integration work. WebSocket support enables real-time streaming applications without polling.
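The streaming model can be sketched without opening a connection. The example below builds the opening message for a real-time session and extracts final transcript text from a sequence of server messages. The message names (`StartRecognition`, `AddPartialTranscript`, `AddTranscript`) and the `metadata.transcript` field are assumptions about the real-time protocol and should be confirmed in the official reference. Both function names are hypothetical.

```python
import json


def start_recognition(language: str, sample_rate: int = 16000) -> str:
    """Serialise the opening message that configures a real-time session."""
    return json.dumps({
        "message": "StartRecognition",  # assumed message name
        "audio_format": {
            "type": "raw",
            "encoding": "pcm_s16le",
            "sample_rate": sample_rate,
        },
        "transcription_config": {"language": language, "enable_partials": True},
    })


def collect_finals(raw_messages) -> str:
    """Join the text of final transcripts, skipping partials and other messages."""
    parts = []
    for raw in raw_messages:
        msg = json.loads(raw)
        if msg.get("message") == "AddTranscript":  # assumed name for finals
            parts.append(msg.get("metadata", {}).get("transcript", ""))
    return "".join(parts).strip()


if __name__ == "__main__":
    print(start_recognition("en"))
    simulated = [
        json.dumps({"message": "RecognitionStarted"}),
        json.dumps({"message": "AddPartialTranscript", "metadata": {"transcript": "Hel"}}),
        json.dumps({"message": "AddTranscript", "metadata": {"transcript": "Hello "}}),
        json.dumps({"message": "AddTranscript", "metadata": {"transcript": "world."}}),
    ]
    print(collect_finals(simulated))
```

In a real client these functions would sit on either side of a WebSocket loop: the start message is sent once, audio goes up as binary frames, and each incoming text frame is passed to a handler like `collect_finals` as it arrives, so no polling is needed.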

Pricing

Speechmatics operates a usage-based model after the free tier. Pro pricing starts at $0.24/hour, with a 20% discount applied automatically above 500 hours per month. The 480 free minutes per month persist on paid accounts, so small-volume users effectively receive a partial subsidy on their monthly bills.

At $0.24/hour, Speechmatics is mid-range in the ASR market. Deepgram's Nova-2 model can reach lower per-minute costs for English-only workloads at high volume. For multilingual workloads or voice agent development where Flow replaces three separate API subscriptions, the Speechmatics cost structure becomes competitive on a feature-adjusted basis.

Enterprise pricing is custom and includes volume discounts at 24,000+ hours annually, dedicated Customer Success Manager support, Solutions Engineer access, and on-premise deployment. The Pro tier is capped at 6,000 hours/month — organisations exceeding that threshold move to Enterprise automatically.
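The Pro figures above can be turned into a back-of-envelope estimator. This sketch makes two assumptions that the review does not confirm: the 480 free minutes (8 hours) are deducted before billing, and the 20% discount applies only to billable hours beyond the first 500 rather than to the whole bill. The $0.24 rate is the "from" price, so real invoices may differ. The function name `estimate_pro_cost` is hypothetical.

```python
FREE_HOURS = 480 / 60        # 480 free minutes per month = 8 hours
RATE_PER_HOUR = 0.24         # "from" rate quoted in the review, in USD
DISCOUNT_THRESHOLD = 500     # hours per month before the discount starts
DISCOUNT = 0.20              # 20% off
PRO_CAP_HOURS = 6000         # Pro tier monthly cap


def estimate_pro_cost(hours: float) -> float:
    """Estimate a monthly Pro bill in USD under the stated assumptions."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    if hours > PRO_CAP_HOURS:
        raise ValueError("above the Pro cap; this volume needs Enterprise pricing")
    # Assumption: free minutes are deducted first.
    billable = max(0.0, hours - FREE_HOURS)
    # Assumption: the discount is marginal, applied only past the threshold.
    full_price_hours = min(billable, DISCOUNT_THRESHOLD)
    discounted_hours = max(0.0, billable - DISCOUNT_THRESHOLD)
    total = (full_price_hours * RATE_PER_HOUR
             + discounted_hours * RATE_PER_HOUR * (1 - DISCOUNT))
    return round(total, 2)


if __name__ == "__main__":
    for monthly_hours in (8, 108, 1008, 6000):
        print(monthly_hours, estimate_pro_cost(monthly_hours))
```

If the discount turns out to apply to the entire bill once usage passes 500 hours, only the `total` calculation needs to change, so the estimate is easy to correct once the billing terms are confirmed.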

The Startup Program offers $50,000 in API credits with dedicated onboarding support, which is meaningful for well-funded startups building voice AI products on top of Speechmatics infrastructure.

EU Compliance and Privacy

Speechmatics is a UK company operating under UK GDPR, which directly mirrors EU GDPR requirements. ISO 27001 certification demonstrates a documented, independently audited security management system. Data processing agreements are available, satisfying the contractual requirements of regulated-sector procurement.

The on-premise deployment option is the strongest compliance feature in the catalogue. No other speech AI provider at this scale offers comparable sovereign deployment flexibility. For organisations processing healthcare audio, legal proceedings, or classified government communications, on-premise deployment eliminates the fundamental data sovereignty concern rather than mitigating it.

Post-Brexit, Speechmatics' UK jurisdiction may require additional assessment for EU-based organisations operating under strict data transfer regulations. The company offers multi-region cloud options, including EU-hosted deployment, which addresses this for most use cases.

Who It's Best For

If you are building voice-interactive AI agents and need a single API for the full speech-to-speech pipeline, Flow removes significant integration complexity. Deepgram and AssemblyAI offer ASR; Speechmatics offers the complete voice agent stack.

If your application requires multilingual transcription including regional dialects, Speechmatics' 55+ language portfolio outperforms competitors on coverage and consistency. English-first tools tend to fall short on European language breadth.

If data sovereignty is non-negotiable — defence, healthcare, government — on-premise deployment makes Speechmatics the only enterprise-grade option in the European speech AI market.

If you are a developer evaluating ASR APIs, the free tier's 480 minutes per month is among the most generous in the category and requires no credit card commitment.

Frequently Asked Questions

Is Speechmatics GDPR compliant?

Yes. Speechmatics operates under UK GDPR, which mirrors EU GDPR requirements. The company holds ISO 27001 certification and offers on-premise deployment for complete data sovereignty. EU-region cloud hosting is available for EU-based organisations with data transfer concerns.

What languages does Speechmatics support?

Speechmatics supports 55+ languages and dialects, including regional accent variants. In September 2025, its medical speech-to-text model achieved 93% accuracy on real-world clinical audio. Accuracy is highest in core languages (English, Spanish, German, French); regional variants and less common languages perform at varying accuracy levels.

What is the Flow API and who should use it?

Flow is a speech-to-speech API combining real-time ASR, LLM reasoning, and text-to-speech in a single pipeline. It is designed for developers building voice-interactive AI agents — customer service systems, clinical documentation tools, or any application requiring natural spoken conversation. Python, React, and JavaScript SDKs are available.

How does Speechmatics pricing compare to Deepgram?

Speechmatics Pro starts at $0.24/hour with a 20% discount above 500 hours/month. Deepgram's Nova-2 model can be cheaper for high-volume English-only transcription. For multilingual workloads or voice agent development (where Flow replaces three separate services), Speechmatics' feature-adjusted cost is competitive. The free tier of 480 minutes/month compares favourably to Deepgram's offerings.

Can Speechmatics be deployed on-premise?

Yes. Enterprise customers can deploy Speechmatics entirely within their own infrastructure, including air-gapped environments with no external network access. This option is not available on Deepgram or AssemblyAI at comparable scale, making Speechmatics the de facto choice for regulated industries with strict data sovereignty requirements.

Speechmatics is a European alternative to Deepgram, AssemblyAI, and OpenAI.
