
To demonstrate our performance under real-world conditions, Synthesys Research Labs tracked response time percentiles over a full week of live production deployment. The chart above displays P50, P90, and P99 response times from August 10 to August 17, 2025. These percentiles reflect a widely accepted approach for performance benchmarking. P50 represents the median, meaning half of all responses were faster than this threshold. P90 reflects consistency: 90% of responses fell under this value. P99 captures tail latency: only the slowest 1% of responses exceeded it, so it describes near-worst-case behavior that some callers still encounter. This percentile-based model is far more insightful than averages, which can obscure operational realities because a handful of very slow responses can skew the mean. As latency researchers note, “average response times can be misleading, especially in systems with high variability” (Medium).
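To make the difference between percentiles and averages concrete, here is a minimal sketch in Python. It uses the nearest-rank definition of a percentile, which is one common convention and not necessarily the one behind the chart. The sample latencies are invented for illustration and are not Synthesys data.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: nine fast responses, one outlier.
latencies_ms = [90, 95, 100, 105, 110, 115, 120, 125, 130, 1500]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms} p50={p50} p90={p90} p99={p99}")
# mean=249.0 p50=110 p90=130 p99=1500
```

In this invented sample, a single slow response pulls the mean to 249ms even though nine out of ten responses finished in 130ms or less. The percentiles keep the typical case (P50, P90) and the outlier (P99) separate.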
Over the course of the observed week, Synthesys consistently maintained a median response time between 80ms and 170ms. These predictable oscillations follow daily usage patterns. Nights were quieter, and mid-days showed higher activity. What stands out is the system’s resilience and ability to maintain sub-200ms performance even under pressure. The P90 line hovered reliably under 200ms throughout the observed period. This means that even in high-concurrency scenarios, Synthesys responded within a fifth of a second in 9 out of 10 cases. For industries like real estate, healthcare, and e-commerce, where speed directly impacts conversions, customer experience, or patient trust, this matters.
The P99 line paints a picture of Synthesys’ behavior in edge-case scenarios. These are the rare instances where networks strain or processing stacks encounter temporary latency, yet Synthesys never exceeded 300ms at the tail end. That level of consistency, spanning millions of live calls, reflects deep engineering in edge inference, fallback logic, autoscaling, and high-availability routing.
In telecom, performance isn’t measured by the best case; it’s measured by consistency. Studies show that delays in phone response are one of the biggest drivers of customer frustration. In fact, 53% of users expect a reply within 3 minutes, and 33% cite wait times as their top complaint (Sobot). Even more telling, while email and ticketing systems often deliver replies in hours, users now expect live voice interfaces to respond within seconds or less (SuperOffice). Synthesys consistently delivers sub-200ms responses, placing us ahead not just of human teams but also of traditional IVRs and rule-based systems. This isn’t cosmetic speed. It’s core infrastructure speed, designed for real-time conversation routing and AI output delivery.
To place this in enterprise context: the median user hears near-instant responses with P50 under 170ms. You remain responsive at scale with P90 under 200ms. And you stay stable under strain with P99 under 300ms. These aren’t vanity metrics. They reflect a platform that supports time-sensitive use cases such as emergency routing, lead triage, and appointment confirmations where every millisecond affects outcomes.
Synthesys measures latency from the moment a webhook or routing trigger is received to the point when the AI-generated voice response begins rendering. These times are collected through Synthesys' internal analytics system and validated via our edge network and global autoscaling infrastructure. Response times are logged in real time across deployments spanning industries such as healthcare, financial services, customer support, and logistics. We calculate our P50, P90, and P99 metrics using live traffic data, not simulations or staged benchmarks. As performance experts emphasize, “latency percentiles provide a more accurate representation of user experience by accounting for the distribution of response times” (DZone).
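For readers who want to picture the measurement window described above, the sketch below times one call from trigger receipt to the start of the voice response. It is only an illustration under stated assumptions: `LatencyRecorder`, `trigger_received`, and `response_started` are hypothetical names, not part of Synthesys' analytics system, and the clock readings are simulated.

```python
import time


class LatencyRecorder:
    """Hypothetical recorder: times each call from trigger receipt
    to the moment the voice response begins rendering."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock      # monotonic clock returning seconds
        self._started = {}       # call_id -> start timestamp
        self.samples_ms = []     # completed measurements in milliseconds

    def trigger_received(self, call_id):
        # Start of the window: webhook or routing trigger arrives.
        self._started[call_id] = self._clock()

    def response_started(self, call_id):
        # End of the window: first audio of the response begins.
        start = self._started.pop(call_id)
        elapsed_ms = (self._clock() - start) * 1000.0
        self.samples_ms.append(elapsed_ms)
        return elapsed_ms


# Simulated clock readings (seconds) so the example is deterministic.
ticks = iter([10.0, 10.25])
recorder = LatencyRecorder(clock=lambda: next(ticks))
recorder.trigger_received("call-1")
elapsed = recorder.response_started("call-1")
print(elapsed)  # 250.0
```

The collected `samples_ms` values are the kind of raw data from which P50, P90, and P99 would then be computed. A monotonic clock such as `time.perf_counter` is used by default because wall-clock time can jump when the system clock is adjusted.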

Time is money. Make more of both.