Staff Backend Engineer, Voices

Synthesia · Europe

Full-time · Remote · 1d ago

Datadog · Generative AI · Miro · Prototyping · Recommendation Systems

About the role

You will work on the core speech and voice generation experience at Synthesia, building the platform that sits at the critical path of script creation and video generation. You will design and deliver features across the script preview and voice orchestration stack, combining frontend user experiences with backend platform reliability. This includes integrating with multiple Text-to-Speech (TTS) providers, building recommendation systems, and ensuring consistency and quality across all voice outputs.

You will take ownership of features from idea through to production, working with loosely defined requirements to scope, prototype, and ship solutions that deliver real user impact.

You will build across the stack, including:
Backend systems for TTS provider orchestration, handling fallbacks, retries, and load-shedding across multiple providers
Frontend experiences that allow users to preview scripts, select voices, and control pronunciation with intuitive interfaces (frontend experience is not a must!)
Voice discovery and recommendation systems that guide users to high-quality voices and help them iterate quickly

You will frequently work on 0 to 1 problems, such as building new voice quality frameworks, improving voice recommendations across languages, and introducing new TTS capabilities where experimentation and iteration are critical to success.

You will collaborate closely with product, design, and AI teams to:
Translate user problems into experiments and features
Evaluate what works reliably across different TTS providers and reliability constraints
Iterate quickly based on feedback, user testing, and voice quality data

You will ship features behind feature flags, measure outcomes, and continuously refine based on product and user signals.

Requirements

You have experience building and shipping product features end-to-end in production environments.
You have a strong product mindset and can take ambiguous problems, define scope, and iterate towards solutions that deliver user value.
You are comfortable working in 0 to 1 environments, experimenting, prototyping, and learning quickly rather than relying on detailed upfront specs.
You can evaluate technical feasibility and make pragmatic trade-offs, especially when working with external systems (like TTS providers) and evolving requirements.
You care about reliability and user experience, aiming to build features that work consistently in real-world usage.
You are confident collaborating with product and design, including pushing back when something is not feasible and proposing better alternatives.
You are willing to debug and work across the stack, wherever the problem is.
Experience with audio/speech systems, TTS, API orchestration, provider integrations, or quality evaluation frameworks is a plus, but not required.

Why join us?

We're living in the golden age of AI. The next decade will yield the next iconic companies, and we dare to say we have what it takes to become one. Here's why.

Our culture

Serving 50,000+ customers (and 50% of the Fortune 500)

We're trusted by leading brands such as Heineken, Zoom, Xerox, McDonald's and more. Read stories from happy customers and what 1,200+ people say on G2.

Proprietary AI technology

Since 2017, we've been pioneering advancements in Generative AI. Our AI technology is built in-house by a team of world-class AI researchers and engineers. Learn more about our AI Research Lab and the team behind it.

AI Safety, Ethics and Security

AI safety, ethics, and security are fundamental to our mission. While the full scope of Artificial Intelligence's impact on our society is still unfolding, our position is clear: People first. Always. Learn more about our commitments to AI Ethics, Safety & Security.

The hiring process

30-40min

Additional Information

Synthesia is the world's leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US. As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations. Following our recent Series E funding round, where we raised $200 million, our valuation stands at $4 billion. Our total funding exceeds $530 million from premier investors including Accel, NVentures (Nvidia's VC arm), Kleiner Perkins, GV, and Evantic Capital, alongside the founders and operators of Stripe, Datadog, Miro, and Webflow.