Director, Site Reliability Engineering

Doctolib · Berlin, Germany

Full-time · On-site · 2d ago

AWS · CDN · Compliance · Datadog · GCP · GDPR

Responsibilities

Build and run a world-class SRE org of 25+ engineers across Cloud Infrastructure, Database & Storage, Network Infrastructure, Observability Tooling, and the Doctolib Operations Center
Own the infrastructure strategy and roadmap - cloud, database, network, observability - and deliver against company OKRs
Lead the Doctolib Operations Center: set incident response standards, drive MTTR reduction, embed blameless post-mortem culture across engineering
Architect and execute our multi-cloud strategy - reducing vendor lock-in, cutting migration costs, and enabling international expansion
Own network infrastructure at scale: load balancing, CDN/WAF, VPCs, peering, zero-trust networking across a high-traffic, multi-country platform
Drive observability as a product - give 700+ engineers true visibility into system health and turn observability maturity into an operational excellence lever
Lead from the front as a senior technical voice in the Platform org and broader Tech leadership team

Requirements

12+ years in software engineering, including 5+ years leading managers and running infrastructure or SRE organisations at scale
Track record of taking SRE practices from reactive to proactive - with measurable reductions in incidents and MTTR
Strong multi-cloud and network infrastructure experience: load balancing, CDN/WAF, VPCs, peering, at high-traffic scale
Deep database operations background: large-scale transactional systems (PostgreSQL, Aurora), streaming/CDC (Kafka), data layer FinOps
Experience building observability platforms that give teams genuine visibility - metrics, logs, traces, alerting
Sharp process thinking: SLOs, error budgets, incident management, blameless post-mortems
Outcome-driven: you track reliability, cost efficiency, and engineering velocity as business metrics, not just technical ones
Strong communicator and influencer at executive level - equally credible with senior engineers and business stakeholders
Builder of high-performing, people-first engineering cultures
Fluent in English; comfortable in fast-paced, international environments
You recognise yourself in our playbook values

Bonus Points If You Have...

Experience in healthcare, regulated, or high-compliance industries (HDS, ISO 27001, SOC2, GDPR, data sovereignty)
Familiarity with our stack: Ruby on Rails, Node.js, Go, Python, React, AWS, GCP, Kubernetes, PostgreSQL, Datadog, GitHub Actions
French language proficiency
Experience with AI-augmented infrastructure tooling or ML platform operations
M&A or post-acquisition infrastructure integration experience

Benefits

A Deutschlandticket (Germany-wide public transport pass) fully paid for by Doctolib
28 vacation days + 1 additional day for each full calendar year of employment (up to a maximum of 30 days)
Work from abroad for up to 10 days per year thanks to our flexibility days policy
Company health insurance with great supplementary benefits through our partner Allianz
Company pension scheme (bAV) through Allianz with an employer subsidy of 40% (15% within the probationary period)
Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
The Doctolib Parent Care program, which includes one month additional parental leave
Health insurance
Paid time off
Performance bonus
Parental leave

Additional Information

Why this role

As our Director of Site Reliability Engineering, reporting to our VP of Platform Engineering, you'll own the core infrastructure layers that everything at Doctolib runs on: cloud infrastructure, database operations, network infrastructure, and observability. You will also lead the Doctolib Operations Center (DOC) and drive a decisive shift from reactive operations to a proactive, world-class reliability culture.

This is a rare opportunity to shape the infrastructure backbone of Europe's leading healthtech company, at a moment when Doctolib is actively expanding multi-cloud capabilities, scaling to new countries, and building the reliability culture that will define the next decade of healthcare innovation.

Why this is an extraordinary challenge

Real stakes, every day. When Doctolib is down, consultations don't happen, diagnoses are delayed, care journeys are interrupted. The infrastructure you build is a direct lever on patient outcomes - in a world where 8 of the top 10 causes of death in Europe are preventable.
A once-in-a-generation platform transition. Multi-cloud, monolith modularisation, international expansion - all happening simultaneously. You won't inherit a finished platform. You'll define what it becomes.
Reliability as the competitive moat. As we scale AI health companions, automate clinical workflows, and launch across Europe, the speed and resilience of the platform directly determines how fast 700+ engineers can ship innovations that change healthcare.
A cultural build, not just a technical one. The incident response culture, observability standards, and operational ownership model you establish here will shape how Doctolib engineers work for years to come.
