Senior/Staff Machine Learning Engineer - Health Evaluation - AI Teams (x/f/m)


Doctolib · Paris, France

Full-time · On-site · 1mo ago

Java · Kotlin · LLMs · Machine Learning · Python · Rails

Responsibilities

At Doctolib, we're on a mission to transform how healthcare is delivered by harnessing the power of AI.
As a Senior/Staff Machine Learning Engineer, you'll play a key role in designing, implementing, and scaling the evaluation framework that ensures our AI Health Companion behaves safely, reliably, and helpfully for millions of patients and practitioners.
You'll join a cross-functional team of Machine Learning Engineers, Product Engineers, and Medical Experts to build robust evaluation pipelines for agentic AI systems - models capable of reasoning, planning, and interacting with complex healthcare data.
Your responsibilities include, but are not limited to:
Define and own the evaluation strategy for our agentic AI system - metrics, protocols, datasets, and tooling
Implement and maintain automated evaluation pipelines to monitor model quality, safety, and alignment across iterations
Run systematic experiments to assess reasoning, factuality, robustness, and user experience
Collaborate closely with model developers and research scientists to provide insights and drive iterative improvement
Contribute to research and internal knowledge sharing on LLM evaluation methodologies and best practices
About our tech environment
Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native
We leverage AI ethically across our products to empower patients and health professionals. Discover our AI vision here!

Requirements

Before you read on - if you don't have the exact profile described below, but you feel this job description matches your skill set, we still encourage you to apply.
MSc or PhD in Computer Science, Machine Learning, Data Science, or related field
7+ years of hands-on experience working with large language models (e.g., GPT, Claude, Llama, or BERT-like architectures)
Proven experience in evaluating agentic or reasoning systems (e.g., autonomous agents, tool-using LLMs, dialogue systems, or task-oriented assistants)
Strong track record in experiment design, metric definition, and evaluation automation
Ability to bridge research and production, influencing modeling and product decisions
Excellent communication skills and a collaborative mindset
Now it would be fantastic if:
You have experience in the clinical or medical domain and sensitivity to ethical or regulatory challenges in healthcare AI

Benefits

Free health insurance for you and your children
Parent Care Program: receive one additional month of leave on top of the legal parental leave
Free mental health and coaching services through our partner Moka.care
For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
Work Council subsidy to refund part of sport club membership or creative class
Up to 14 days of RTT
Lunch voucher with Swile card
The interview process
Recruiter interview
Technical Deep Dive
Data System Design
Behavioral Interview
At least one reference check
Job details
Permanent position
Full Time
Workplace: Hybrid in our Levallois office
Start date: ASAP
All information provided is processed by Doctolib for application management. For data processing details, click here. Please contact hr.dataprivacy(at)doctolib.com for inquiries or to exercise your rights.
Health insurance · Vision insurance · Remote work options · Parental leave

