Senior Data Engineer Python/GCP (x/f/m)
External | Contract | On-site | 5mo ago
BigQuery, CI/CD, Compliance, Docker, GCP, Java
Responsibilities
- We are looking for a Senior Data Engineer to join the AI Team working on our AI Medical Companion.
- Working in the tech team at Doctolib means building innovative products and features to improve the daily lives of care teams and patients.
- Your responsibilities include but are not limited to:
- Design, build, and maintain scalable data pipelines on Google Cloud Platform (GCP) for AI and machine learning use cases
- Implement data ingestion and transformation frameworks that power Retrieval systems and training datasets for LLMs and multimodal models
- Architect and manage NoSQL and Vector Databases to store and retrieve embeddings, documents, and model inputs efficiently
- Collaborate with ML and platform teams to define data schemas, partitioning strategies, and governance rules that ensure privacy, scalability, and reliability
- Integrate unstructured and structured data sources (text, speech, image, documents, metadata) into unified data models ready for AI consumption
- Optimize performance and cost of data pipelines using GCP native services (BigQuery, Dataflow, Pub/Sub, Cloud Storage, Vertex AI)
- Contribute to data quality and lineage frameworks, ensuring AI models are trained on validated, auditable, and compliant datasets
- Continuously evaluate and improve our data stack to accelerate AI experimentation and deployment
Requirements
- Before you read on: if you don't have the exact profile described below, but you feel this job description matches your skill set, we still encourage you to apply.
- You'll be a great fit if:
- You have 5+ years of experience in Data Engineering, ideally supporting AI or ML workloads
- You have strong experience with the GCP data ecosystem and proficiency in Python and SQL
- You have a deep understanding of NoSQL systems (e.g., MongoDB) and vector databases (e.g., FAISS, Vector Search)
- You have experience designing data architectures for RAG, embeddings, or model training pipelines
- You have knowledge of data governance, security, and compliance for sensitive or regulated data
- You are fluent in English
- It would be fantastic if:
- You hold a Master's or Ph.D. degree in Computer Science, Data Engineering, or a related field
- You have familiarity with W&B / MLflow / Braintrust / DVC for experiment tracking and dataset versioning
- You have experience with containerized environments (Docker, Kubernetes) and CI/CD for data workflows
Life at Doctolib Tech
- Our solutions are built on a single fully cloud-native platform that supports web and mobile app interfaces, multiple languages, and is adapted to country and healthcare specialty requirements.
- Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native.
- We leverage AI ethically across our products to empower patients and health professionals. Discover our AI vision here.
- Want to learn more about our tech culture and environment? Visit the Doctolib Tech site.
Benefits
- Free comprehensive health insurance (basic package) for you and your children
- 25 days of paid vacation per year, plus up to 14 days of RTT
- Free mental health and coaching services through our partner Moka.care
- Work from abroad for up to 10 days per year thanks to our flexibility days policy
- Lunch vouchers (Swile card) worth €8.50 per working day, with €4.50 covered by Doctolib
- A subsidy from the work council to refund part of the membership to a sport club or a creative class
- 50% reimbursement of your public transport subscription
- Parent Care Program: receive one additional month of leave on top of the legal parental leave
- Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
- Relocation support in case of international mobility
- Access to the best AI tools for coding, development and dedicated training
Our interview process
- Recruiter Interview
- Technical Deep Dive
- System Design Interview
- Behavioral Interview
- At least one reference check
- We want your experience to be clear, respectful, and transparent. Learn more about our hiring process on our candidate experience page.
Job details
- Permanent position
- Tech stack: GCP, Python, SQL, NoSQL, Vector Databases, AI/ML
- Full-time
- Paris, France
- Hybrid work setup (up to 2 remote days per week)
- Start date: as soon
- Health insurance, Vision insurance, Paid time off, Remote work options, Parental leave