Senior Data Engineer (F/M/D)
About the role
The Opportunity

We're looking for a Senior Data Engineer to architect and scale the data backbone powering next-generation AI models in robotics and real-world environments. This role sits at the intersection of distributed systems, multimodal data processing, and applied machine learning, with a strong focus on building high-quality datasets for robotic foundation models. You will ensure that data pipelines, infrastructure, and data strategy directly translate into measurable improvements in model performance.

Your Responsibilities

- Drive the model-data loop by connecting application requirements with data collection, and translating model failures into data-driven improvements through collection, curation, and augmentation
- Build and scale distributed data pipelines (Ray/Anyscale or similar) for TB-scale video, sensor, and robotics datasets
- Design multimodal data schemas aligning video, actions, and high-frequency sensor streams
- Develop Python tooling for data quality, including cleaning, anomaly detection, and dataset versioning
- Own dataset quality and coverage, including annotation workflows, data diversity, and storage trade-offs
- Lead a small team and coordinate with data providers and annotation vendors
- Oversee real-world data collection, including technical setup, compliance, and secure data handling

Technologies

- Python (advanced, production-grade)
- Ray / Anyscale or Apache Spark
- AWS / GCP for large-scale data and GPU training pipelines
- Video and sensor data formats (H.264/H.265, ROS bags, MCAP)
- PyTorch, NumPy
- DVC, LakeFS or similar data versioning tools
- Distributed data processing and storage systems

Must Have

- 5+ years in Data/ML Engineering, including 2+ years in a senior or lead role
- Experience with large-scale real-world data (robotics, autonomous systems, or video AI)
- Strong experience with Ray/Anyscale or Spark for distributed pipelines
- Advanced Python (performance, concurrency, ML stack like NumPy/PyTorch)
- Experience working with video and sensor data formats (e.g., H.264/H.265, ROS bags, MCAP)
- Experience building scalable data pipelines for GPU-based training workloads (AWS/GCP)
- Experience with data versioning tools such as DVC or LakeFS
- Proven experience owning systems and mentoring engineers

Nice to Have

- Experience building datasets for multimodal foundation models (VLA, VLM or similar)
- Robotics fundamentals (sensor synchronization, 3D transforms)
- Experience with active learning or data-centric ML workflows

Benefits

- Competitive compensation package
- Various employee subsidies and perks, including public transportation and Wellpass
- Work with a world-class team in a flat hierarchy, with direct collaboration alongside the founders and engineering team
- Opportunity to make a real impact by working on cutting-edge robotics and AI systems
- Fast growth potential in a rapidly evolving company and industry
- International office environment with English as the official working language

Recruiting Process

Your recruiting partner for this role is Madhulika (she/her). You can expect a screening call and up to 4 rounds of interviews, including an onsite visit to our office in Munich to meet with the team.

We hire across backgrounds, identities, and experiences, and we are committed to a workplace where everyone belongs. Discrimination has no place here. If you need any accommodations during the recruiting process, just reach out to your recruiting partner.