Senior/Staff Machine Learning Engineer, Data Infrastructure
About the role
Unity Vector builds a data platform that powers insight, experimentation, attribution, and AI-driven decision-making across the company. Our systems operate at scale across batch and streaming data, supporting analytics, product intelligence, machine learning pipelines, and business operations. As data volume and complexity grow, our platform also supports large-scale model training, feature generation, and experimentation workflows that power production ML systems. To support this growth, we need strong technical ownership to ensure our ML pipelines remain reliable, scalable, and architecturally sound.
We are seeking a senior data infrastructure engineer to design and evolve our large-scale offline platform. This role focuses on building reliable infrastructure for generating training datasets and orchestrating data workflows. You will work closely with ML engineers and platform teams to ensure our pipelines can efficiently handle growing data volumes and increasingly complex training workloads. You will play a key role in shaping how model datasets are prepared, ensuring the reliability, scalability, and performance of our data platform.
Responsibilities
- Develop infrastructure that supports both batch and streaming big data processing using technologies such as Flink, Spark, and Ray
- Design and operate large-scale data pipelines that generate datasets for machine learning training and experimentation
- Integrate data pipelines with workflow orchestration systems (e.g., Flyte, Airflow, or similar) to enable reliable multi-stage training workflows
- Improve reproducibility and observability of data pipelines through dataset validation, monitoring, and automated testing
- Optimize performance and resource utilization across distributed compute systems used for data processing
- Partner closely with ML engineers to enable efficient large-scale experimentation and model iteration
- Lead architectural improvements to ensure our offline data pipelines remain scalable, reliable, and cost-efficient
Requirements
- Experience working with distributed computing frameworks such as Flink, Spark, or Ray for large-scale data processing
- Experience building infrastructure for training data generation, dataset preparation, or ML feature pipelines
- Experience optimizing big data pipelines and infrastructure for cost efficiency
- Strong programming skills in Python and experience working with large-scale distributed workloads
- Experience with modern data infrastructure (data lakes, warehouses, orchestration systems, streaming platforms)
- Strong systems thinking, with the ability to reason about performance, scalability, reliability, and cost tradeoffs in distributed systems
- Proven ability to lead technical direction and influence architectural decisions across teams without formal authority