Principal Engineer, Model Development Platform

External
Wayve · Sunnyvale
Full-time · On-site · 1d ago
Airflow · API Design · Cross-functional Collaboration · Data Modeling · Design Systems · FastAPI

About the role

Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems. Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.

In our fast-paced environment, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future. At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact. Make Wayve the experience that defines your career!

As Principal Engineer for the Model Development Platform, you'll own the end-to-end architecture behind Wayve's AI model lifecycle, from data ingestion and training to experiment scheduling and on-road testing. Working at the intersection of AI research, large-scale distributed systems, and robotic operations, you'll keep the platform reliable, scalable, and coherent so our researchers and engineers can iterate fast and deploy autonomous driving models safely.

Partnering with the Head of Model Dev Platform, you'll set and execute the technical vision, aligning infrastructure and tooling with company goals. You'll lead by example, going deep across web applications, distributed compute, ML Ops, data pipelines, and optimization algorithms, and through architecture and mentorship you'll enable teams to build platform capabilities that measurably accelerate model development and fleet learning.

Responsibilities

  • System architecture & reliability - Design and evolve the platform's overall architecture for reliability, observability, and scalability. Set performance, latency, and availability targets, and drive the engineering standards to meet them.
  • Cross-domain technical leadership - Unify the platform across disciplines, from front-end UIs and distributed training to Spark data pipelines and optimization-based experiment scheduling, ensuring systems interoperate cleanly.
  • Hands-on problem solving - Dive into the hardest challenges across subteams, lead architectural reviews, and propose pragmatic solutions that balance innovation with operational simplicity.
  • Experimentation & scheduling systems - Build systems that optimize how models are tested in simulation and on-road, using techniques like linear programming and heuristic optimization to balance hardware, safety, and research priorities while improving throughput and turnaround.
  • Data & compute infrastructure - Architect pipelines that ingest, transform, and enrich petabytes of fleet sensor data, and drive efficient compute use across GPU, CPU, cloud, and edge for both prototyping and large-scale training.
  • Strategic collaboration - Partner with Product, Research, and Operations to align architecture with user needs and co-own the platform's long-term roadmap.

About You

Essential

  • Technical Leadership at Scale - 10+ years of experience designing and building large-scale distributed systems, ML/AI infrastructure, full-stack web applications, or developer platforms, including at least 3 years as a staff or principal-level engineer.
  • Architectural Depth & Breadth - Proven ability to design systems spanning web platforms, ML pipelines, and large-scale compute orchestration (e.g., Spark, Ray, Kubernetes, Airflow, MLflow).
  • Reliability & Performance - Experience driving platform reliability improvements, defining SLAs/SLOs, and building self-healing and observable systems that operate at "four nines" availability or better.
  • Hands-On Systems Design - Deep understanding of distributed computing, workflow orchestration, data modeling, and API design, with the ability to write and review production-quality code.
  • Collaborative Influence - Excellent communication and cross-functional collaboration skills; ability to guide engineers, managers, and researchers toward unified technical direction.
  • Mentorship & Culture - Demonstrated success in mentoring engineers across levels and cultivating a culture of engineering excellence.

Desirable

  • Optimization & Scheduling Expertise - Experience applying algorithmic or mathematical optimization (e.g., linear programming, graph algorithms) to operational or scheduling problems.
  • ML Ops & Experimentation Systems - Familiarity with end-to-end model lifecycle tooling, from data ingestion and training CI to model artifact tracking and evaluation workflows.
  • Domain Experience - Prior exposure to autonomous systems, robotics, or other safety-critical domains.
  • Full-Stack Fluency - Experience with modern web frameworks (e.g., React, Flask, FastAPI).

Benefits

Vision insurance

Interested in this role?

Apply on the company's website.