Associate Director, MLOps Engineering

PathAI · Boston (onsite) preferred, New York (onsite), or Remote

Full-time · Remote · 1mo ago

Airflow · AWS · Azure · CI/CD · Clinical Trials · Compliance

About the role

PathAI's mission is to improve patient outcomes with AI-powered pathology. Our platform promises substantial improvements to the accuracy of diagnosis and the efficacy of treatment of diseases like cancer, leveraging modern approaches in machine learning and artificial intelligence. We have a track record of success in deploying AI algorithms for histopathology in translational research, pathology labs, and clinical trials. Rigorous science and careful analysis are critical to the success of everything we do. Our team, composed of diverse employees with a wide range of backgrounds and experiences, is passionate about solving challenging problems and making a huge impact on patient outcomes.

Where You Fit

As the Associate Director, MLOps Lead, you will lead the team responsible for the backbone of our AI/ML stack: the infrastructure that bridges ML research and massive-scale production. Your primary directive is to evolve our stack to meet the next scale of needs in large-scale ML training and inference workloads. You're someone who enjoys designing and building for reliability, relishes collaboration and technical challenges, and takes pride in making things better, without taking yourself too seriously. Our technical space is broad: high-scale AI training and inference workloads, cloud infrastructure, Kubernetes, observability, distributed systems, and a bit of everything in between.

Responsibilities

This role is critical for driving the scalability and efficiency of our Machine Learning Operations platform through high-impact, high-growth strategic initiatives.
Vision and Roadmap: Develop and execute the long-term vision and roadmap for the MLOps team to support ML development and deployment needs across the business units. Successfully manage the tension between short-term tactical deliveries and long-term architectural transformation for future growth.
Team Management: Lead and mentor a team of 6-7+ high-performing engineers. Strategically allocate resources to manage support for existing services while executing key strategic initiatives.
Cross-Functional Collaboration: Partner with leaders across machine learning, data science, product engineering, and infrastructure to proactively identify pain points, address bottlenecks, and facilitate the deployment of new solutions.
Foundation Model Readiness: Architect the compute and storage pipelines required for ML Engineers to manage millions of slides and complex derived artifacts without data fragmentation or synchronization latency.
Inference Modernization: Modernize the AI Product inference stack to support 5-10x growth of AI runs across global deployments.
System Observability: Collaborate with Site Reliability Engineering (SRE) to establish comprehensive metrics covering compute under-utilization, network bottlenecks, and granular cost and turn-around-time attribution.
Technology Refresh: Conduct "Build vs. Buy" assessments and lead "Stack Refresh" audits that benchmark our proprietary tools against best-in-class commercial and open-source alternatives to meet our future needs.

What You Bring

To be successful in this role with us, you'll at least need:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).
2-3+ years of experience managing engineering team(s), with a focus on building production-grade frameworks for MLOps or ML Infrastructure.
Deep technical expertise with ML workloads on Kubernetes, cloud computing platforms (AWS/GCP/Azure), workflow orchestration (Airflow, Kubeflow, or proprietary equivalents), DevOps principles, and infrastructure-as-code (Helm, Terraform).
Proven experience managing petabyte-scale datasets and high-throughput production inference pipelines.
Strong software engineering skills in complex, multi-language systems and experience with scalable service architecture.
Use of AI assistants (e.g. Copilot, Cursor, Claude) across the platform development lifecycle.

It Would Be Great If You Also Have

Exposure to ML frameworks like PyTorch or scikit-learn.
Experience with large-scale data processing frameworks (e.g. Spark, Hive, Databricks, Amazon EMR).
Expertise in MLOps principles, including model lifecycle management, feature stores, model monitoring, and CI/CD for ML.
Familiarity with security and compliance best practices in ML systems.

We Want To Hear From You

PathAI is an equal opportunity employer, dedicated to creating a workplace that is free of harassment and discrimination. We base our employment decisions on business needs, job requirements, and qualifications - that's all. We do not discriminate based on race, gender

Benefits

Vision insurance

Interested in this role?

Apply on the company's website.
