Mgr, Engineering Program Management, AI Platforms & Infrastructure

Apple · Santa Clara, CA
Full-time · On-site · 2w ago
Forecasting · Generative AI · Leadership · Machine Learning · PyTorch · Stakeholder Management

About the role

We are looking for an experienced Engineering Program Manager (EPM) Manager to lead strategy, execution, and delivery across our AI/ML platform and infrastructure programs. In this role, you will drive cross-functional initiatives spanning Apple's massive-scale GPU/TPU compute infrastructure, Foundation Model inference platforms, and hybrid-cloud AI systems. You will partner closely with engineering and operations leaders to translate complex technical requirements into actionable roadmaps. Crucially, you will be responsible for growing and scaling a high-performing EPM team to meet the rapidly expanding demands of Apple's generative AI and machine learning platforms.

Responsibilities

  • Build, scale, and mentor a high-performing team of Engineering Program Managers, fostering a culture of ownership, accountability, and execution rigor during a period of significant organizational growth.
  • Lead strategy, roadmap planning, and end-to-end execution for large-scale AI/ML infrastructure programs, heavily focused on Foundation Model inference and training platforms.
  • Drive cross-functional alignment across engineering, product, and operations teams to deliver scalable, low-latency compute infrastructure utilizing massive GPU and TPU clusters.
  • Serve as the strategic engineering interface with tier-1 third-party cloud vendors, negotiating upfront technical constraints, capacity plans, and SLAs, while partnering with Operations teams to ensure vendor capabilities meet our Foundation Model roadmaps.
  • Spearhead cost efficiency and operational excellence programs through smarter resource allocation, compute capacity forecasting, and global workload scheduling.
  • Collaborate with partner Operations teams to align engineering roadmaps with infrastructure execution, covering capacity forecasting, performance tuning, and disaster recovery in multi-region, hybrid cloud environments.
  • Define qualification and rollout plans for new infrastructure build-outs, ensuring reliability and performance benchmarks are met before production deployment.
  • Partner with engineering leadership to translate product requirements into long-term infrastructure strategies, optimizing for efficiency and global scale.

Requirements

  • Deep technical background in AI/ML infrastructure, cloud operations, or distributed compute platforms, with direct experience in GPU/TPU capacity management and provisioning.
  • Familiarity with large-scale distributed training frameworks (e.g., PyTorch, Megatron-LM, JAX) and their infrastructure implications at scale.
  • Familiarity with FinOps practices in large-scale GPU/TPU environments.
  • Experience navigating large-scale organizational change and team restructuring.
  • 10+ years of experience in product or program management, with at least 3 years in a people management or lead EPM role.
  • Proven experience building and scaling teams, with the organizational savvy to expand team scope and influence across a highly matrixed environment.
  • Extensive experience managing strategic relationships with top-tier cloud vendors and external partners, including infrastructure planning, contract alignment, and SLA enforcement.
  • Strong strategic thinking with the ability to balance long-term platform roadmap priorities against near-term inference and training execution demands.
  • Track record of delivering massive-scale cost optimization and operational efficiency programs in hybrid-cloud environments.
  • Excellent communication and stakeholder management skills, with the ability to translate complex technical infrastructure concepts for both deep engineering teams and executive audiences.
  • Experience in multi-tenant, high-performance compute environments running large-scale Foundation Models or similar ML workloads.
  • BS/MS in EE/CS/CE or equivalent

Pay & Benefits

Apple employees also have the opportunity to become an Apple

Additional Information

Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Do you love taking on challenges that create a positive impact? Are you passionate about enabling groundbreaking intelligent experiences? The Apple Services Engineering org is building groundbreaking technology... and we are looking for people like you! Apple offers a collaborative work environment that fosters creativity and innovation. Every new product, service, or feature we invent is the result of people working together to make each other's ideas stronger. That happens here because every one of us strives toward a common goal: crafting the best customer experiences.

