Apple Silicon GPU Driver Engineer, Graphics, Game and ML


Apple · Cupertino, CA

Full-time · On-site · 4d ago



About the role

The Apple Silicon GPU Driver Scheduler team is directly responsible for GPU workload management, including scheduling commands on the GPU, managing resources and dependencies, and maintaining responsiveness and quality of service for applications using the GPU. The team directly impacts the performance and power efficiency of all Apple products that use Apple Silicon GPUs.

We are looking for an engineer with a strong engineering background who is excited to work with engineers and other leaders at Apple to deliver Apple GPUs across all Apple devices, build and ship exciting new GPU-focused features, and work with other teams to prototype future HW and SW GPU features.

In this role, you'll architect the GPU driver scheduling layer underneath Apple's largest server-side ML and LLM workloads. You'll design parallelism strategies that scale from a single GPU to clusters of nodes, build the synchronization and communication primitives that hold them together, and shape the HW/SW interfaces for next-generation GPU designs. You will work at the intersection of cutting-edge ML systems, systems programming, and hardware acceleration, partnering with world-class teams across Apple's software and hardware organizations to co-design scheduling primitives in next-generation GPUs, collaborate with framework and infrastructure teams to expose scheduling control where it matters, and contribute to the performance and reliability characteristics that ultimately determine inference latency and cost.

We are seeking an individual with the curiosity and passion to learn and innovate. The people here at Apple don't just create products - they create the kind of wonder that's revolutionized entire industries. It's the diversity of those people and their ideas that inspires the innovation that runs through everything we do, from amazing technology to industry-leading environmental efforts. Join Apple, and help us leave the world better than we found it.

Responsibilities

Design and implement low-level GPU driver and scheduler features optimized for ML/LLM workloads
Design, implement, and optimize scheduling strategies for efficient parallelism across one or more GPUs - data, model, and pipeline parallelism
Co-design scheduling primitives with hardware, performance-architecture, and software teams to achieve peak compute utilization and optimal memory throughput on next-generation GPU designs
Design and implement multi-GPU communication and synchronization using RDMA technologies, integrating with SoC, networking, and GPU front-end primitives, and influencing API/framework usage
Design and implement scalable ML serving infrastructure with first-class support for security, load balancing, and fault tolerance
Contribute to the design of APIs and abstractions that expose scheduling control to higher layers of the ML stack
Drive debug, performance analysis, and optimization for ML workloads - identifying bottlenecks in compute, memory, and distributed/network subsystems

Requirements

Experience with GPU programming (CUDA/ROCm/Metal) and high-performance computing, including a record of successfully optimizing large-scale parallel workloads
Experience with inter-node communication technologies (InfiniBand, RDMA, NCCL) in the context of ML training/inference
Technical BS/MS degree or equivalent experience
Excellent systems programming knowledge with C or C++
Strong experience with operating systems and/or knowledge of scheduling policies
Experience with, or deep understanding of, distributed systems and parallel computing architectures
Understanding of systems architecture/compilers/algorithms
Excellent written and oral communication skills

Additional Information

Apple's GGML team gives developers access to the power of the GPU across all of Apple's innovative products, from iPhone, iPad, Apple TV, and Apple Watch to the Mac product line. The Apple Silicon GPU Driver Scheduler team within the Graphics, Games and ML group is seeking a senior/principal engineer to lead the design of GPU scheduling mechanisms that drive peak utilization and orchestrate distributed inference across multi-node clusters for server-side ML acceleration - the compute infrastructure foundation that will deliver Apple Intelligence on Private Cloud Compute at unprecedented scale.
