Staff Machine Learning Engineer
About the role
The Apple Cloud AI Platform team enables Apple's next generation of intelligent products by giving Apple's ML engineers and researchers the data systems and large-scale compute they need to build and ship models at Apple's bar for quality and privacy.
Responsibilities
As a member of the Apple Cloud AI Platform team, your responsibilities will include:
- Design and build the platform behind Apple's largest model builds - ingestion, immutable versioning, lineage, and governance across structured, unstructured, and multimodal data at petabyte scale, so every model run is reproducible from a versioned dataset
- Develop and evolve Python SDKs and core data libraries that ML engineers depend on to access, transform, and load model-ready datasets across every stage of model development
- Build high-throughput data access and loading primitives that feed Apple's largest GPU fleets, keeping workloads compute-bound rather than I/O-bound
- Build and operate distributed data pipelines spanning Spark, Daft, and Rust-based systems for ingestion, transformation, and large-scale data preparation
- Optimize platform components for tight integration with leading ML frameworks - PyTorch, JAX, and TensorFlow - so dataset access is a first-class concern in the model development loop
- Partner with research and product teams to onboard new data sources, and enable rapid iteration on datasets powering GenAI workloads
- Ensure governance is a first-class platform capability: Legal Terms of Use enforcement, privacy controls, and end-to-end data lineage on every dataset version
- Drive efficiency, reliability, and automation across the data plane and control plane that power Apple's ML fleet
- Continuously evolve platform capabilities to support next-generation workloads, including foundation models, multimodal data, and retrieval-augmented systems
- Diagnose, fix, and automate away complex issues across the stack - from ingestion pipelines to dataset APIs to ML framework integrations - to maximize uptime and throughput
Requirements
Experience in any of the below is preferred:
- Proficiency with one or more modern ML frameworks (PyTorch, JAX, or TensorFlow), particularly the data loading and dataset access layer
- Columnar and lakehouse formats: Parquet, Iceberg, Delta, or Lance
- Distributed data loading frameworks for ML: Ray Data, NVIDIA DALI, WebDataset, or Mosaic StreamingDataset
- Performance engineering for I/O-bound workloads - Arrow, zero-copy, memory mapping, async I/O
- High-throughput object storage access patterns at GPU scale
- Data lineage and governance systems (DataHub, OpenLineage, Unity Catalog, or equivalent)
- Contributions to or operational experience with Spark, Daft, Polars, or DuckDB internals
- Containerization and orchestration technologies (Docker, Kubernetes)
- Strong foundation in machine learning, with hands-on experience across the end-to-end ML workflow - including data preparation, pipeline development, experimentation, evaluation, and deployment
- Expertise in building and running large-scale distributed systems
- Familiarity with modern generative techniques (e.g. transformers, diffusion, retrieval-augmented generation)
- Proven experience building and delivering data and machine learning infrastructure in real-world production environments
- Familiarity with fine-tuning workflows, model optimization, and preparing models for scalable inference
- Familiarity with generative AI and its applications in accelerating and enhancing machine learning workflows
- Experience configuring, deploying, and troubleshooting large-scale production environments
- Experience in designing, building, and maintaining scalable, highly available systems that prioritize ease of use
- Exte
Additional Information
Join a team at the forefront of ML infrastructure and generative AI, where data and model workflows come together to enable the next generation of intelligent experiences on Apple products and services. We build robust systems that connect scalable data pipelines with advanced ML workflows, accelerating the development of real-world AI applications. Our work spans the full ML lifecycle, from experimentation to deployment, and you'll play a key role in shaping how AI models are built, optimized, and scaled.
We develop a platform for ML data and features that powers advanced GenAI applications. This includes embeddings (generation, evaluation, ANN search, multimodal support), AI Ops, efficient inference, and a modern feature platform designed to streamline experimentation and drive innovation.
We're looking for engineers and researchers passionate about generative models, data-centric ML, and intelligent systems across diverse real-world use cases. With the autonomy to experiment, the scale to make an impact, and the support to take ideas from prototype to production, you'll work alongside a world-class team to build intelligent, flexible systems that make ML development faster, more reliable, and more creative.