
Systems Engineer - Evaluation Engineering

External
Apple · Cupertino, CA
Full-time · On-site · Today
API Design · ArgoCD · AWS · Azure · CI/CD · FastAPI


About the role

We are looking for a Distributed Systems Engineer to own the infrastructure powering our core Siri Agentic Evaluation Platform. Evaluation is no longer just a static test suite; it is a highly dynamic, massive-scale distributed problem. Our platform enables teams to run high-throughput agentic simulations, orchestrate multi-model judging pipelines, and generate real-time observability dashboards across billions of tokens and complex data types. In this role, you will design the execution engine that coordinates these complex evaluation loops. You will build systems that remain deterministic, fault-tolerant, and cost-efficient, even when coordinating massive parallel requests across heterogeneous device types (iPhone, iPad, Mac, etc.).

Responsibilities

  • Distributed Execution Engine: Architect and scale the core asynchronous engine responsible for orchestrating thousands of parallel agent simulations, validation tests, and LLM-as-a-judge pipelines.
  • Internal Developer Platform (IDP): Design and build self-service infrastructure, CLI tools, and internal APIs that allow ML and product teams to easily integrate evaluation pipelines into their CI/CD workflows.
  • Backend API & Service Architecture: Design, build, and maintain highly performant, type-safe APIs (gRPC/REST) capable of serving complex evaluation pipelines, trace data, and real-time generation metrics.
  • Stream Processing & Lineage: Build robust data pipelines to ingest and transform high-volume execution traces. Ensure immutable data lineage so that every evaluation metric can be perfectly traced back to its raw generation for granular error attribution.
  • Infrastructure-as-Code & GitOps: Own the deployment topologies of the evaluation platform across multi-tenant clusters using declarative infrastructure and continuous delivery practices.
  • Reliability, Observability & Guardrails: Implement deep observability (distributed tracing, structured metrics, and alerting) across the platform. Design smart scheduling layers, token buckets, and circuit breakers to prevent downstream API rate-limiting or cascading cluster failures.

Requirements

  • Relational & Analytical Databases: Strong experience modeling complex relational data and trace hierarchies using PostgreSQL, combined with high-throughput analytical query layers.
  • Orchestration & Messaging: Experience designing asynchronous, event-driven architectures using Kafka, AWS SQS/SNS, RabbitMQ, or Redis Streams.
  • Cloud Native Architecture: Advanced experience with Kubernetes (orchestration, custom operators, service meshes like Istio or Linkerd) and cloud providers (AWS, GCP, or Azure).
  • Experience building Agentic RAG platforms or developer-facing infrastructure tooling.
  • Infrastructure-as-Code (IaC): Proficiency with Terraform to manage infrastructure declaratively.
  • CI/CD & DevEx: Experience building automated, containerized deployment pipelines (GitHub or ArgoCD) with an emphasis on keeping developer feedback loops fast and reliable.
  • MS in Computer Science or equivalent.
  • 7+ years of experience as a distributed systems engineer, platform engineer, or equivalent.
  • Systems & Backend Languages: Strong proficiency in languages optimized for concurrency and enterprise scale, such as Python (asyncio) or Java.
  • API Design & Ecosystems: Deep expertise in designing robust, versioned production APIs using gRPC/Protobuf, GraphQL, or REST (FastAPI).

Additional Information

Join the team redefining what a deeply personal and integrated assistant can be. As part of the Siri organization, you will help shape one of the world's most widely used AI assistants, powered by the next generation of Apple Intelligence, with capabilities like personal context understanding and on-screen awareness, built with privacy from the ground up. Your work will have a direct, meaningful impact for users across iOS, iPadOS, macOS, watchOS, and visionOS. This is a rare opportunity to build at the intersection of cutting-edge AI and human-centered design, shipping technology that is centered on users and their needs.

