Senior Engineer, Inference Control Plane


DigitalOcean · Seattle

$139K–$174K/yr · Full-time · On-site · 1w ago

Capacity Planning · DigitalOcean · Kubernetes · Microservices · Observability · Routing

Responsibilities

Design and build scalable, multi-tenant services that power AI inference and intelligent routing workloads.
Develop and operate high-scale distributed systems with strong reliability, availability, and performance goals.
Strengthen platform resiliency through improved observability, capacity management, automation, and operational tooling.
Partner closely with platform, GPU infrastructure, and product engineering teams to deliver production-grade systems and highly available APIs.
Raise the engineering bar through strong software design, operational discipline, incident management, and continuous improvement practices.
Contribute to architecture decisions around traffic management, service orchestration, reliability, and platform scalability.
Participate in on-call rotations and lead efforts to reduce operator pain, improve service health, and prevent recurring incidents.

Requirements

Required
5+ years of experience building and operating multi-tenant platforms or distributed backend systems
Strong experience operating high-scale distributed services in production environments
Deep understanding of SRE principles, including observability, incident management, reliability engineering, capacity planning, and operational automation
1+ years of hands-on experience with Go / Golang in production systems
1+ years of experience with Kubernetes
Strong understanding of cloud-native architectures, microservices, and distributed systems fundamentals
Experience debugging performance, scalability, and reliability issues in production systems
Observability Proficiency: Experience tracking infrastructure and inference metrics like Time To First Token (TTFT), Time Per Output Token (TPOT), and GPU utilization.
Bonus
AI/ML Framework Knowledge: Understanding of modern LLM serving architectures and familiarity with engines like vLLM or Triton.
Experience with API gateways, traffic routing, or service mesh technologies
Familiarity with LLM serving stacks such as vLLM, TensorRT-LLM, or similar technologies
Experience building systems for inference optimization, rate limiting, routing, or workload orchestration
Compensation Range:
$139,000 - $174,000
*This is a hybrid role
JR: 2026-7622
Why You'll Like Working for DigitalOcean
DigitalOcean is an equal-opportunity employer. We do not discriminate on the

Benefits

Health insurance · Paid time off · Flexible schedule · Equity / stock options · Performance bonus

Additional Information

Dive in and do the best work of your career at DigitalOcean. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you'll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

We are seeking a Senior Engineer to implement and contribute to the design and optimization of our Serverless Inference infrastructure and APIs. In this role, you will tackle the challenges of large-scale AI workloads, focusing on throughput, GPU utilization, and fault tolerance to support the next-generation inference needs of AI-native enterprises.

