
Principal Product Manager, Inference Engine

External
Digitalocean98 · Seattle
Full-time · On-site · Today
Caching · Capacity Planning · Compliance · DigitalOcean · Observability · Performance Optimization



Responsibilities

  • Own the GPU strategy for the inference business: Define how DigitalOcean should deploy, allocate, price, and optimize GPU capacity across serverless inference, dedicated inference, batch workloads, and future inference offerings.
  • Maximize GPU utilization and margin: Create a clear product and business framework for improving token revenue per GPU hour, reducing idle capacity, reclaiming underutilized infrastructure, and driving better gross margin as the business scales.
  • Define the inference product roadmap: Partner with engineering to prioritize capabilities such as prompt caching, autoscaling, batching, latency optimization, observability, dedicated deployments, compliance features, and media model support.
  • Balance developer experience with infrastructure economics: Build products that are simple for developers to use while making rigorous tradeoffs around latency, availability, throughput, pricing, and cost-to-serve.
  • Create pricing and packaging strategy: Work with finance, GTM, and engineering to define SKUs, pricing models, discounting frameworks, and packaging for serverless, dedicated, and enterprise inference customers.
  • Drive customer-backed product decisions: Work directly with AI-native startups, mid-market customers, and strategic accounts to understand model needs, performance requirements, compliance expectations, and deployment patterns.
  • Partner deeply with engineering and infrastructure teams: Translate customer demand and business goals into infrastructure requirements across GPU fleet planning, model serving, capacity allocation, performance optimization, and reliability.
  • Establish operating metrics for the business: Define and track the metrics that matter, including GPU utilization, token throughput, revenue per GPU hour, latency, error rates, model adoption, margin, customer retention, and capacity efficiency.

Requirements

  • Deep product judgment in infrastructure or AI: Experience building infrastructure, developer platforms, ML platforms, inference systems, cloud services, or highly technical products for developers and enterprises.
  • Strong understanding of GPU economics: Ability to reason about utilization, throughput, latency, CapEx, cost-to-serve, gross margin, capacity planning, and workload placement.
  • Fluency in modern AI workloads: Familiarity with LLM inference, open-source models, model serving, prompt caching, batching, model routing, media models, latency tradeoffs, and production AI application patterns.
  • Technical depth with business orientation: You can work credibly with infrastructure engineers while also making clear product and business tradeoffs for executives, GTM teams, and customers.
  • Strong analytical rigor: You are comfortable building frameworks, models, and decision systems that turn ambiguous infrastructure and customer signals into clear product direction.
  • Customer obsession: You work backwards from developers and AI-native companies, but you also understand that great infrastructure products must be reliable, performant, simple, and economically sustainable.
  • Executive communication: You can explain complex technical and business decisions clearly to senior leaders, cu

Benefits

Paid time off

Additional Information

Dive in and do the best work of your career at DigitalOcean. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you'll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

Are you passionate about building the infrastructure that will power the next generation of AI applications? Are you ready to own the GPU strategy behind one of DigitalOcean's fastest-growing product categories? DigitalOcean is entering a pivotal moment as we build infrastructure for AI-native companies and the future 100 million developers.

Inference is becoming one of the most important layers of the AI stack. Developers need access to the right models, on the right GPUs, with the right latency, pricing, reliability, and scale. At the same time, cloud providers must manage scarce GPU capacity with discipline: maximizing utilization, improving gross margin, selecting the right model mix, and ensuring customers can trust the platform for production workloads.

We are looking for a Principal Product Manager for Inference Engine to define and own the product strategy for DigitalOcean's inference business. This product leader will be responsible for shaping our GPU strategy, pricing and packaging, utilization framework, and roadmap for serving developers and AI-native companies with high-performance inference at scale. This is a rare opportunity to help define a high-growth infrastructure business where product strategy, technical judgment, and business economics must come together.



Interested in this role?

Apply on the company's website.
