Skip to main content
Back to jobs

Cluster Deployment Engineer

External
Anthropic logoAnthropic · Remote
Full-timeRemote2mo ago30+ days old, may be filled
Capacity PlanningSAFe
Cover LetterConnect

Prepare for this interview

Elite

AI-generated questions, company research, and talking points tailored to this role


About the role

As a Cluster Deployment Engineer, you will own how large-scale AI compute clusters physically come together inside our datacenter fleet. You will set the deployment-engineering strategy for cluster build-out - how racks are organized into pods, halls, and sites; how compute, network, power, and cooling systems interface at the rack boundary; and how deployment scope flows cleanly from hardware specification to facility delivery to a running cluster. This role is focused on deployment engineering, not on datacenter network or systems design - your scope is making sure clusters land cleanly and predictably, not designing the fabrics or facilities themselves. This is a senior individual-contributor role with broad technical influence. You will work across hardware, networking, facilities, supply chain, and construction to ensure that every generation of accelerator we deploy lands in a datacenter that is ready for it - on schedule, at full density, and with every piece of required infrastructure accounted for. You will be the person who sees around corners: anticipating how next-generation rack designs will stress our facilities, where our deployment model will break at scale, and what needs to change now so that the next cluster turn-up is faster and more predictable than the last. You will operate at the intersection of engineering strategy and execution discipline, partnering with internal research and systems teams, external developers, engineering firms, and OEM partners to deliver cluster capacity at the speed the frontier demands.

Responsibilities

  • Own cluster-level deployment strategy - define how AI compute clusters are organized across the floor, how racks interconnect, and how cluster topology requirements translate into facility and deployment scope across a portfolio of sites.
  • Set rack interface standards spanning power, network, mechanical, thermal, and spatial domains, and ensure that every deployment includes the complete set of infrastructure required to bring a cluster online.
  • Drive multi-threaded cluster bring-up programs across hardware, networking, power, and cooling - owning plans, dependencies, and critical paths from hardware specification through energization and turn-up.
  • Partner with internal engineering teams - research, systems, networking, and hardware - to translate cluster requirements into deployable facility scope, and to derisk onboarding of new hardware platforms well ahead of delivery.
  • Lead external partner execution with developers, engineering firms, OEMs, and construction teams, driving technical reviews, deviation management, and handoffs that keep deployments on schedule and within specification.
  • Improve cluster turn-up reliability and repeatability - identify systemic gaps in deployment scope, tooling, and partner interfaces, and drive durable fixes that reduce time-to-serve for new capacity.
  • Define and track deployment KPIs - cluster readiness, schedule adherence, scope completeness, time-to-first-packet - and use historical trends to forecast risk and inform capacity planning.
  • Coordinate cross-functional readiness across supply chain, security, operations, and construction to ship production-ready compute capacity.
  • Provide crisp executive visibility on deployment progress, tradeoffs, and risks across a portfolio of concurrent cluster programs.
  • Design cluster interfaces for durability - define rack and cluster-level interfaces that remain robust across hardware generations, so that facility scope and deployment models do not need to be reinvented every time the underlying hardware changes.
  • You may be a good fit if you
  • Have 10+ years of experience in hyperscale datacenter environments, with senior-level responsibility for cluster deployment, large-scale IT integration, or equivalent infrastructure programs.
  • Have delivered AI, HPC, or high-density compute clusters at scale and developed a strong intuition for the constraints that govern cluster deployment - interconnect reach, adjacency, power density, and thermal limits.
  • Can operate fluently across the boundary between IT hardware and facility infrastructure, and have set interface standards that held up across multiple hardware generations and sites.
  • Have led cross-functional programs with both internal engineering teams and external developers, engineerin

Additional Information

About Anthropic Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.


Your Match

How well this role fits your profile.

Company Intel

What employees say

Worked at Anthropic? Share your experience

Interested in this role?

Apply on the company's website.

Cover LetterConnect