Senior Datacenter Technical Program Manager, At-Scale AI Clusters

NVIDIA · Santa Clara, CA
Full-time · On-site · 4d ago

Responsibilities

  • Collaborate with outstanding engineers and architects to build and deploy large-scale GPU computing systems based on NVIDIA's reference supercomputing architectures
  • Lead the integration of new AI clusters with datacenter facilities with demanding requirements on power, cooling, and instrumentation
  • Coordinate design and fit-out of new datacenter builds, working with both internal engineering teams and external contractors
  • Own and produce detailed documentation for the end-to-end process for datacenter fit-out and integration
  • Communicate internally with engineering leadership to prioritize and address key issues essential to the success of our largest customers

What we need to see

  • BS in Applied Science or Engineering (or equivalent experience)
  • 8+ years of overall experience
  • Experience with high-performance computing systems and GPU clusters deployed in on-premises datacenters
  • A passion for understanding challenging technical problems and driving the process of finding a solution
  • Strong teamwork and interpersonal skills, to facilitate building a collaborative workflow for coordination between many teams

Ways to stand out from the crowd

  • Understanding of datacenter design, including familiarity with power and cooling technologies
  • Expertise in system monitoring and instrumentation of large clusters, using technologies such as Prometheus, Grafana, Splunk, Modbus, and BACnet
  • Experience working with the engineering or academic research community supporting high-performance computing or deep learning

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 168,000 USD - 258,750 USD for Level 4, and 200,000 USD - 322,000 USD for Level 5. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 12, 2026. This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

Additional Information

NVIDIA is looking for a highly motivated Technical Program Manager (TPM) to join our Applied Systems Engineering Team to drive datacenter integration for the next generation of NVIDIA AI supercomputing systems. This TPM will play a crucial role throughout the lifecycle of the latest AI systems at scale, from datacenter design and requirements definition, through systems integration of AI clusters into the datacenter environment, to support for these systems as they enter production. This role will drive collaboration between engineering leaders across multiple hardware and software teams, helping us work together to build AI supercomputers for NVIDIA engineers and develop reference architectures to advise customers and partners.

