Senior HPC and AI Cluster Administrator

$118K–$195K/yr, Full-time, On-site, posted 1mo ago
Airflow, Apache, AWS, Azure, Bash, Compliance

Responsibilities

  • Design, deploy, and maintain HPC/AI clusters
  • Manage AI job workflows using scheduling technologies such as Kubernetes.
  • Support and maintain continuous integration and delivery pipelines
  • Troubleshoot and fix issues from the bottom up: bare metal, operating system, software stack, and application level
  • Support research, development, and operational activities.

Requirements

  • Bachelor's Degree in Computer Science, Engineering, or a related field; or equivalent experience
  • 5 years of experience in any of the following:
  • Knowledge of HPC and AI solution technologies, including hardware, hypervisors, CPUs, and GPUs.
  • Experience with workload scheduling and orchestration tools such as Slurm and Kubernetes (K8s)
  • Excellent knowledge of Linux (e.g. Red Hat, Ubuntu) internals and networking (routing, switching), ACLs and OS-level security protections, and common protocols such as TCP, DHCP, and DNS.
  • Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies.
  • Experience with automation and configuration management using tools such as Python and Bash within GitOps workflows.
  • Knowledge of networking protocols such as InfiniBand and Ethernet
  • Experience with private cloud platforms (for example VMware, Hyper-V, KVM)
  • Familiarity with public cloud computing platforms (e.g. AWS, Azure)
  • Must possess and maintain required DoD 8140 certifications.

Ways to stand out from the crowd

  • Knowledge of GPU architectures, time-slicing, and Multi-Instance GPU (MIG)
  • Experience with container orchestration technologies such as Kubernetes and Docker
  • Experience designing and deploying AI workflow technologies such as Apache Airflow, Prefect, and Dagster.
  • Background with RDMA (InfiniBand or RoCE) fabrics
  • Experience working in regulated industries and applying compliance requirements (e.g. DISA STIG, CIS)
  • NVIDIA Certifications (AI Infrastructure, AI Operations, AI networking)
  • VMware certifications (Certified Professional / Advanced Professional)

Clearance

  • An active TS/SCI federal security clearance is required.

The pay range for the states of California, Colorado, Hawaii, Illinois, Maryland, Massachusetts, Minnesota, New Jersey, New York, Washington, Vermont, the District of Columbia, and the city of Cleveland is $118,300 - $195,100 USD.

Benefits

Health insurance

Additional Information

At Accenture Federal Services, nothing matters more than helping the US federal government make the nation stronger and safer and life better for people. Our 13,000+ people are united in a shared purpose to pursue the limitless potential of technology and ingenuity for clients across defense, national security, public safety, civilian, and military health organizations. Join Accenture Federal Services, a technology company within global Accenture. Recognized as a Glassdoor Top 100 Best Place to Work, we offer a collaborative and caring community where you feel like you belong and are empowered to grow, learn, and thrive through hands-on experience, certifications, industry training, and more. Join us to drive positive, lasting change that moves missions and the government forward!

AFS is looking for a Senior HPC and AI Cluster Administrator to support software and data solutions for our customers. We are integrating supercomputers and AI clusters based on existing technologies, and we need a system administrator to be a key player in enabling artificial intelligence and GPU computing solutions. You will work with scientific researchers, developers, and customers to create improved workflows and develop unique solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

