HPC Engineer

Sandisk · Bengaluru, India
Full-time · On-site · 1w ago
Ansible · AWS · Azure · Bash · Capacity Planning · Documentation


Responsibilities

  • Architect, deploy, and manage large-scale distributed HPC environments across global locations, supporting ASIC and GPU compute clusters
  • Design and implement infrastructure automation using Ansible, Shell, and Python for system lifecycle management
  • Administer and optimize workload schedulers (LSF, Slurm, NC), including queue configuration, fair-share policies, and job prioritization
  • Perform deep troubleshooting and root cause analysis across compute, storage, networking, and scheduler layers
  • Collaborate with engineering teams to improve EDA workload performance and efficiency in global HPC environments
  • Develop and deploy self-service automation solutions to reduce manual effort and improve system reliability
  • Manage and support the EDA ecosystem, including tool deployment (Cadence, Synopsys), licensing, and workflow optimization
  • Implement monitoring and observability frameworks using tools such as Splunk and Grafana for proactive issue detection
  • Drive capacity planning, performance tuning, and resource optimization for HPC workloads
  • Create and maintain technical documentation, runbooks, and operational standards
  • Provide technical leadership and mentoring, influencing HPC architecture and long-term strategy

Technical Skills

  • HPC & Scheduling: LSF, Slurm, Network Computer (NC), Grid/Batch scheduling
  • Operating Systems: Red Hat Enterprise Linux (RHEL), CentOS
  • Automation & Scripting: Ansible, Shell/Bash, Python
  • EDA Tools: Cadence, Synopsys, EDA workflows & design environments
  • Monitoring & Observability: Splunk, Grafana, Prometheus
  • Storage & Filesystems: NFS, AutoFS, distributed storage systems
  • Authentication & Access: UNIX/Linux integrated with Active Directory
  • Infrastructure: On-premises & Hybrid HPC environments
  • Remote Access & VDI: Exceed TurboX, VNC, NoMachine

Preferred Skills

  • Extensive experience with job schedulers such as LSF, Slurm, or equivalent platforms
  • Experience supporting EDA / semiconductor design environments
  • Exposure to GPU computing and accelerator-based workloads
  • Knowledge of EDA licensing systems and optimization
  • Experience with Infrastructure as Code (IaC) and platform standardization
  • Familiarity with cloud or hybrid HPC architectures (AWS/Azure HPC)

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience
  • 8+ years of experience in Linux system administration (RHEL/CentOS)
  • Strong expertise in HPC cluster management and workload schedulers (LSF/Slurm)
  • Proven experience in automation and scripting (Ansible, Shell, Python, and AI integration)
  • Hands-on experience managing large-scale HPC or EDA environments
  • Strong skills in performance tuning, capacity planning, and workload optimization
  • Excellent troubleshooting and problem-solving skills in complex production environments
  • Ability to lead projects end-to-end and work with cross-functional teams

Benefits

Remote work options

Additional Information

Role Overview: Experienced Senior HPC Engineer / Architect specializing in Linux-based high-performance computing (HPC) environments, EDA workflows, and automation-driven infrastructure. Proven expertise in designing, managing, and optimizing large-scale distributed HPC clusters supporting ASIC EDA workloads.

