L2 Datacenter Support Engineer

Mirantis · Poznań, Poland

Full-time · On-site · 1w ago

Ansible · AWS · Grafana · Incident Response · Kubernetes · Linux

Responsibilities

Troubleshoot and maintain InfiniBand fabrics, including performance tuning, link issues, and topology validation.
Act as the escalation point for L1 for complex infrastructure and hardware issues.
Own and maintain accurate infrastructure modeling, IPAM, and source-of-truth data in NetBox.
Own InfiniBand fabric management and advanced troubleshooting, utilizing Verity for configuration, monitoring, and optimization of high-performance interconnects.
Diagnose and resolve issues across GPU servers, networking, storage, and Kubernetes platforms.
Perform deep hardware and system-level diagnostics (GPUs, PCIe, NICs, firmware, etc.).
Support Kubernetes platform stability (node health, networking, scheduling issues).
Contribute to automation of provisioning and operational workflows.
Lead incident response, root cause analysis (RCA), and post-incident improvements.
Collaborate with vendors and internal engineering teams on complex issues.
Support infrastructure upgrades, firmware management, and capacity expansion.

Required Skills & Experience

3-6+ years of experience in infrastructure operations, datacenter engineering, or cloud platforms.
Strong Linux systems expertise.
Hands-on experience with bare metal provisioning systems and lifecycle management.
Strong experience with InfiniBand networking (troubleshooting, performance, fabric management using UFM).
Experience with IPAM/DCIM tools such as NetBox and Ethernet network configuration and validation leveraging Verity.
Solid understanding of datacenter networking, storage, and hardware architecture.
Working knowledge of Kubernetes in production environments.
Strong troubleshooting skills across hardware and distributed systems.

Requirements

Experience with NVIDIA GPU platforms and accelerated computing infrastructure.
Familiarity with automation tools (Terraform, Ansible, etc.).
Exposure to OpenStack (optional).
Experience with observability stacks (Prometheus, Grafana, ELK).

Success in this role

Rapid resolution of complex infrastructure and networking issues.
High reliability and performance of InfiniBand and GPU infrastructure.
Scalable and efficient bare metal provisioning processes.
Strong contribution to automation and operational excellence.
Trusted escalation point and technical leader within the team.

Benefits

Work with an established Silicon Valley leader in the cloud infrastructure industry.
Work with exceptionally passionate, talented and engaging colleagues, helping Fortune 500 and Global 2000 customers implement next-generation cloud technologies.
Be a part of cutting-edge, open-source innovation.
Thrive in the high-energy environment of a young company where openness, collaboration, risk-taking, and continuous growth are valued.
Professional development and training.
Attend conferences and working groups.
Company outings, happy hours, hackathons, and tech talks.
Receive a competitive compensation package with a strong benefits plan.
We are a Leader for Container Management in G2 (#2 after AWS)!
Health insurance.
Vision insurance.

Additional Information

We are looking for an experienced L2 Engineer to operate and support high-performance AI infrastructure platforms, including NVIDIA GPU clusters, InfiniBand fabrics, and Kubernetes-based IaaS environments. This role focuses on deep infrastructure expertise, ensuring performance, scalability, and reliability of the platform layer that powers AI workloads - without being responsible for the workloads themselves. You will play a key role in bare metal lifecycle management, advanced InfiniBand troubleshooting, and platform stability, working closely with engineering teams to operate cutting-edge infrastructure at scale.
