Infrastructure Engineer - Virtualization

External

TensorWave · Las Vegas, NV

Full-time · On-site · 2w ago

Ansible · Documentation · Incident Response · Kubernetes · Linux · Move

About the role

We are building and operating large-scale infrastructure platforms to support high-performance AI workloads across multiple data centers. Our environment includes GPU-intensive systems, high-throughput networking, and rapidly scaling compute clusters. We are looking for a Virtualization Operations Engineer to focus on the day-to-day operation, stability, and performance of our virtualization platforms. This role is responsible for ensuring that our hypervisor environments are reliable, performant, and scalable as we continue to grow. This is a hands-on operations role working across hypervisors, virtual machines, and underlying infrastructure systems.

Responsibilities

Operate and maintain large-scale virtualization environments (Proxmox and/or KVM-based systems)
Manage the full lifecycle of virtual machines: provisioning, configuration, migration, and decommissioning
Monitor and respond to platform health issues, including host failures, VM performance degradation, and resource contention (CPU, memory, disk, network)
Troubleshoot and resolve issues across hypervisors, guest operating systems, and the storage and networking layers
Execute infrastructure changes safely, including cluster expansions, host maintenance and upgrades, and configuration updates
Work with automation tools to standardize deployments, reduce manual intervention, and improve operational consistency
Collaborate with DevOps (automation and platform tooling), Network Engineering (connectivity and performance), and Storage Engineering (I/O performance and reliability)
Participate in incident response and root cause analysis
Contribute to runbooks, documentation, and operational best practices

Requirements

Required Qualifications
4-7+ years of experience in infrastructure, systems, or platform operations
Hands-on experience operating Linux-based virtualization platforms such as KVM/QEMU, Proxmox, or VMware (with strong Linux fundamentals)
Strong Linux systems knowledge, including process management, networking, and disk and filesystem management
Experience troubleshooting CPU and memory contention, disk I/O bottlenecks, and network performance issues
Familiarity with virtualization concepts: VM lifecycle, resource allocation, and live migration
Experience with infrastructure automation tools (e.g., Ansible or similar)
Ability to work effectively during incidents and production issues
Experience operating infrastructure at scale (100+ hosts)
Familiarity with GPU-based systems or high-performance workloads, NUMA awareness, and performance tuning
Exposure to high-throughput networking (bonding, VLANs, SR-IOV) and distributed or high-performance storage systems
Experience working alongside Kubernetes or container platforms
Experience in cloud or CSP environments

Benefits

Stock Options
100% paid Medical, Dental, and Vision insurance for Employees
Company Health Savings Account Contributions
100% paid Short Term and Long Term Disability Insurance for Employees
Life and Voluntary Supplemental Insurance Options
Other Insurance Options, such as Pet & Legal Insurance
Various Supplementary Health Benefits, such as discounted Virtual Healthcare Appointments and Serious Illness Support
Flexible Spending Account
401(k)
Employee Assistance Program
Flexible PTO
Paid Holidays
Parental Leave
Other In-Office Perks

Equal Employment Opportunity

TensorWave is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate on the basis of any protected status under applicable law.

Reasonable Accommodations

TensorWave provides reasonable accommodations in accordance with applicable laws. If you require accommodation during the hiring process, please contact accomodations@tensorwave.com.

Employment Eligibility

All offers of employment are contingent upon verification of identity and authorization to work in the United States, as required by law.

Background Checks

Where permitted by law, employment may be contingent upon the successful completion of a job-related background check.

Data Privacy Notice

By submitting an application, you acknowledge that TensorWave may collect, use, and retain your personal information for recruiting and employment-related purposes in accordance with applicable data privacy laws.

Health insurance · Dental insurance · Vision insurance · Paid time off · Flexible schedule · Equity / stock options

Additional Information

About TensorWave

Our mission is simple: deliver seamless, secure, reliable, and resilient AI compute at scale. We've built a versatile cloud platform that eliminates infrastructure barriers, empowering builders to focus on innovation instead of fighting their stack. Because breakthrough AI should move at the speed of ideas, not infrastructure.
