
Infrastructure Engineer (OpenStack Ironic Specialist)

External
Full-time, On-site, 2w ago
Compliance, Incident Response, Linux

Benefits

Vision insurance

Additional Information

About Nscale

Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.

At Nscale, our Engineering team plays a critical role in driving the deployment and subsequent management of our infrastructure and software platforms. We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you'll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you'll be contributing to building the technology that powers the future.

About the Role (Job Purpose)

The Infrastructure Engineer (Ironic Specialist) sits within the Infrastructure Engineering team. The Infrastructure Engineering team is responsible for the design, implementation, operation, and continuous improvement of the infrastructure stack that underpins all internal and customer-facing services.

This specialist role is focused on OpenStack bare metal provisioning and lifecycle management, with particular emphasis on Ironic and the services, workflows, and integrations required to operate large-scale physical infrastructure reliably. The role is critical to the successful delivery of automated provisioning, hardware onboarding, lifecycle operations, and hardware fault management across our cloud estate. The role also acts as a key link into the upstream OpenStack community, helping ensure that Nscale both benefits from and contributes to the continued development of Ironic and the wider bare metal ecosystem.

This team ensures high levels of availability, scalability, automation, and security for the infrastructure layers they own. This team acts as a 3rd/4th line escalation point for support organisations, as well as providing subject matter expertise to pre-sales and other groups within the organisation.

What You'll be Doing (Responsibilities)

Designing, implementing, and operating scalable and resilient bare metal provisioning platforms with a strong focus on OpenStack Ironic.
Owning the lifecycle of physical infrastructure through automated discovery, enrolment, provisioning, cleaning, deprovisioning, and hardware state management.
Managing and improving integrations between Ironic and related OpenStack services such as Nova, Neutron, Glance, Keystone, Placement, and supporting automation tooling.
Building and maintaining robust provisioning workflows for a wide range of hardware profiles, including GPU-enabled and high-performance server platforms.
Driving automation for hardware onboarding, firmware and BIOS configuration, deployment workflows, validation, and recovery using infrastructure-as-code and configuration management tools.
Troubleshooting complex issues across provisioning pipelines, PXE/iPXE, BMC interfaces, out-of-band management, image deployment, network boot, and hardware compatibility.
Acting as a 3rd/4th line escalation point for advanced bare metal and provisioning incidents, carrying out root cause analysis and implementing long-term fixes.
Supporting platform upgrades, lifecycle management, and operational improvements across Ironic and its dependencies.
Collaborating closely with network, compute, data centre, and support teams to ensure efficient and reliable delivery of physical infrastructure services.
Contributing specialist input to infrastructure roadmap planning, capacity expansion, standard builds, and hardware platform qualification.
Supporting pre-sales and solution design efforts by providing expert guidance on bare metal capabilities, operational models, and deployment constraints.
Contributing to upstream OpenStack bare metal communities, particularly Ironic and related projects, through bug reports, code contributions, testing, reviews, and design discussions where appropriate.
Tracking upstream roadmaps, release changes, and community direction to help shape Nscale's bare metal strategy, upgrade planning, and platform standards.
Representing Nscale's operational requirements, hardware use cases, and scaling challenges in upstream discussions to help drive practical improvements for both the business and the wider community.
Ensuring provisioning platforms and operational processes adhere to security, compliance, and operational standards.
Participating in on-call rotations and incident response activities for critical infrastructure services.

About You (Skills / Qualifications Experience)

Strong Linux systems administration and troubleshooting experience.
Deep hands-on experience deploying, operating, upgrading, and troubleshoot


