Principal Systems Software Engineer, LPU
Responsibilities
- Shape the architecture of the hardware abstraction layers and core system libraries, and own the API contracts for the components you lead.
- Design and implement drivers, runtimes, and data movement and aggregation pipelines that execute workloads on novel silicon.
- Build runtime interfaces for launching, monitoring, and managing workloads at production scale.
- Drive triage of the most difficult sequencing, initialization, and cross-component runtime failures, and produce root-cause analyses that change how the system is built.
- Lead new platform bring-up and NPI for new boards and silicon, in tight partnership with hardware engineering, compiler teams, and data center operations.
- Multiply the team: establish the agent-assisted engineering practices, reusable abstractions, diagnostics, and documentation that let everyone move faster without destabilizing the platform.
- Communicate architecture and design tradeoffs clearly, in writing and in diagrams, to audiences ranging from individual engineers to executive staff.
What we need to see:
- MS in CS, CE, EE, or a related STEM field, or equivalent experience, and 12+ years building production system software.
- A track record of designing and evolving libraries and APIs meant to be supported for years, including ABI and compatibility discipline.
- Fluency in large, multi-repository codebases with layered dependencies.
- Demonstrated leadership driving triage of difficult reliability issues to clear, written root-cause analysis.
- Low-level platform experience: firmware and boot flows, RTOS, BMCs/MCUs, RISC-V, or closely related system software.
- Linux driver or kernel-adjacent experience (for example, VFIO or similar subsystems).
- Hardware bring-up and system triage experience: fault analysis, diagnostics, and validation in lab environments.
- An established habit: building with AI coding agents - not as a novelty, but as a way you already ship and raise leverage. You can speak to how you design work to be agent-amenable and where you keep humans in the loop.
Ways to stand out from the crowd:
- Experience building Rust system software at the scale of a hyperscaler or a Rust-native hardware company - the kind of environment where Rust is the production language for low-level work, not an experiment.
- Distributed systems experience: gRPC and RPC frameworks, coordination and telemetry patterns, MPI. Experience with inference systems and token serving (vLLM or similar serving and runtime stacks) is a huge plus.
- Experience shipping and supporting customer-facing SDKs, including documentation and ABI compatibility practices.
- Production readiness and delivery depth: CI/CD and release workflows, monitoring and alerting practices, Kubernetes, and data center operational workflows.
Widely considered to be one of the technology world's most desirable employers, NVIDIA has some of the most forward-thinking and hardworking people in the world inventing the future with us. Are you a creative and collaborative softwa
Additional Information
We are now looking for a Principal Software Engineer for LPX System Software! NVIDIA's LPX System Software team builds the foundational software that turns a novel deterministic compute architecture into a platform that compiler teams and data center operators can rely on. We shift complexity out of silicon and into software: the hardware abstraction layers, core system libraries, drivers, and runtime components through which workloads enter the platform.
We build this stack in Rust. For system software living at the boundary between hardware and everything above it, we treat memory safety, explicit ownership, and long-lived API stability as the baseline rather than the goal - the foundation that lets us spend our judgment on the hard problems instead of on classes of bugs that should not exist.
As one of the principal engineers on this stack, you will set technical direction for the surfaces you own and shape the overall architecture alongside your fellow principals. You will design the HAL, runtime interfaces, and data-movement pipelines the rest of the platform depends on; drive the hardest reliability and bring-up problems to root cause; and raise the throughput of the whole org by codifying the abstractions, patterns, and tooling that others build on.
You will also help define how we engineer. We treat AI coding agents as a primary part of the workflow, and we expect our most senior engineers to be fluent in directing them - designing systems that are legible to both humans and agents, and turning hard-won judgment into leverage across the team.