Senior Failure Analysis Engineer

NVIDIA · Santa Clara, CA
Full-time · On-site · Today
Python · Rust · CI/CD · Machine Learning

Responsibilities

  • Own the reliability, performance, and continuous improvement of production-critical systems, including databases, CAD navigation tools, and failure analysis platforms, ensuring high availability and responsiveness for semiconductor engineering and manufacturing teams.
  • Design and deliver scalable automation frameworks, data pipelines, and intelligent workflows that streamline semiconductor engineering, failure analysis, and production support processes at scale.
  • Build advanced analytics platforms, dashboards, and orchestration systems that turn engineering and production data into clear, actionable insight for faster debug and better decision-making.
  • Apply AI, machine learning, and optimization techniques to reduce manual effort, accelerate root-cause analysis, and strengthen both engineering and production workflows.
  • Partner closely with failure analysis, design, verification, CAD, infrastructure, and production collaborators to deliver reliable, maintainable, and high-impact technical solutions.
  • Drive continuous improvement in software quality, usability, performance, and operational excellence across large-scale compute, data, and production environments.

What We Need to See

  • BS or MS in Electrical Engineering, Computer Engineering, Computer Science, or a related technical field, or equivalent experience.
  • 8+ years of professional experience in software engineering, electrical engineering, or semiconductor development/production environments.
  • Strong proficiency in Python, Rust, Shell scripting, or similar languages for building robust automation, tooling, and production systems.
  • Proven track record designing automation frameworks, data-processing systems, or productivity tools with measurable engineering or production impact.
  • Solid experience in Linux environments and with modern software engineering practices (version control, testing, CI/CD, observability).
  • Exceptional analytical and problem-solving skills with success navigating complex, multidisciplinary technical and production challenges.
  • Strong collaboration and communication skills, with a proven record of working effectively across cross-functional engineering and production teams.

Ways to Stand Out from the Crowd

  • Direct experience in semiconductor design, silicon development, failure analysis, yield engineering, or engineering automation and production support workflows.
  • Hands-on application of AI/ML, data analytics, or optimization methods to technical, hardware, or production-related problems.
  • Familiarity with EDA workflows, design infrastructure, CAD navigation systems, or semiconductor tooling and lab/production environments.
  • Track record architecting and operating scalable data pipelines, analytics platforms, or workflow orchestration systems in production settings.
  • Proven ability to independently scope, drive, and deliver technical projects end-to-end while balancing development and production support responsibilities in fast-paced environments.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 144,000 USD - 230,000 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 16, 2026. This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

Additional Information

NVIDIA is seeking a software-focused Senior Failure Analysis Engineer who can blend deep development with production support ownership. This hybrid role sits at the intersection of software engineering, data infrastructure, semiconductor development, and production tooling - building and sustaining the intelligent platforms and workflows that power failure analysis, debug, and engineering insight at scale. You'll own the reliability and continuous improvement of production-critical FA systems (databases, CAD navigation tools, and analysis platforms) while partnering with failure analysis, design, verification, CAD, infrastructure, and manufacturing teams. This is a high-impact opportunity for someone who thrives on both building robust software and ensuring the tools that semiconductor teams depend on are always fast, reliable, and insightful.

