Senior Systems Software Engineer, Performance Architecture - Analytics and Data Intelligence

NVIDIA · Santa Clara, CA
Full-time · On-site · Today
Python · SQL · Express · CI/CD · Machine Learning

Responsibilities

  • Extend JIT and compiler-based execution support in cuDF and related GPU-accelerated structured data processing systems.
  • Design approaches for lowering expressions, ASTs, or query fragments into optimized GPU execution paths.
  • Investigate kernel fusion strategies across cuDF operations to reduce materialization, memory traffic, launch overhead, and end-to-end query latency.
  • Analyze structured analytics workloads to identify performance bottlenecks in expression evaluation, joins, aggregations, scans, data movement, and memory management.
  • Build benchmarks and regression tests that capture real dataframe and SQL-like workloads, from micro-benchmarks to end-to-end pipelines.
  • Collaborate with cuDF, CUDA, compiler/runtime, and query engine teams to translate workload analysis into implementation plans and architecture decisions.
  • Prototype and evaluate execution strategies inspired by high-performance database engines, including fused operators, code generation, vectorized execution, and adaptive planning.

What we need to see

  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field, or equivalent hands-on experience.
  • 12+ years of validated experience in systems performance engineering or performance-focused architecture.
  • Proven skills in profiling, instrumentation, and optimization for CPU and GPU systems, applying tools like tracing, counters, flame graphs, and kernel-level profiling.
  • Experience with compiler, JIT, code generation, query execution, or runtime optimization techniques.
  • Experience optimizing analytic database engines and/or query runtimes, including vectorized execution, join strategies, and columnar formats like Arrow and Parquet.
  • Proficient in C++ and/or Python, with a strong ability to analyze performance-critical code and implement effective solutions.
  • Experience with cuDF, RAPIDS, CUDA, Numba, LLVM, MLIR, NVRTC, or other JIT/codegen systems.
  • Experience with benchmarking frameworks, performance dashboards, and CI/CD regression gating, along with a proven grasp of modern analytics and machine learning workflows.

Ways to stand out from the crowd

  • Deep familiarity with NVIDIA GPUs and GPU programming (CUDA), including memory hierarchy, concurrency, and profiling toolchains such as Nsight Systems.
  • Experience with TPC-style benchmarking (TPC-H, TPC-DS, or analogous), ClickBench-like workloads, and building credible, repeatable performance narratives.
  • Prior work on database execution engines, especially operator fusion, query compilation, vectorized execution, or adaptive execution.
  • Demonstrated open-source contributions to performance-critical systems, including libraries, runtimes, databases, and ML or data tooling.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 16, 2026. This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

Additional Information

NVIDIA's Analytics and Data Intelligence (ADI) organization is building the next generation of GPU-accelerated data analytics, data science, and vector search systems, spanning libraries, engines, and end-to-end reference architectures. As an NVIDIAN, you will find yourself immersed in a diverse, supportive environment where everyone is encouraged to do their best work. Come join the team and see how you can make a lasting impact on the world!

We are seeking a Senior Systems Software Engineer focused on performance architecture for GPU-accelerated structured data processing. This is a high-impact individual contributor role for someone passionate about developing coordinated SQL and user-friendly interfaces across diverse CPU and GPU query engines. It involves improving performance, reliability, and workload optimization.

The ideal candidate has deep experience in systems performance, compiler/runtime design, and database or dataframe execution engines. This role will focus on compiler and JIT-based execution techniques for cuDF and related analytics runtimes.

