Senior Solutions Architect, CSP System

NVIDIA · Shanghai, China

Full-time · On-site · Today

Python · PyTorch · iOS

Responsibilities

Partner with Sales, BD and CPM teams to land NVIDIA GPU and AI Infra technologies in top-tier Chinese CSP accounts, driving technical penetration and sustainable business growth.
Serve as the primary technical authority on NVIDIA GPU system and AI infrastructure solutions for Chinese CSPs, providing end-to-end consultation on GPU cluster architecture design, AI workload deployment, heterogeneous computing tuning, and full-stack software optimization.
Unlock the value of Vera CPU + GPU co-optimization for RL training and Agentic AI workloads: eliminate CPU-GPU data movement bottlenecks and optimize the latency and throughput of end-to-end agent training and reasoning pipelines for CSP AI factory scenarios.
Lead open-source system architecture contributions for NVIDIA AI infra stacks, upstream optimized patches to key open-source projects, build China-localized best practices, and shape industry technical standards.
Conduct in-depth GPU workload bottleneck analysis, implement system-level, kernel-level and framework-level tuning for AI training, inference, RL and gaming workloads, and deliver production-ready reference designs and tuning guidelines for CSP mass deployment.
Act as the key technical liaison between Chinese CSP customers and NVIDIA global engineering, product and R&D teams, collect high-value local workload requirements, drive product roadmap iteration, and ensure full compliance with NVIDIA global technical policies and export compliance rules.
Lead technical workshops, hands-on training, PoC and production pilot projects for key CSP accounts, quantify and demonstrate the business value of GPU/AI Infra, and accelerate technology adoption and large-scale replication.
Monitor cutting-edge industry trends, including Agentic AI, LLM inference optimization, cloud gaming AI, and next-gen data center system architectures, and deliver strategic technical insights to support team and product strategy.
Mentor junior SA team members, standardize CSP technical engagement and solution delivery processes, and drive the capture and documentation of high-value technical best practices.

What we need to see:

Bachelor's/Master's/PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field; equivalent industry experience is highly valued.
8+ years of hands-on experience in GPU architecture, AI system optimization, large-scale data center infrastructure, or hyperscale cloud computing, with solid experience in AI training/inference, distributed computing or HPC workloads.
Deep understanding of GPU microarchitecture, CUDA programming model, GPU memory hierarchy and system scheduling mechanisms; proficient in performance profiling, bottleneck analysis and end-to-end AI workload tuning.
Strong programming proficiency in C/C++ and Python; familiar with CUDA kernels, compiler toolchains, AI framework optimization (PyTorch/TensorRT) and large-scale distributed system tuning.
Proven hands-on experience working with major Chinese CSPs or global hyperscalers, with in-depth knowledge of their public cloud AI service architectures, cluster operation mechanisms and core workload characteristics.
Excellent technical communication and presentation skills, capable of explaining complex GPU system and AI infra technologies to technical engineers, architecture teams and business stakeholders.
Strong cross-functional collaboration capability, able to work efficiently in a global matrix team and prioritize multiple high-value technical projects under fast-paced business demands.
Familiarity with NVIDIA full-stack products (GPU data center hardware, TensorRT-LLM, Dynamo, NCCL, CUDA software stack) is a significant plus.
Hands-on engineering capability is mandatory; candidates must be results-oriented, self-driven and able to independently own end-to-end technical project delivery.
Committed, proactive, and capable of sustaining high-quality technical output for long-term strategic CSP projects.

Ways to stand out from the crowd:

Hands-on experience with Vera/Grace CPU + GPU heterogen

Additional Information

As a Senior GPU & AI Infra Expert focusing on Cloud Service Providers (CSPs) in China, you will be a core technical pillar of NVIDIA's CSP SA team, responsible for driving GPU/AI Infra technical strategy, system-level solution optimization, and high-value customer engagement. You will work closely with major Chinese CSPs to address their critical needs in large-scale AI training/inference, Agentic AI, gaming AI, and distributed computing infrastructure. You will accelerate the mass deployment and maximize the performance of NVIDIA GPU software/hardware stacks, and bridge technical gaps between CSP workload iteration and NVIDIA's global engineering roadmap. This role requires deep expertise in GPU architecture, AI system optimization, cluster networking, and open-source AI infra contributions, with a strong ability to deliver high-value technical outcomes for hyperscale data center workloads.
