Sr. Staff Performance Tuning Engineer (CPU PMU & Virtualization)
Responsibilities
- Performance Analysis and Optimization
- Identify and analyze performance bottlenecks across CPU, memory hierarchy, cache, interconnect, and interrupt subsystems.
- Leverage hardware PMU counters to measure and interpret cycles, instructions, cache misses, branch mispredictions, TLB misses, and stall cycles.
- Compute and analyze performance metrics such as IPC, CPI, MPKI, memory bandwidth utilization, and stall breakdowns.
- Optimize system software, applications, and drivers for throughput, latency, and performance determinism.
- Virtualization Performance Tuning
- Profile and reduce hypervisor overhead, including VM exits, interrupt injection, and stage-2 MMU translation costs.
- Optimize vCPU placement strategies, huge page usage, and IRQ affinity for performance-sensitive workloads.
- Analyze and tune virtio and IOMMU performance paths.
- Investigate cross-core and cross-cluster latency issues, including IPI overhead and interrupt routing on ARM-based systems.
- CPU PMU and Microarchitectural Analysis
- Configure and interpret ARM PMUv3 and related vendor-specific performance monitoring extensions.
- Apply Top-Down Microarchitecture Analysis (TMA) to identify frontend/backend bottlenecks.
- Correlate PMU-derived metrics with kernel traces, source code paths, and system behavior.
- Design and implement custom event sets tailored to workload-specific performance characterization.
- Required Qualifications
- Bachelor's, Master's, or Ph.D. in Computer Science, Electrical Engineering, or a related field.
- 8+ years of experience in system performance engineering or related roles.
- Strong understanding of computer architecture, including:
  - CPU pipelines, execution engines, and memory hierarchy
  - Interrupt architectures (e.g., GICv3/GICv4, APIC concepts)
  - MMU, virtual memory, and page table structures
- Deep expertise with Linux performance tooling:
  - perf
  - eBPF / BCC / bpftrace
  - ftrace / trace-cmd
- Proficiency in C/C++, Python, and shell scripting for performance tooling and automation.
Requirements
- Hands-on experience with ARM PMUv3 on Cortex-A55/A76/A78 or Neoverse platforms.
- Experience with one or more virtualization stacks:
  - KVM/QEMU
  - Xen
  - seL4 or other RTOS/hypervisor environments
Additional Information
About NIO
NIO is a pioneer and a leading company in the premium smart electric vehicle market. Founded in November 2014, NIO's mission is to shape a joyful lifestyle. NIO aims to build a community starting with smart electric vehicles to share joy and grow together with users.
NIO designs, develops, jointly manufactures and sells premium smart electric vehicles, driving innovations in next-generation technologies in autonomous driving, digital technologies, electric powertrains and batteries. NIO differentiates itself through its continuous technological breakthroughs and innovations, such as its industry-leading battery swapping technologies, Battery as a Service (BaaS), as well as its proprietary autonomous driving technologies and Autonomous Driving as a Service (ADaaS).
NIO's product portfolio consists of:
- the ES8, a six-seater smart electric flagship SUV
- the ES7 (or the EL7), a mid-large five-seater smart electric SUV
- the ES6, a five-seater all-round smart electric SUV
- the EC7, a five-seater smart electric flagship coupe SUV
- the EC6, a five-seater smart electric coupe SUV
- the ET7, a smart electric flagship sedan
- the ET5, a mid-size smart electric sedan