Principal Engineer, Systems Design Engineering

Sandisk · Seoul, South Korea
Full-time · On-site · 3mo ago
Compliance · Negotiation · Observability · SAFe · System Design

Responsibilities

  • Own system-level PCIe Gen5/Gen6 architecture from an NVMe SSD endpoint perspective
  • Define and review PCIe + NVMe integration across SSD products
  • PHY + MAC IP review, integration requirements and constraints
  • SoC/ASIC integration: clocks, resets, power domains, straps, lane mapping, sidebands
  • PCIe SFR + FW guidelines: flow control, LTSSM observability, power states, error handling
  • Link & low-power transitions: DLRM, L1, L1SS, L0p, ASPM, clock-down, APST coordination
  • Bring-up + debug: enumeration, speed negotiation, width detection, stability, AER/error recovery
  • Customer requirement tuning: latency/power, performance, reliability and consistency
  • Provide deep expertise in PCIe configuration and extended capability registers, including:
      o Link, power management, MSI/MSI-X, AER, BARs, L1SS
  • Lead platform bring-up and debug:
      o Enumeration, link training, speed negotiation, power states, error handling
  • Act as the technical authority for cross-team and customer escalations

Detailed Responsibilities (End-to-End PCIe for NVMe SSD)

PHY/MAC IP Review (System Design Perspective)

  • Understand criteria for PHY/MAC/controller IP:
      o Gen5/Gen6 readiness, equalization capability, margining hooks, lane mapping flexibility
      o SRNS/SRIS tolerance, clocking modes, power management support
      o Observability: LTSSM state visibility, error counters, replay/NAK stats, equalization telemetry
  • Review IP documents:
      o Reset sequences, compliance features, link speed change support
      o L1SS behavior, CLKREQ#/REFCLK control expectations
      o AER robustness, surprise down handling, hot/warm reset behavior
  • Specify platform-facing requirements:
      o Retimer/redriver compatibility assumptions (backplane/adaptor/cables)

ASIC/SoC Integration Ownership

  • Integrate PCIe subsystem with:
      o Clocking: REFCLK handling, clock request gating, clock-down sequences
      o Resets: PERST# behavior, internal resets, warm/hot resets, FLR support as applicable
      o Power domains: retention strategies, wake sources, D-state coordination
      o Sidebands: WAKE#, CLKREQ#, presence detect patterns (platform dependent)
  • Define lane policy:
      o x4 typical NVMe, lane reversal/polarity, width detection & recovery from degraded width

PCIe SFR / Register + FW Design Guidelines

  • Define a clean SFR map that FW uses for:
      o LTSSM control/observability (state, substate, timers, retries)
      o Link speed/width control and status (negotiated vs target)
      o Low-power triggers: ASPM enable/disable, L1SS policy, L0p policy (if implemented)
      o Clock request & clock gating behavior (safe entry/exit rules)
      o Error logging counters (replay, NAK, ECRC, timeout, malformed TLPs)
      o Recovery controls: link disable/enable, retrain, directed speed change, error clear policy
  • Provide FW runbooks:
      o "What to do when": training fails, width reduces, speed fallback, AER floods
      o Safe sequencing across power modes and APST transitions

Link Bring-Up & Transitions (Sequence Ownership)

You'll own/define the exact sequencing rules for:

  • Enumeration readiness
      o Ensure config space stability, BAR mapping correctness, MSI/MSI-X readiness timing
  • Speed negotiation / Directed Speed Change
      o When to allow Gen5/Gen4 fallback; policy for stability vs performance
  • Width detection & recovery
      o Handling degraded width events (x4 → x2) and reporting/telemetry
  • Link power management
      o ASPM policy and its constraints with NVMe latency targets
      o L1 entry/exit triggers and guard timers
      o L1 Substates (L1.1/L1.2) enablement conditions, wake sources, and clock requirements
      o DLRM handling (as applicable to platform/system) with safe NVMe readiness on resume
      o L0p (if supported) and interaction with performance bursts
  • Clock down / clock request
      o Define clock request gating conditions, and safe "no transactions in flight" criteria
  • NVMe APST alignment
      o Coordinate NVMe power states (APST) with PCIe L-states so you don't create:
          § long resume latencies (client)
          § link instability under load (enterprise)

Platform Interoperability

  • Own differences across laptop and server:
      o Client: aggressive power policies, fast resume, frequent idle entry/exit, D3hot/cold patterns
      o Enterprise: stable performance, high queue depth, error containment, hot-plug-ish behaviors on some platforms
  • Validate across:
      o Multiple root complexes, BIOS implementations, OS stacks

Required

  • Master's in Embedded Systems/VLSI/Microelectronics, or Bachelor's (B.E./B.Tech) in Electronics & Communications, Electrical & Electronics, or Computer Science
  • 8-14 years of experience semicondu

Benefits

Paid time off

Additional Information

Role Summary: Own the end-to-end PCIe system design for an NVMe SSD product line across client laptops and enterprise servers, from PHY/MAC review through ASIC/SoC integration, PCIe SFR/register analysis, and firmware design guidelines for robust link training, link transitions, and low-power behavior. This role sits at the intersection of PCIe spec compliance, NVMe behavior, FW architecture, platform interoperability, and power/performance tuning.


Interested in this role?

Apply on the company's website.