Principal Engineer, AI Serving Framework Architect (Software)

Samsung Semiconductor · San Jose, CA

Full-time · On-site · 2w ago

IoT · Python · PyTorch · RAG · REST

Responsibilities

Leading research teams in Korea as Tech Lead and proposing technical direction
Researching dynamic scheduling methodologies to maximize AI inference performance in multi-rack-scale, memory-centric systems composed of heterogeneous compute-capable memory and hierarchical memory
Investigating methods to accelerate search operations in the vector DBs used by RAG and the knowledge graphs used by AI agents by leveraging compute-capable memory
Studying strategies for optimally placing the KV cache and the vector DB in hierarchical memory to minimize SSD accesses and reduce IO stalls
Proposing software designs for implementing the derived optimization algorithms on open-source platforms such as vLLM

What You Bring

PhD in Computer Science or a related field with 10+ years of experience in AI serving frameworks for large-scale computing, with a focus on AI workloads.
Led a project to build and optimize a Large Language Model (LLM) Inference Software Stack on a multi-rack scale system to deliver AI Inference services to over 100,000 users.
Extensive experience in designing AI Inference Software Stacks for heterogeneous devices.
In-depth understanding of the internal architecture and operation mechanisms of inference engines such as vLLM.
Proficiency in AI Inference System Profiling and optimization.
Knowledge and practical experience with future AI workloads, including reasoning models, multi-modal solutions, AI agents, and world models.
Strong understanding of compute, memory, and networking bottlenecks in AI systems.
Required skillsets: PyTorch, Python, and C++
A collaborative mindset, curiosity, and resilience in solving complex challenges.
Excellent verbal, presentation, and written communication skills.
(Nice to have) Native or fluent Korean speakers are preferred.
You're inclusive, adapting your style to the situation and diverse global norms of our people.
You approach challenges with curiosity and resilience, seeking data to help build understanding.
You're collaborative, building relationships, humbly offering support and openly welcoming approaches.
Innovative and creative, you proactively explore new ideas and adapt quickly to change.

Benefits

Give Back: With a charitable giving match and frequent opportunities to get involved, we take an active role in supporting the community.
Enjoy Time Away: You'll start with 4+ weeks of paid time off a year, plus holidays and sick leave, to rest and recharge.
Care for Family: Whatever family means to you, we want to support you along the way, including a stipend for fertility care or adoption, medical travel support, and virtua
Dental insurance
Vision insurance
401(k)
Flexible schedule

Additional Information

Please Note: To provide the best candidate experience amidst our high application volumes, each candidate is limited to 10 applications across all open jobs within a 6-month period.

Advancing the World's Technology Together

Our technology solutions power the tools you use every day, including smartphones, electric vehicles, hyperscale data centers, IoT devices, and so much more. Here, you'll have an opportunity to be part of a global leader whose innovative designs are pushing the boundaries of what's possible and powering the future. We believe innovation and growth are driven by an inclusive culture and a diverse workforce. We're dedicated to empowering people to be their true selves. Together, we're building a better tomorrow for our employees, customers, partners, and communities.

Job Title: Principal Engineer, AI Serving Framework Architect (Software)

The Architecture Research Lab (ARL) focuses on addressing fundamental system-level bottlenecks in modern AI, particularly in memory capacity/bandwidth and system-scale communication. By leveraging Samsung's world-class memory technologies, ARL explores and defines next-generation AI system architectures that deliver step-function improvements in performance, efficiency, and scalability.

We are seeking a Principal AI System Architect who will play a key role in bridging AI workloads, system architecture, and hardware design. In this role, you will develop system-level performance models, drive architecture-level design decisions, and propose forward-looking AI system architectures that shape Samsung's long-term AI platform strategy.

Location: Daily onsite presence at our San Jose office in alignment with our Flexible Work policy
Job ID: 42853
