Senior Software Development Engineer, Stores Foundational AI - Rufus
Requirements
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one programming language
- 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor, tech lead or leading an engineering team
- Experience with vLLM, SGLang, TensorRT or similar platforms in production environments
- Experience with CUDA kernels or ML/low-level kernels
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
- Experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution
- Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
- USA, CA, Palo Alto - 193,300.00 - 261,500.00 USD annually
Additional Information
We are building foundational LLMs for Amazon Stores that fuse world knowledge with deep e-commerce understanding to power next-generation shopping experiences. These systems continuously learn from real-world customer interactions to become more helpful, personalized, and context-aware over time. We are looking for builders who are passionate about large-scale systems, AI innovation, and customer impact. You will work at the intersection of distributed systems, machine learning infrastructure, and science to bring frontier research, especially in post-training and reinforcement learning, into production at Amazon scale.

Key job responsibilities
- Architect and build scalable ML infrastructure powering LLM training and post-training workflows, including supervised fine-tuning, reinforcement learning, and continuous learning from live traffic
- Transform real-world customer interactions into high-quality training signals, enabling continuous model improvement and better customer experiences
- Build and optimize post-training and RL systems, including reward modeling, policy optimization, and data collection loops
- Drive experimentation and iteration velocity by building tooling and frameworks that enable rapid hypothesis testing, signal validation, and model quality improvements
- Partner closely with applied scientists to translate frontier techniques (e.g., RLHF, agentic workflows, multi-turn optimization) into reliable, production-grade systems
- Own systems end-to-end, including design, implementation, deployment, observability, and operational excellence
- Raise the engineering bar through technical leadership, design reviews, and mentorship, influencing best practices across the organization