
Senior Software Engineer - AI/ML, AWS Neuron Inference

Annapurna Labs (U.S.) Inc. · Seattle, WA
Full-time · On-site · Posted 1mo ago
Java · AWS · Machine Learning · PyTorch


About the role

Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help you develop your engineering expertise so you feel empowered to take on more complex tasks in the future.

Requirements

  • 5+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • 5+ years of programming using a modern programming language such as Java, C++, or C#, including object-oriented design experience
  • Knowledge of the fundamentals of machine learning models, their architectures, and their training and inference lifecycles, along with work experience optimizing model performance
  • Master's degree in computer science or equivalent
  • Hands-on experience with PyTorch or JAX, preferably involving developing and deploying LLMs in production on GPUs, Neuron, TPUs, or other AI acceleration hardware

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

USA, WA, Seattle: 168,100.00 - 227,400.00 USD annually

Additional Information

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators. This role is for a senior software engineer on the Machine Learning Inference Applications team, responsible for developing and optimizing the performance of the core building blocks of LLM inference: attention, MLP, quantization, speculative decoding, Mixture of Experts, and others. The team works side by side with chip architects, compiler engineers, and runtime engineers to deliver performance and accuracy on Neuron devices across a range of models.

Key job responsibilities

Responsibilities of this role include adapting the latest research in LLM optimization to Neuron chips to extract the best performance from both open-source and internally developed models. Working across teams and organizations is key.

