Machine Learning Infrastructure Engineer

Character · Redwood City, CA
$150K–$350K/yr · Full-time · Remote · 14mo ago
Kubernetes · PyTorch · TensorFlow

About the role

We're looking for seasoned ML Infrastructure engineers with experience designing, building and maintaining training and serving infrastructure for ML research.

Responsibilities

  • Provide infrastructure support to our ML research and product
  • Build tooling to diagnose cluster issues and hardware failures
  • Monitor deployments, manage experiments, and generally support our research
  • Maximize GPU allocation and utilization for both serving and training

Requirements

  • 4+ years of experience supporting the infrastructure within an ML environment
  • Experience in developing tools used to diagnose ML infrastructure problems and failures
  • Experience with cloud platforms (e.g., Compute Engine, Kubernetes, Cloud Storage)
  • Experience working with GPUs
  • Experience with large GPU clusters and high-performance computing/networking
  • Experience with supporting large language model training
  • Experience with ML frameworks like PyTorch/TensorFlow/JAX
  • Experience with GPU kernel development

About Character.AI

In just two years, we achieved unicorn status and were honored as Google Play's AI App of the Year, a testament to our innovative technology and visionary approach.

Join us and be a part of establishing this new entertainment paradigm while shaping the future of Consumer AI!

Benefits

Vision insurance

Interested in this role?

Apply on the company's website.
