Machine Learning Research Engineer, GenAI Applied ML

Scale AI · San Francisco, CA

$190K–$237K/yr · Full-time · On-site · 1mo ago

AWS · Excel · GCP · LLMs · Microservices · Prototyping

About the role

At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.

We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.

We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at accommodations@scale.com. Please see the United States Department of Labor's Know Your Rights poster for additional information.

We comply with the United States Department of Labor's Pay Transparency provision.

PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities

Requirements

Experience prototyping agent evaluation/reliability systems
Human-in-the-loop or annotation pipeline work
Open-source contributions in agents, evaluation, or alignment
Publications on agent reliability (NeurIPS, ICML, ICLR)
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is:
$189,600 - $237,000 USD
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.

Benefits

Health insurance
Dental insurance
Vision insurance
Paid time off
Equity / stock options

Additional Information

About This Role

Lead applied ML engineering on Scale's Applied ML team, powering data infrastructure for leading agentic LLMs (ChatGPT, Gemini, Llama). You will build scalable multi-agent systems to validate agentic reasoning and behaviors, scale human expertise, and drive research into why agents fail in real-world use despite strong benchmark results, shipping production fixes for those failures. Ideal for exceptional engineers with deep research rigor and a relentless focus on practical, high-impact systems. You will iterate rapidly with data, leverage AI tools to accelerate development, and collaborate tightly across engineering, product, and research. If you excel at turning frontier agent research into reliable deployed systems, we want to hear from you.

You will:

Build and deploy multi-agent systems for agentic reasoning validation
Develop pipelines to detect errors and scale human judgment
Combine classical ML, LLMs, and multi-agent techniques for reliability
Lead research into agent failure modes and ship fixes
Use AI tools to speed prototyping and iteration
Build data-driven evaluations and deploy rapid improvements
Integrate systems into Scale's platform

Ideally You'll Have:

PhD or MSc in Computer Science, Mathematics, Statistics, or related field
3+ years shipping scaled production ML systems
Demonstrated real-world impact
Mastery of PyTorch, TensorFlow, JAX, or scikit-learn
Deep expertise in agentic LLMs and multi-agent systems
Strong software engineering and microservices (AWS/GCP)
Rapid, data-driven iteration
Proficiency using AI tools to accelerate work
Strong research depth with practical bias
Excellent cross-functional communication
