Anthropic Fellows Program - Reinforcement Learning

Anthropic · Remote

Full-time · Remote

Responsibilities

4 months of full-time research
Direct mentorship from Anthropic researchers
Access to a shared workspace (in either Berkeley, California or London, UK)
Connection to the broader AI safety and security research community
Weekly stipend of 3,850 USD / 2,310 GBP / 4,300 CAD + benefits (these vary by country)
Funding for compute (~$15k/month) and other research expenses

Interview process

The interview process will include an initial application & reference check, technical assessments & interviews, and a research discussion.

Benefits

The expected base stipend for this role is 3,850 USD / 2,310 GBP / 4,300 CAD per week, with an expectation of 40 hours per week for 4 months (with possible extension).

Fellows workstreams

Due to the success of the Anthropic Fellows for AI Safety Research program, we are now expanding it across teams at Anthropic. We expect there to be significant overlap in the types of skills and responsibilities across the roles and will by default consider candidates for all the workstreams. Some of the workstreams may include unique assessment steps; we therefore ask you for workstream preferences in the application. You can see an overview of the current workstreams below:

AI Safety Fellows
AI Security Fellows
ML Systems & Performance Fellows
Reinforcement Learning Fellows
Economics & Societal Impacts Fellows

This page is specific to one of the Anthropic Fellows Workstreams, see also the main Anthropic Fellows posting.

Across the workstreams, you may be a good fit if you:

Are motivated by making sure AI is safe and beneficial for society as a whole
Are excited to transition into empirical AI research and would be interested in a full-time role at Anthropic
Have a strong technical background in computer science, mathematics, or physics
Thrive in fast-paced, collaborative environments
Can implement ideas quickly and communicate clearly

Strong candidates may also have:

Strong background in a discipline relevant to a specific Fellows workstream (e.g. economics, social sciences, or cybersecurity)
Experience in areas of research or engineering related to their workstream

Candidates must be:

Fluent in Python programming
Available to work full-time on the Fellows program

Mentors, research areas, & past projects

Fellows will undergo a project selection & mentor matching process. Potential research areas and mentors include:

Ruhua Jiang
Kaidi Cao
Sunny Duan
David Brandfonbrener
Colt Steele
Dino Distefano
Will Williams

Projects in this workstream may include:

Building model-based tools to better understand AI training data and improve training data quality
A research project to better understand generalization
Creating RL environments to improve Claude models at capabilities that are within your domai

Additional Information

About Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Apply using this link. We are accepting applications on a rolling basis for the next cohort of Anthropic Fellows, which is expected to start in late September. In some circumstances, we can accommodate fellows starting outside the usual cohort timelines - please note in your application if the September start date doesn't work for you.

Anthropic Fellows Program overview

The Anthropic Fellows Program is designed to foster AI research and engineering talent. We provide funding and mentorship to promising technical talent - regardless of previous experience. Fellows will primarily use external infrastructure (e.g. open-source models, public APIs) to work on an empirical project aligned with our research priorities, with the goal of producing a public output (e.g. a paper submission). In one of our earlier cohorts, over 80% of fellows produced papers. We run multiple cohorts of Fellows each year and review applications on a rolling basis. This application is for cohorts starting in July 2026 and beyond.

