Research Engineer - Reinforcement Learning

Firecrawl · San Francisco
$180K–$290K/yr · Full-time · Remote · 3mo ago
GitHub · Leadership · Reinforcement Learning


Responsibilities

  • Run fast experiments and iterate. You design experiments that test meaningful hypotheses, run them quickly, and make decisions based on results. You don't spend weeks on experiment infrastructure before getting a single result. Speed of iteration is a core part of how you work.
  • Communicate clearly to non-RL people. RL can be opaque. You translate your work into language that engineers, product people, and leadership can understand and act on. You know how to explain why a reward function matters without requiring everyone to read the paper.
  • Collaborate closely with the team. Work directly with the Search/IR-focused Research Engineer and the engineering team to connect RL improvements with search, ranking, and the broader product roadmap.

Requirements

  • Production-minded. You care about whether your models work in production, not just on benchmarks. You've deployed models that serve real traffic and made hard tradeoffs between model quality, latency, and cost. Research that doesn't ship isn't research that matters here.
  • Runs fast experiments and communicates clearly. You'd rather run three rough experiments this week than one polished one next month. When you have results, anyone on the team can understand what they mean - no decoder ring required.
  • Backgrounds that tend to do well: RL engineers at AI labs or applied ML teams who've shipped models to production. Researchers who've done RLHF or reward modeling for LLM systems.

Benefits

  • Remote work options
  • Equity / stock options

Additional Information

You'll bring reinforcement learning to Firecrawl's core product - building the training infrastructure, reward pipelines, and fine-tuning systems that make our models meaningfully better at extracting, understanding, and structuring web data. This isn't theoretical RL research. You'll build your own training infra, run fast experiments, ship models to production, and bridge the gap between classical RL approaches and modern LLM agent systems. If you care as much about training throughput as you do about reward design, this is the role.

  • Salary Range: $180,000 to $290,000/year (Range shown is for U.S.-based employees in San Francisco, CA. Compensation outside the U.S. is adjusted fairly based on your country's cost of living.)
  • Equity Range: Up to 0.15%
  • Location: San Francisco, CA or Remote (Americas, UTC-3 to UTC-10)
  • Job Type: Full-Time
  • Experience: 3+ years in applied RL, ML engineering, or model training - with production systems
  • Visa: US Citizenship/Visa required for SF; N/A for Remote

About Firecrawl

Firecrawl is the easiest way to extract data from the web. Developers use us to reliably convert URLs into LLM-ready markdown or structured data with a single API call. In just a year, we've hit 8 figures in ARR and 120k+ GitHub stars by building the fastest way for developers to get LLM-ready data. We're a small, fast-moving, technical team building essential infrastructure superintelligence will use to gather data on the web. We ship fast and deep.


