Research Engineer - Evals

Firecrawl · San Francisco

$160K–$240K/yr · Full-time · Remote · 1mo ago

CI/CD · GitHub

Responsibilities

Run fast experiments and communicate clearly. You design experiments that test meaningful hypotheses, run them quickly, and make decisions based on results. When you have findings, anyone on the team can understand what they mean - no decoder ring required.

Requirements

Fast and clear. You'd rather run three rough experiments this week than one polished one next month. When you have results, anyone on the team can understand what they mean.

Benefits

Remote work options · Equity / stock options

Additional Information

Research Engineer - Evals

You'll build the evaluation systems that tell us whether Firecrawl actually works. That sounds simple. It isn't. Our core promise - convert any URL into clean, structured, LLM-ready data reliably - is hard to measure rigorously across millions of different websites, formats, and edge cases. As we layer in models and agent workflows, the question "did that work?" gets harder, not easier.

This isn't an eval role where you inherit a framework and run benchmarks. You'll design the metrics, build the pipelines, generate the datasets, and own the feedback loop from output quality back to model and product decisions. If you care about what "good" actually means and have the engineering depth to measure it, this is the role.

Salary Range: $160,000 to $240,000/year (Range shown is for U.S.-based employees in San Francisco, CA. Compensation outside the U.S. is adjusted fairly based on your country's cost of living.)
Equity Range: Up to 0.10%
Location: San Francisco, CA or Remote (Americas, UTC-3 to UTC-10)
Job Type: Full-Time
Experience: 3+ years in ML engineering, applied AI, or data quality - with production systems
Visa: US Citizenship/Visa required for SF; N/A for Remote

About Firecrawl

Firecrawl is the easiest way to extract data from the web. Developers use us to reliably convert URLs into LLM-ready markdown or structured data with a single API call. In just a year, we've hit 8 figures in ARR and 120k+ GitHub stars by building the fastest way for developers to get LLM-ready data.

We're a small, fast-moving, technical team building essential infrastructure superintelligence will use to gather data on the web. We ship fast and deep.
