Software Engineer, Web Crawling

Exa · San Francisco, CA
Full-time · On-site · 11mo ago
Design Systems · HubSpot · JavaScript · Playwright · TypeScript · Vector Databases

Requirements

  • You have extensive experience building and scaling web crawlers, or would be excited to ramp up very quickly
  • You have experience with a high-performance language (C++, Rust, etc.)
  • You are familiar with TypeScript, Playwright, modern web design, CDP (Chrome DevTools Protocol)
  • You're comfortable optimizing a system to an exceptional degree
  • You care about the problem of finding high quality knowledge and recognize how important this is for the world

What You Could Do

  • Build a distributed crawler that can handle 100M+ pages per day
  • Optimize crawl politeness and rate limiting across thousands of domains
  • Design systems to detect and handle dynamic content, JavaScript rendering, and anti-bot measures
  • Create intelligent crawl scheduling and prioritization algorithms for maximum coverage efficiency

Logistics

  • Location: This is an in-person opportunity in San Francisco.
  • Visas: We're happy to sponsor international candidates (e.g., STEM OPT, OPT, H1B, O1, E3). While we cannot guarantee your visa, we have historically been successful in sponsoring candidates from all over the world. If you receive an offer, our team will work hard to get you a visa.
  • Benefits: We offer premium healthcare benefits (medical, dental, vision), fertility benefits, 16 weeks of fully paid parental leave for all new parents, and a monthly wellness stipend to all of our employees.

Benefits

Health insurance · Dental insurance · Vision insurance · Parental leave

Additional Information

Exa is an applied AI lab building a search engine unlike anything the world has ever seen. We build massive-scale infrastructure to crawl the entire web, train state-of-the-art embedding models to process it, and design high-performance vector databases to run retrieval over it. We now power search for Cursor, Cognition, HubSpot, and over 400,000 developers, and have raised $350M from Lightspeed, Benchmark, and a16z.

Our ultimate goal is to build perfect search over all the world's information, far beyond Google. If you want to build massive-scale ML systems that will define the way the new AI world consumes information, this is the place for you.

As a Web Crawling engineer, you'd be responsible for crawling the entire web. In short, you'd build Google-scale crawling!