Software Engineer, Inference

Pulse · San Francisco
$150K–$230K/yr · Full-time · On-site · 10mo ago
Caching · Capacity Planning · Computer Vision · NLP · Python

About the role

Pulse is tackling one of the most persistent challenges in data infrastructure: extracting accurate, structured information from complex documents at scale. We have a breakthrough approach to document understanding that combines intelligent schema mapping with fine-tuned extraction models where legacy OCR and other parsing tools consistently fail.

We are a small, fast-growing team of engineers in San Francisco powering Fortune 100 enterprises, YC startups, public investment firms, and growth-stage companies. We are backed by tier 1 investors and growing quickly.

What makes our tech special is our multi-stage architecture:

  • Layout understanding with specialized component detection models
  • Low-latency OCR models for targeted extraction
  • Advanced reading-order algorithms for complex structures
  • Proprietary table structure recognition and parsing
  • Fine-tuned vision-language models for charts, tables, and figures

If you are passionate about the intersection of computer vision, NLP, and data infrastructure, your work at Pulse will directly impact customers and shape the future of document intelligence.

Specialize in low-latency, high-throughput inference for OCR and multimodal models. Own profiling, batching, and autoscaling across single-tenant and multi-tenant environments.

Responsibilities

  • Build inference services with smart batching and caching
  • Optimize kernels, tokenization, and model graphs
  • Evaluate vLLM, TensorRT-LLM, and Triton tradeoffs
  • Implement autoscaling and admission control with clear SLOs
  • Own performance dashboards and capacity planning

Requirements

  • 5 days in-office at our San Francisco office
  • Eager to learn and adapt quickly
  • Prior startup or founding experience is a plus
  • 3+ years in performance engineering or ML systems
  • Strong Python, plus C++ or CUDA exposure
  • Experience with GPU profiling and model serving
  • Experience reducing p95 latency and cost in production ML systems

Sponsorship

Sponsorship available.

Compensation and benefits

Competitive base salary plus equity, performance-based bonus, relocation assistance for Bay Area moves, daily meal stipend, medical, vision, and dental coverage.

Benefits

  • Dental insurance
  • Vision insurance
  • Equity / stock options
  • Performance bonus
