Design and build the pipelines that generate synthetic tasks and evaluation environments for AI model training - this is the factory floor of AI development, producing training fuel for next-generation models, not the models themselves
Architect the workflows where AI and humans work together in the loop - deciding what gets automated, what requires human intervention, how state is preserved across handoffs, and how the whole system stays reliable at scale
Own and lead the most complex system design discussions - produce one-page technical scoping documents that surface hidden risks before development begins, define technology stacks, and establish engineering guidelines that let the team move fast without breaking things
Rapidly assess whether a technical idea is worth building - get early signal, align stakeholders, and kill or accelerate accordingly
Partner closely with research, operations, and data teams - juggle multiple workstreams, make smart tradeoff decisions as priorities shift, and translate ambiguous business needs into concrete technical architecture
Build reusable frameworks and engineering guidelines that strengthen the team's collective execution muscle
Requirements
8+ years of software engineering experience with a track record of owning complex systems end-to-end
A software engineering foundation first - you think in systems, architecture, and engineering tradeoffs, not in models and experiments
Production experience building and shipping agentic workflows, multi-agent orchestration, HITL pipelines, and LLM-powered applications with measurable business outcomes - RAG, vector stores, semantic search, and multi-model LLM stacks in production, not just demos
Battle-tested context engineering practices - you reason clearly about the limits of AI and architect around them
Experience with distributed systems architecture applied to AI or data platforms - reliable, observable, and scalable systems built in service of a product
Daily proficiency with agentic coding tools (Claude Code, Cursor, or equivalent) - you use these to multiply your output, not pad it
A track record of operating in ambiguity - shipping fast, pivoting when wrong, and moving on without ego
Exceptional written and verbal English communication skills - you can lead a design discussion, push back on stakeholders, and document architecture clearly. Communication cannot be a bottleneck.
Experience at an AI data company (Scale AI, Surge, Snorkel, Labelbox, or similar) - particularly building synthetic data pipelines, eval environments, or task generation systems. This is the dream background.
Experience building human data labeling interfaces, annotation workflows, or data collection pipelines
Familiarity with preference data and reward models used in AI model training (RLHF, RLVR, or similar)
Proficiency with our stack: Python, TypeScript, AWS, GCP, Terraform, Temporal Cloud, containerization, LLM gateways, RAG frameworks, and data pipeline tooling
Ability to apply data structures and algorithms when designing AI/LLM solutions
Ability to reason about requirements with a bias for Essentialism
Additional Information
About Pareto
Humanity is in a virtuous cycle: human insight improves AI, and better AI expands what people can do. Sustaining it depends on the one input that can't be automated: expert human judgment.
At Pareto, we build the platform that turns that judgment into the data, evals, and RL environments frontier models learn from. We work with leading frontier labs like Anthropic and GDM, and we give skilled people everywhere a way to shape the future of AI and share in what it creates.
This RL environment and human-data infrastructure is already in production. Our job now is to scale it.
You'll be joining an applied AI team that makes the company run better as it grows. As the business scales, manual work piles up - this team steps in and fixes that, building agentic workflows, automation pipelines, and smart systems that handle the complexity so people don't have to. You succeed when the business can grow without things breaking or slowing down.