Senior GenAI Engineer
About the role
We've put AI agents in front of CookUnity members. Our AI Nutritionist talks to people, reasons over our menu and their goals, and takes real actions on their behalf, like building a cart for the week. It runs in production today, and it's the first of several agents we plan to ship. We're hiring a Senior AI Engineer to own the technical direction of the platform underneath those agents: the runtime, tools, memory, guardrails, evaluation, and observability they all rely on.
This is a hands-on role, not an oversight one. You'll build agents the whole way through, from a rough prototype to the production runtime serving members to the Terraform that deploys it. You'll also be thinking a few agents ahead, taking the parts that work and turning them into reusable building blocks so the team stops rewriting the hard stuff every time.
We care about your judgment with LLMs and agents much more than any framework on your CV. Frameworks come and go. The hard parts (grounding, tool design, memory, safety, evaluation, cost, latency) stay.
Responsibilities
- Own agents end to end. Take a feature from prototype to production: orchestrator and sub-agent design, the tools the agent calls, system prompts, memory, and the response contract the frontend renders from. You write the code that ships.
- Make the tools trustworthy. Build the tool layer agents depend on, like search grounded in our real catalog, retrieval and reranking, and cart and account actions. Keep credentials and member identity out of anything the model can control.
- Own safety. Build the layered safety model: input and output guardrails, intent and clarification handling, refusals, and PII boundaries. Decide what gets hard-enforced and what the agent handles in its own reasoning. Nutrition advice raises the stakes here, so this matters.
- Make quality measurable. Push our evaluation work forward: structured checks plus LLM-as-judge, with a review queue for the cases the judges disagree on. If we can't measure whether a prompt, model, or tool change helped, we don't ship it.
- Instrument it. Make agents debuggable in production with per-session and per-turn timelines, tool and guardrail traces, and token and cost visibility. When an answer looks wrong, someone should be able to see why in minutes.
- Turn it into a platform. Take the patterns that work and make them reusable, so the next agent and the next engineer inherit the runtime conventions, the eval scaffolding, and the guardrail defaults instead of starting over.
- Make the team better. Set technical direction across the agent codebases and infra, and keep design and code review sharp. Help product and data partners work out when an agent is the right answer, and when it isn't.
What Success Looks Like
- The agents in front of members get measurably better, more grounded and safer at a lower cost per turn, and we can show it in the evals instead of arguing about it.
- Shipping a new agent capability costs a fraction of what it used to, because the runtime, memory, guardrail, and eval patterns are reusable.
- Quality and safety regressions get caught in evaluation and observability before members feel them.
- Other engineers reach for your patterns by default and get better from how you review and design.
Minimum Requirements
- Real production experience building with LLMs and agents. This is the one hard requirement.
- Good judgment on the hard parts: grounding and retrieval, tool and context design, memory, cost and latency, safety, and how to tell whether any of it is working.
- You can look at one working agent and see the reusable pattern in it, and you know when not to over-engineer.
- Strong Python, plus enough range across APIs, cloud, and infrastructure-as-code to own a feature from the model call down to the deploy.
- A track record of setting technical direction and making the engineers around you better.
Additional Information
About CookUnity: Food has lost its soul to modern convenience. And with it, it has lost the power to nourish, inspire, and connect us. So in 2018, CookUnity was founded as the first-of-its-kind platform that connects the world with the source of truly great food: chefs. Today, CookUnity delivers 50 million meals a year from the industry's best chefs to homes all over the country. Fresh. Ready-to-eat. And crafted with the passion that nourishes body and soul. Unwilling to stop there, CookUnity is expanding beyond delivery to become an ever-innovating marketplace focused on our singular mission: empower Chefs to nourish the world. If that mission has you hungry in more ways than one, you've found the right job posting.