Senior AI Software Engineer
About the role
Ruby Labs is a leading tech company that creates and operates innovative consumer products. We offer a diverse range of opportunities across the health, education, and entertainment industries. Our innovative teams are driving the future of consumer-led products, and we're always looking for passionate individuals to join us. Learn more about our story at: https://rubylabs.com/about-us/
At Ruby Labs we are looking for a Senior AI Engineer to own and drive the quality, reliability, and evolution of our AI systems in production. This is a high-ownership role. You will be responsible for end-to-end delivery of major AI features, production stability of AI systems, and data-driven experimentation using tools like Langfuse, Mixpanel and OpenRouter.
You'll work in a modern stack built on Next.js, TypeScript, Node.js, and Redis, collaborating closely with product, growth, data, and billing teams. Increasingly, this includes building agentic, tool-using AI systems - defining clean tool contracts (including MCP-based tools) and orchestrating how AI interacts with internal services and business systems.
Our engineering organization uses a squad-based structure. You will operate within an AI engineering squad, contributing as a senior technical voice and driving engineering quality within your area of the product.
Responsibilities
- Take complete ownership of major AI engineering features and deliver them within agreed timelines.
- Own AI output quality, structure, and predictability across all user-facing AI interactions.
- Design, implement, and maintain output-type-based AI systems, including segmentation, routing, and enforcement.
- Ensure consistent output structure and formatting across different LLMs for the same request type.
- Integrate and orchestrate multiple LLM providers via OpenRouter, managing model selection, fallback strategies, and cost optimisations.
- Design and orchestrate tool-using and agentic AI workflows, defining clean tool contracts (including MCP-based tools), function-calling interfaces, and reliable AI-to-system integrations.
- Build and maintain complex, multi-step LLM workflows, including those built with orchestration frameworks such as LangChain or LlamaIndex, for advanced reasoning, context reuse, and retrieval.
- Design and manage production prompt systems with dynamic prompting, context injection, and conditional logic.
- Own the deployment and release of LLM experiments, prompt management, and Langfuse-based evaluation pipelines.
- Run A/B tests across models, analyse results, and present data-driven impact assessments of AI features and experiments.
- Monitor AI system metrics, quality signals, latency, and release health using Langfuse and other observability tools.
- Deep-debug complex LLM chains using Langfuse traces, identifying bottlenecks and optimising for cost, latency, and context-window usage, and build output-scoring systems to root-cause hallucinations and logic errors.
- Write clean, scalable, and maintainable TypeScript code across the Next.js and Node.js stack.
- Build reliable backend logic for AI systems, with strong error handling, request validation, fallback flows, and predictable behaviour in production, including reliable tool execution and AI-to-service integrations.
- Ensure high code quality through testing, code reviews, and clear engineering standards.
- Monitor, troubleshoot, and improve production performance, reliability, and system health.
- Drive maintainability and technical quality through solid architecture, refactoring, and disciplined release practices.
Requirements
- 6+ years of backend/full-stack software engineering experience, including production-grade TypeScript/Node.js. Experience with Next.js and/or Python is a plus.
- 2+ years of experience building AI/LLM systems in production. Less experience may be considered for exceptional candidates.
- Deep hands-on experience working with LLM APIs (OpenAI, Anthropic, or similar) in production environments.
- Experience with agentic AI, multi-agent orchestration, tool-based workflows (function calling/tool execution), and/or RAG pipelines, including indexing, retrieval, and re-ranking.
- Experience with LLM observability tools such as Langfuse, LangSmith, or similar platforms.
- Experience with AI gateways and model routing solutions, such as OpenRouter or equivalent technologies.
- Solid understanding of Redis and relational databases, such as PostgreSQL.
- Exceptional ownership mindset and personal responsibility for engineering quality and delivery.
- Experience with AI-centered development tools such as Cursor, Claude Code, Windsurf, or similar platforms.
- Familiarity with evaluation frameworks, including LLM-as-a-judge, RAGAS, or similar approaches.
- Experience working in high-pressure startup environments with rapid product iteration cycles.
- Experience with MCP (Model Context Protocol), including building MCP servers/clients or designing tool contracts for AI agents.
- Experience with edge a
Benefits