Senior Engineering Leader - AI Infrastructure and Inferencing

Gruve · Redwood City, CA

Full-time · On-site · 1mo ago

Cross-functional Collaboration · Documentation · Leadership · LLMs · Machine Learning · Observability

About the role

We're seeking an exceptional Senior Engineering Leader to build and lead a high-performing engineering team focused on the design and development of a distributed, multi-tenant AI inference SaaS platform. Platform development responsibilities include, but are not limited to, software design, development, and testing across multiple domains such as inference engines (AI/ML, program and compiler analysis), core platform services, and observability. This role sits at the intersection of systems engineering, AI/ML operations, and product development, requiring both deep technical expertise and proven leadership capabilities. As a leader at Gruve, you'll drive the technical vision and execution of critical infrastructure that enables our AI capabilities at scale. You'll work closely with cross-functional partners, including research scientists, product managers, and other engineering leaders, to deliver robust, performant systems that power our AI products. This position is based in the United States and reports to the SVP of Inferencing and Infrastructure Management.

Responsibilities

Team Leadership & Development: Build, mentor, and scale a world-class engineering team of 10-15+ engineers. Foster a culture of technical excellence, collaboration, and continuous learning. Conduct performance reviews, career development planning, and succession planning.
Technical Strategy & Architecture: Define and execute the technical roadmap for AI inference infrastructure, AI toolchains, and AI software development. Make critical architectural decisions that balance performance, scalability, maintainability, and cost.
Compiler Design & Optimization: Lead the development of AI inference systems and optimizations for AI workloads, including graph optimization, kernel fusion, and hardware-specific code generation to maximize inference performance.
AI Model Development & Deployment: Oversee the end-to-end lifecycle of AI models from development through production deployment, including model fine-tuning, quantization, distillation, and serving infrastructure.
Inference API & Platform Development: Drive the design and implementation of scalable, low-latency inference APIs and platforms that serve models reliably at production scale with strict SLA requirements.
Spec-Driven Development: Champion rigorous engineering practices including comprehensive technical specifications, design reviews, and documentation to ensure alignment and quality across complex projects.
Cross-Functional Collaboration: Partner effectively with research, product, and business stakeholders to translate requirements into technical solutions and communicate progress, trade-offs, and risks clearly.
Delivery & Execution: Own quarterly planning, roadmap prioritization, and on-time delivery of major initiatives. Establish metrics and KPIs to measure team performance and system health.

Requirements

10-15+ years of software engineering experience, with at least 5 years in engineering leadership roles managing teams of 5+ engineers
Proven track record of building and scaling high-performing engineering teams in high-growth technology companies
Deep expertise in systems programming languages (C++, Go, Rust, or similar) and architecture design
Strong background in AI model design, optimization, or adjacent systems-level programming (LLVM, MLIR, XLA, or similar frameworks)
Hands-on experience with AI/ML model development, training, and inference systems
Experience with model fine-tuning techniques and deployment optimization (quantization, pruning, etc.)
Demonstrated ability to design and build production-grade APIs and distributed systems
Strong understanding of spec-driven development processes and engineering best practices
Excellent communication skills with the ability to influence across all levels of the organization
Demonstrated ability to work effectively with teams across multiple time zones
Bachelor's degree in Computer Science, Engineering, or related technical field (or equivalent practical experience)

Preferred Qualifications

Master's or PhD in Computer Science, Machine Learning, or related field
Experience at leading AI/ML companies or research labs (OpenAI, Google DeepMind, Meta AI, Anthropic, etc.)
Direct experience with modern ML frameworks (PyTorch, JAX, TensorFlow) and their compilation stacks
Background in GPU programming (CUDA, Triton) and hardware acceleration for ML workloads
Experience with transformer architectures

Benefits

Health insurance
Vision insurance

Additional Information

About Gruve

Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. We specialize in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs). Our mission is to help our customers use their data to make more intelligent decisions in support of their business strategies. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.
