Senior Engineering Leader - AI Infrastructure and Inferencing
About the role
We're seeking an exceptional Senior Engineering Leader to build and lead a high-performing engineering team focused on the design and development of a distributed, multi-tenant AI inference SaaS platform. Platform development responsibilities include, but are not limited to, software design, development, and testing across multiple domains such as inference engines (AI/ML, program and compiler analysis), core platform services, and observability. This role sits at the intersection of systems engineering, AI/ML operations, and product development, requiring both deep technical expertise and proven leadership capabilities. As a leader at Gruve, you'll drive the technical vision and execution of critical infrastructure that enables our AI capabilities at scale. You'll work closely with cross-functional partners, including research scientists, product managers, and other engineering leaders, to deliver robust, performant systems that power our AI products. This position is based in the United States and reports to the SVP of Inferencing and Infrastructure Management.
Responsibilities
- Team Leadership & Development: Build, mentor, and scale a world-class engineering team of 10-15+ engineers. Foster a culture of technical excellence, collaboration, and continuous learning. Conduct performance reviews, career development planning, and succession planning.
- Technical Strategy & Architecture: Define and execute the technical roadmap for AI inference infrastructure, AI toolchains, and AI software development. Make critical architectural decisions that balance performance, scalability, maintainability, and cost.
- Compiler Design & Optimization: Lead the development of AI inference systems and optimizations for AI workloads, including graph optimization, kernel fusion, and hardware-specific code generation to maximize inference performance.
- AI Model Development & Deployment: Oversee the end-to-end lifecycle of AI models from development through production deployment, including model fine-tuning, quantization, distillation, and serving infrastructure.
- Inference API & Platform Development: Drive the design and implementation of scalable, low-latency inference APIs and platforms that serve models reliably at production scale with strict SLA requirements.
- Spec-Driven Development: Champion rigorous engineering practices including comprehensive technical specifications, design reviews, and documentation to ensure alignment and quality across complex projects.
- Cross-Functional Collaboration: Partner effectively with research, product, and business stakeholders to translate requirements into technical solutions and communicate progress, trade-offs, and risks clearly.
- Delivery & Execution: Own quarterly planning, roadmap prioritization, and on-time delivery of major initiatives. Establish metrics and KPIs to measure team performance and system health.
Requirements
- 10-15+ years of software engineering experience, with at least 5 years in engineering leadership roles managing teams of 5+ engineers
- Proven track record of building and scaling high-performing engineering teams in high-growth technology companies
- Deep expertise in systems programming languages (C++, Go, Rust, or similar) and architecture design
- Strong background in AI model design, optimization, or adjacent systems-level programming (LLVM, MLIR, XLA, or similar frameworks)
- Hands-on experience with AI/ML model development, training, and inference systems
- Experience with model fine-tuning techniques and deployment optimization (quantization, pruning, etc.)
- Demonstrated ability to design and build production-grade APIs and distributed systems
- Strong understanding of spec-driven development processes and engineering best practices
- Excellent communication skills with ability to influence across all levels of the organization
- Demonstrated ability to work effectively with teams across multiple time zones
- Bachelor's degree in Computer Science, Engineering, or related technical field (or equivalent practical experience)
- Master's or PhD in Computer Science, Machine Learning, or related field
- Experience at leading AI/ML companies or research labs (OpenAI, Google DeepMind, Meta AI, Anthropic, etc.)
- Direct experience with modern ML frameworks (PyTorch, JAX, TensorFlow) and their compilation stacks
- Background in GPU programming (CUDA, Triton) and hardware acceleration for ML workloads
- Experience with transformer architecture
Additional Information
About Gruve
Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. We specialize in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs). Our mission is to support our customers' business strategies by helping them use their data to make more intelligent decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.