Lead Machine Learning Engineer - 8+ years (Individual Contributor)
Responsibilities
- Build reliable, production-grade services and APIs that serve models for both internal and external products.
- Automate training, evaluation, deployment, rollback, monitoring, and retraining workflows.
- Improve latency, throughput, reliability, and cost efficiency of inference systems.
- Profile and optimize model execution using batching, caching, parallelism, quantization, and architecture-aware improvements.
- Improve engineering rigor and quality through testing, CI/CD, observability, reproducibility, and incident response.
- Collaborate with product, platform, and software teams to turn ambiguous business problems into production ML systems.
- 8+ years of experience in software engineering, machine learning engineering, or ML infrastructure.
- Strong experience building and operating production ML systems as a self-directed owner.
- Deep expertise in Python and solid backend engineering fundamentals, including APIs, distributed systems, testing, debugging, and operational ownership.
- Proven track record building production data or ML systems on AWS.
- Comprehensive understanding of serving tradeoffs such as latency, throughput, autoscaling, concurrency, GPU or accelerator usage, memory pressure, and cost.
- Experience automating the ML lifecycle from experimentation and training through deployment and monitoring.
- Hands-on experience with PyTorch or equivalent modern ML frameworks.
- Ability to drive technical work end-to-end and make pragmatic architectural decisions.
- Excellent communication skills and willingness to mentor engineers while remaining deeply hands-on.
Requirements
- Experience serving and optimizing LLMs in production.
- Experience with computer vision, image understanding, vision transformers, or multimodal retrieval.
- Experience with Kubernetes, containerization, or other large-scale distributed serving systems.
- Experience with inference optimization techniques such as dynamic batching, KV or prefix caching, quantization, or model parallelism.
- Experience with modern inference stacks such as vLLM, Triton, TensorRT, or similar.
- Experience building evaluation, observability, and monitoring workflows for ML systems.
Additional Information
Role Overview: We are hiring a highly motivated Lead Machine Learning Engineer to build and scale production ML systems across text and image modalities. This is a hands-on individual contributor role for someone who can independently design and ship robust inference backends, automate training and deployment workflows, and improve model performance across both traditional ML and modern deep learning systems. You will productionize models ranging from LLMs, transformers, embeddings, and retrieval systems to classical ML models (such as XGBoost). The role balances scaling inference backends with automating training and deployment. We are looking for someone who is comfortable operating with a high degree of autonomy, mentoring other engineers, and making strong technical decisions in a fast-moving environment.