Staff Software Engineer, Inference
Responsibilities
- Technical Leadership & Architecture
  - Define and drive multi-service, cross-team architecture for CoreWeave's inference platform.
  - Lead complex design initiatives spanning request routing, adaptive scheduling, GPU resource management, and cost-per-token optimization.
  - Set technical direction for low-latency, high-throughput inference systems operating under strict P99 SLAs.
- Performance & Optimization
  - Lead development of advanced inference optimizations, including:
    - micro-batching and dynamic scheduling
    - speculative decoding
    - KV-cache reuse and memory optimization
  - Drive measurable improvements in P95/P99 latency, throughput, and GPU utilization across the platform.
  - Establish performance benchmarking frameworks and guide teams in data-driven optimization strategies.
- Reliability & Systems Excellence
  - Own system-wide SLIs/SLOs and ensure reliability improvements across releases.
  - Lead efforts in capacity planning, autoscaling strategies, traffic management, and failure mitigation.
  - Drive post-incident analysis and ensure systemic improvements across teams.
- Cross-Team Influence
  - Partner with Product, Orchestration, Networking, Storage, and Hardware teams to deliver a cohesive inference platform.
  - Influence technical decisions across org boundaries and ensure alignment on architecture, performance, and scalability goals.
  - Serve as a go-to technical expert for inference systems and distributed AI workloads.
- Mentorship & Engineering Excellence
  - Mentor senior and mid-level engineers; elevate design, coding, and operational standards across teams.
  - Lead cross-team design reviews and ensure high-quality system thinking at scale.
  - Raise the bar for engineering rigor, testing, and observability practices.
Requirements
- ~8-12+ years of experience building large-scale distributed systems or cloud platforms
- Proven track record of leading cross-team technical initiatives at scale
- Strong coding skills in Go, Python, or C++
- Deep expertise in Kubernetes at production scale, including orchestration, scheduling, and service design
- Strong understanding of networked systems, performance optimization, and system design
- Bachelor's degree in Computer Science, Engineering, or a related technical field
- Inference & Performance Expertise
- Hands-on experience with inference systems, including:
  - batching and micro-batching strategies
  - caching and memory optimization
  - mixed precision (BF16/FP8)
  - streaming token delivery
- Demonstrated ability to improve tail latency (P95/P99) and system reliability through metrics-driven engineering
- Contributions to inference frameworks such as vLLM, Triton, TensorRT-LLM, Ray Serve, TorchServe
- Experience with GPU systems and performance optimization (CUDA, NCCL, RDMA, NUMA, GPU interconnects)
- Experience leading multi-team or org-level initiatives
- Exposure to large-scale AI/ML infrastructure or hyperscale cloud environments
- To fulfill our obligation to protect client data, successful applicants offered employment with CoreWeave will be required to complete a basic criminal record check, conducted in compliance with GDPR. Employment offers are conditional upon receiving satisfactory check results.
Additional Information
CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com. We're proud to be a Living Wage accredited Employer.