Member of Technical Staff - Efficient ML
About the role
Introducing Moonlake, AI for creating world simulations.

Scope of Work

Training efficiency
Dataloaders, fusion, activation remat, gradient checkpointing.
FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning.

GPU + kernel performance
Nsight profiling, Triton/CUDA kernels, fused ops.
Flash-attention-style speedups, sequence packing, KV-cache tricks.

Inference optimization
Low-latency serving, continuous batching, speculative decoding.
Quantization (GPTQ/AWQ), distillation, pruning.

Infra + reliability
SLURM/K8s multi-node jobs, checkpoint hygiene.
Determinism, env pinning, GPU failure handling.

We are committed to being an on-site, in-person team currently based in San Mateo.