Member of Technical Staff - Efficient ML

Embedding-vc · San Francisco Bay Area

Full-time · On-site · 5mo ago

About the role

Introducing Moonlake, AI for creating world simulations.

Scope of Work

Training efficiency
Dataloaders, fusion, activation remat, gradient checkpointing.
FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning.

GPU + kernel performance
Nsight profiling, Triton/CUDA kernels, fused ops.
Flash-attention-style speedups, sequence packing, KV-cache tricks.

Inference optimization
Low-latency serving, continuous batching, speculative decoding.
Quantization (GPTQ/AWQ), distillation, pruning.

Infra + reliability
SLURM/K8s multi-node jobs, checkpoint hygiene.
Determinism, env pinning, GPU failure handling.

We are committed to being an on-site, in-person team currently based in San Mateo.
