Member of Technical Staff - Efficient ML

Embedding-vc · San Francisco Bay Area
Full-time · On-site · 5mo ago
About the role

Introducing Moonlake, AI for creating world simulations.

Scope of Work

Training efficiency: Dataloaders, fusion, activation remat, gradient checkpointing. FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning.

GPU + kernel performance: Nsight profiling, Triton/CUDA kernels, fused ops. Flash-attention-style speedups, sequence packing, KV-cache tricks.

Inference optimization: Low-latency serving, continuous batching, speculative decoding. Quantization (GPTQ/AWQ), distillation, pruning.

Infra + reliability: SLURM/K8s multi-node jobs, checkpoint hygiene. Determinism, env pinning, GPU failure handling.

We are committed to being an on-site, in-person team currently based in San Mateo.