Member of Technical Staff - Image / Video Generation
Responsibilities
- Trains large-scale diffusion transformer models for image and video data, working at the scale where intuitions break and empirical evidence matters
- Rigorously ablates design choices by running experiments that isolate variables, control for confounds, and produce insights you can actually trust, then communicates those results to shape our research direction
- Reasons about the speed-quality tradeoffs of neural network architectures in production settings where both constraints matter simultaneously
- Fine-tunes diffusion models for specialized applications like image and video upscalers, inpainting/outpainting models, and other tasks where general-purpose models aren't enough
Requirements
You likely have:
- Hands-on experience training large-scale diffusion models for image and video data, with practical knowledge of common failure modes and what matters most in training
- Experience fine-tuning diffusion models for specialized applications (upscalers, inpainting, outpainting, or other tasks) where understanding the domain matters as much as understanding the architecture
- Deep understanding of how to effectively evaluate image and video generative models, knowing which metrics correlate with quality and which are just convenient proxies
- Strong proficiency in PyTorch, transformer architectures, and the full ecosystem of modern deep learning
- Solid understanding of distributed training techniques (FSDP, low-precision training, model parallelism), because our models don't fit on one GPU and training decisions impact research outcomes
We'd be especially excited if you:
- Have experience writing forward and backward Triton kernels and verifying their correctness while accounting for floating-point error
- Bring proficiency with profiling, debugging, and optimizing single- and multi-GPU operations using tools like Nsight or stack trace viewers
- Know the performance characteristics of different architectural choices at scale
- Have published research that contributed to how people think about generative models
How We Work Together
Everything we do is grounded in four values:
- Obsessed. We are a frontier research lab. The science has to be right, the understanding deep, the product beautiful.
- Low Ego. The work speaks. The best idea wins, no matter who said it. Credit is shared. Nobody is above any task.
- Bold. We take the ambitious bet. We ship; we do not wait for conditions to be perfect.
- Kind. People over politics. We treat each other with genuine warmth. Agency without empathy creates chaos.
If this sounds like work you'd enjoy, we'd love to hear from you.
Additional Information
About Black Forest Labs
We're the team behind Latent Diffusion, Stable Diffusion, and FLUX, foundational technologies that changed how the world creates images and video. We're creating the generative models that power how people make images and video: tools used by millions of creators, developers, and businesses worldwide. Our FLUX models are among the most advanced in the world, and we're just getting started. Headquartered in Freiburg, Germany with a growing presence in San Francisco, we're scaling fast while staying true to what makes us different: research excellence, open science, and building technology that expands human creativity.
Why This Role
You'll train large-scale diffusion models for image and video generation, exploring new approaches while maintaining the rigor that helps us distinguish meaningful progress from incremental tweaks. This isn't about following established recipes; it's about running the experiments that clarify which architectural choices matter and which are less impactful.