MLOps Engineer
About the role
White Circle is an AI Safety company building the safety, reliability, and optimization layer for AI systems. At the core of our platform are policies: simple natural-language rules that define what an AI model should and shouldn't do. We automatically test, enforce, and continuously improve these policies at scale.
- We've raised $11M from top funds, founders, and senior leaders at OpenAI, Anthropic, HuggingFace, Mistral, DeepMind, Datadog, Sentry, and others.
- We process over 100M API calls every month.
- We fine-tune and train our own LLMs so they run faster and cheaper than any open or proprietary model.
We're a small, highly focused team. If you want to work deeply on hard problems, see your work ship to production quickly, and influence how AI safety is actually built, you're the one we need.
You will:
- Integrate new text and multimodal models into our serving paths and verify they behave correctly under production-like traffic.
- Build and maintain rollout pipelines for frequent model releases.
- Create smoke, quality, and performance gates for model promotion.
- Operate local and cluster GPU deployments on Kubernetes.
- Build dashboards for latency, throughput, queue depth, GPU usage, fallback rate, and quality drift.
- Run A/B and canary rollouts for model, prompt, routing, and serving config changes.
- Debug production issues across model config, tokenizer, serving API, router, queue, Kubernetes, GPU runtime, and CI jobs.
- Optimize serving cost and reliability across mixed GPU capacity.
Requirements
- Experience with an inference serving engine such as SGLang, vLLM, Dynamo, or TensorRT-LLM, and a working understanding of the request lifecycle through gateway, router, frontend, worker, queue, and model engine.
- Solid Kubernetes GPU experience: NVIDIA device plugin, GPU scheduling, resource requests/limits, node affinity, taints, tolerations, and node pools.
- Understanding of multi-node communication libraries and kernels, CUDA runtime, and container runtime compatibility, and the ability to debug across those layers.
- Ability to design and implement CI/CD for model serving: image and config versioning, smoke tests, quality regression tests against benchmarks, latency/throughput gates, canary rollout, and rollback.
- Production debugging across the whole stack from Rust to k8s configs.
- Clear communication of engineering tradeoffs.
- Rust backend experience.
- NCCL, UCX, NVSHMEM, RDMA, InfiniBand, RoCE, or EFA.
- ClickStack / Datadog.
- Terraform for GPU infrastructure.
- DCGM exporter, Prometheus, OpenTelemetry.
- Experience with a high model rollout cadence (2-3 releases per week).
Why White Circle
- Paid time off in line with your local regulations, no matter where you work from
- Work from Paris (hybrid) with a relocation package available, or work from London (note: we are unable to provide relocation support for London-based roles)
- Comprehensive medical insurance for our France-based team (please note that we are in the process of setting up our UK office and therefore cannot offer medical insurance for London-based roles yet)
- All the hardware, tools, and services you need
- Covered subscriptions for AI agents and IDEs
- Team off-sites twice a year: we've recently been to the Alps and to Saint-Tropez
How we hire
- Introductory call with HR (25 min)
- Take-home test task
- Technical interview with Head of Applied Research (60 min)
- Final conversation with our CEO (45 min)
Additional Information
TLDR: We're looking for an MLOps Engineer to sit at the boundary between Research and Production. You'll own the infrastructure that takes a trained model and makes it production-safe: rollout pipelines, quality and latency gates, canary deployments, and the dashboards that decide whether a release ships or rolls back.