Member of Technical Staff - Distributed Systems Engineer
Our Mission
Reflection's mission is to build open superintelligence and make it accessible to all. We're developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond.

Foundations Vision
Build and operate a company-wide foundations platform that accelerates every team by providing reliable, scalable developer infrastructure, SRE capabilities, and high-throughput data ingestion tooling, enabling Reflection to move faster as we scale.

What This Team Does
Build and operate the core shared services that power our research, training, and production environments. These systems form the foundational platform that multiple teams depend on for model development, deployment, and evaluation, unifying data, compute, and workflow management across the stack while enabling rapid experimentation and reliable production systems.
Build and operate shared services that multiple teams rely on across research and production workflows.
Define and uphold reliability targets through SLIs, SLOs, and healthy on-call practices.
Maintain strong operational readiness with runbooks, incident playbooks, and capacity planning.
Ensure correctness and performance under load, addressing consistency, tail latency, and failure modes.
Develop APIs, SDKs, and internal platforms that enable high-velocity experimentation and iteration.
Reduce operational burden through better tooling, standardization, and platform patterns that scale across teams.

What You'll Work With
Container Abstractions: Containers-as-a-Service, Kubernetes abstraction layers, container orchestration, reproducible environments, multi-tenant isolation.
Distributed Systems Architecture: Sharding, replication, coordination services, high-concurrency systems, concurrency control.
Service Development Stack: gRPC, Protobuf, Go, Rust, C++.
Reliability & Performance: Idempotency, retries, backpressure, SLI/SLO design, tail latency optimization, service reliability engineering.

About You
Strong software engineering background with experience shipping production-grade systems.
Experience designing APIs, services, or developer platforms that handle large-scale data or compute.
Comfortable navigating complex codebases, debugging hard problems, and optimizing for reliability and speed.
Thrive in a high-agency, fast-paced startup environment; bias toward action and impact.
Excited about zero-to-one challenges, building new systems rather than maintaining legacy ones.
Collaborative, clear communicator, and comfortable working across research and infra boundaries.
Motivated by creating the software backbone for the world's most capable open-weight AI systems.