Member of Technical Staff, Infrastructure Engineer
External, Full-time, On-site, 3mo ago
Core Data, Deep Learning, Docker, Kubernetes, Python, Robotics
About the role
Odyssey is an AI lab pioneering general-purpose world models: causal, multimodal systems that learn to predict and interact with the world over long horizons, while generating real-time, interactive simulations from any starting point. This foundational technology promises to revolutionize robotics, science, healthcare, education, gaming, defense, and beyond.
Responsibilities
- Develop and operate our low-latency model inference platform, ensuring high availability, scalability, and efficient resource utilization for Odyssey's world models.
- Engineer and scale our core data processing infrastructure (e.g., Flyte, Ray with k8s) to handle petabyte-scale datasets.
- Design, build, and maintain our large-scale, GPU-based training clusters for deep learning, focusing on usability, high throughput, and reliability.
- Automate infrastructure provisioning, configuration, monitoring, and alerting using Infrastructure as Code (IaC) principles.
- Drive performance tuning, cost optimization, and reliability improvements across the entire stack.
- Collaborate closely with researchers and product developers to understand their requirements, optimize their workflows, and improve platform usability.
Requirements
- Motivated by building for the frontier: you want to shape the compute and infrastructure foundation of a lab redefining how people create and interact with media.
- Strong programming skills (e.g., Python, Go, or similar) and a solid understanding of software engineering best practices.
- Deep, hands-on experience with containerization (e.g., Docker), container orchestration (e.g., Kubernetes), and Infrastructure as Code (e.g., Terraform).
- Proven experience building and managing large-scale, distributed systems with GPU computational workloads (e.g., compute platforms, data pipelines, or high-availability services).
- Experienced in designing infrastructure for ML workloads where performance, parallelism, and data movement are critical.
- A collaborative mindset and excellent communication skills, with a passion for building developer-friendly platforms.
Benefits
- Health insurance
- Vision insurance