Member of Technical Staff, Perception

XDOF · San Mateo · Hybrid
Full-time · Remote · 3d ago
Computer Vision · Deep Learning · Python · Robotics · ROS

Requirements

  • 5+ years of industry experience in robot perception or computer vision
  • Strong 3D vision fundamentals: stereo and structured-light camera principles, 3D reconstruction
  • Proficiency with SLAM frameworks (ORB-SLAM, VINS-Mono, FastLIO, etc.) or V-SLAM system development experience
  • Hands-on engineering experience with human pose estimation: hand joints (MediaPipe, MANO) or full-body pose (OpenPose, SMPLify, etc.)
  • Proficient in deep learning training frameworks for perception model training, tuning, and evaluation
  • TensorRT deployment experience with real-time inference optimization on embedded platforms (Jetson, Horizon, etc.)
  • CUDA programming fundamentals; ability to write or debug custom kernels
  • Proficient in C++ and Python with ROS / ROS2 development experience
  • Proficient with AI coding agents
  • Engineering experience with 6DoF object pose estimation (FoundPose, FoundationPose, GDR-Net, etc.)
  • Familiarity with 3D Gaussian Splatting or NeRF for scene reconstruction or data augmentation
  • Experience with robot manipulation or teleoperation systems
  • End-to-end development experience with automated annotation pipelines or ground truth generation systems
  • Published research in perception, pose estimation, or robotics

Benefits

  • Direct involvement in the most critical technical challenge in embodied intelligence: producing high-quality robot training data
  • An environment working alongside top-tier robotics engineers and ML researchers
  • Proprietary hardware platforms (humanoid robots, camera arrays, data gloves)
  • A fast-paced, high-autonomy 0→1 work environment
  • Vision insurance

Additional Information

At XDOF, we're at an inflection point. Frontier labs are racing to build general-purpose robots, and high-quality training data is the bottleneck. We're building the foundation behind the foundation models - the data collection systems, operational capability, exabyte-scale data warehouse, and software toolchain - to help our partners drive the field forward.

The Perception Algorithm team transforms raw multimodal sensor data into high-quality robot training annotations. You will be deeply involved in the complete loop from data collection to model delivery - sensor calibration, SLAM localization, human pose estimation, perception model training, and embedded deployment. Your work directly determines the quality ceiling of our training data.

Core Responsibilities

Human Pose Estimation

  • Design and optimize hand pose estimation pipelines supporting accurate joint angle extraction from teleoperation data collection
  • Build full-body pose estimation systems for motion capture and teleoperation action annotation ground truth generation
  • Research and apply vision-based pose estimation methods (markerless) to reduce data collection costs
  • Fuse pose estimation outputs with robot joint angle data to generate consistent training annotations

Robot Perception & Calibration

  • Design and maintain intrinsic/extrinsic calibration pipelines for multi-camera arrays (factory calibration + online recalibration)
  • Build visual SLAM / V-SLAM systems supporting real-time localization and scene reconstruction on data collection platforms
  • Implement hand-eye calibration between cameras and robot end-effectors
  • Develop temporal alignment solutions across multimodal sensors (cameras, IMU, data gloves, force sensors)

Perception Model Training & Deployment

  • Train and iterate on perception models including object detection, instance segmentation, and 6DoF pose estimation
  • Optimize model inference using TensorRT / CUDA for real-time performance on robot embedded platforms
  • Write custom CUDA kernels for low-level acceleration of perception tasks
  • Design evaluation metric frameworks for perception models; continuously track the relationship between model performance and data quality

End-to-End Loop from Data Collection to Model Delivery

  • Contribute to the design of automated annotation pipelines that convert sensor data into structured training labels
  • Build Auto QA modules to filter low-quality data including anomalous frames, failed demonstrations, and sensor dropouts
  • Collaborate with ML engineers and data infrastructure teams to ensure perception output formats meet downstream VLA model training requirements
  • Establish feedback mechanisms linking perception accuracy to model training outcomes, continuously improving annotation quality

