Senior AI/ML Research Engineer (Computer Vision)
Requirements
- MS or PhD in CS, EE, Robotics, or a related field, with 5+ years of applied computer-vision research experience.
- Strong grasp of modern CV and deep-learning fundamentals: CNNs and vision transformers, segmentation, detection, tracking, and representation/self-supervised learning.
- Demonstrated work in video understanding, including temporal action segmentation, action/phase recognition, and video segmentation.
- Hands-on experience with modern video architectures, including video transformers and self-supervised video pretraining.
- Exposure to vision-action (VA) / vision-language-action (VLA) models and world-model / self-supervised predictive architectures (e.g., JEPA-style models, MAE, DINO) for learning visual representations and dynamics.
- Experience working with large, messy, real-world video datasets at scale.
- Strong software and experimentation skills in Python and C++, with proficiency in one or more of PyTorch/TensorFlow/JAX, and the ability to stand up clean, reproducible experiments and run the full loop (data curation, augmentation, loss design, metrics, error analysis).
- A research-and-prototyping mindset: comfortable working in ambiguity, framing open-ended problems, running rapid experiments, and reading and reproducing recent papers to pull promising techniques into practice.
- Sound judgment about the path from prototype to product: writing code others can build on, knowing when to optimize versus when to move fast, and thinking ahead about data quality, evaluation, and robustness even at the research stage.
- Solid foundations in linear algebra, probability, and optimization, enough to reason about and debug model behavior from first principles.
- Comfort collaborating across a multidisciplinary team (ML, robotics, software, and clinical/domain experts) and communicating tradeoffs and findings clearly.
- Background in healthcare, medical devices, surgical robotics, or other regulated technical domains.
- Experience with sim-to-real workflows and robotics simulators (e.g., NVIDIA Isaac).
- Experience with structured, ontology- or taxonomy-based labeling frameworks for fine-grained activity.
- Multimodal fusion of video with sensor, telemetry, and system-log streams.
- Designing annotation pipelines, QC processes, and active-learning loops.
- Real-time / edge inference optimization (e.g., TensorRT, NVIDIA Jetson).
- Fine-grained interaction and object-relationship modeling.
- Relevant peer-reviewed publications (CVPR, ICCV, ECCV, NeurIPS, etc.).
- Due to the nature of our business and the role, please note that Intuitive and/or your customer(s) may require that you show current proof of vaccination against certain diseases including COVID-19.
Additional Information
Primary Function of Position
We are building advanced augmented dexterity capabilities for next-generation robotic platforms. As a Senior AI/ML Research Engineer (Computer Vision), you will develop the perception models that let our Embodied-AI system understand the surgical scene. Working within a hierarchical, multimodal stack, in which a high-level model interprets sensory observations into structured intent and a low-level policy turns that intent into precise, safe, real-time control, you will focus on the vision layer: designing, training, and evaluating models that extract anatomy, instruments, actions, and surgical context from intraoperative video. You will partner with the broader AI/ML team to define how perception feeds reasoning and control, and you will drive the research-to-deployment path for your models, taking them from offline experimentation to robust, real-time performance in the OR. Working within Intuitive's Future Forward research organization, you will identify, build, and fine-tune the AI/ML models and algorithms that enable us to deliver safe and performant embodied AI systems. This role calls for someone who is as comfortable getting hands-on with models and data as designing systems that scale.
Roles and Responsibilities
- Develop temporal models for activity and workflow understanding: event/state recognition and fine-grained temporal action segmentation.
- Benchmark in-house models against the state of the art and recommend the target perception architecture.
- Define the perception input/output specification and demonstrate offline feasibility on recorded data.
- Stand up a continuous-improvement loop (discrepancy flagging, active learning, human-in-the-loop relabeling) and the tooling/UI needed for offline evaluation and the path to real-time use.
- Partner with annotation and data teams to shape label taxonomies, QC, and the data pipeline that feeds the AI/ML models.
- Establish the path from offline evaluation on recorded data to real-time integration, including the continuous-improvement (human-in-the-loop) data loop.
- Partner with AI/ML researchers, robotics, data engineers, and other stakeholders to deliver a perception layer that enables rapid prototyping and learning while working toward a product solution.