Computer Vision Researcher (VLM)

Niantic Spatial · London, UK
Full-time · Remote · 1mo ago
Computer Vision · LLMs · Machine Learning · Mentoring · Move · NLP

Responsibilities

  • Architect Semantic Grounding: Lead research into cross-modal grounding that connects 3D spatial features with language embeddings, enabling the Large Geospatial Model (LGM) to "understand" object relationships and environmental context.
  • Scale "Understand" Capabilities: Develop and deploy algorithms for continuous semantics, allowing our 3D maps to evolve and improve their situational awareness as new ground-level and aerial data is ingested.
  • Agentic Frameworks: Build the "spatial brain" for Embodied AI, enabling robots, drones, and other machines to move beyond simple navigation to mission-level reasoning.
  • Multimodal Benchmarking: Define the standards for measuring "spatial common sense" in LLMs, creating evaluations that test a model's ability to interpret and operate within complex 3D scenes.
  • Technical Mentorship: Serve as the technical anchor for the London R&D hub, resolving architectural disagreements and mentoring the next generation of researchers in the fusion of 3D CV and NLP.
  • Collaborative Innovation: Partner with Product leads to ensure the "Understand" API delivers high business value for enterprise customers in robotics, logistics, and field operations.

Required Qualifications

  • Education: PhD (or equivalent) in Computer Vision, Machine Learning, or Robotics with a focus on Multimodal/Semantic understanding.
  • Years of Experience: 4+ years of experience in ML research, with a proven track record of shipping models that bridge 3D Vision and Language.
  • Technical Depth: Expert knowledge of 3D Geometry (SfM, SLAM, VPS) and Transformer-based architectures (VLMs).
  • Research Impact: Multiple first-author publications at top-tier venues (CVPR, NeurIPS, ICLR) focusing on VLMs, scene understanding or semantic segmentation.
  • Implementation Mastery: Ability to write production-quality research code in PyTorch or JAX and manage large-scale data pipelines.
  • Required In-Office Days: 3 days per week

Plus If

  • Experience with Gaussian Splatting or NeRFs for semantic scene representation.
  • Background in robotics (ROS) or building agentic systems that interact with physical environments.
  • Experience with "open-set" recognition and Zero-Shot learning.

Benefits

  • Health insurance
  • Vision insurance

Additional Information

At Niantic Spatial, we're building the future of geospatial AI. Powered by a proprietary database of over 30 billion posed images and a groundbreaking third-generation digital map, our mission is to develop spatial intelligence that helps both humans and machines better understand, navigate, and engage with the physical world. Our high-fidelity mapping technology unlocks a new dimension of interaction, laying the foundation for AI to truly comprehend and operate within real-world environments. Join us as we build a living model of the world that people and machines can talk to.

As a Computer Vision Researcher with experience in Large Language Models (LLMs), you will bridge the gap between 3D computer vision and LLMs, creating a unified framework where machines can reason about their surroundings. By linking spatial geometry directly to language, you will enable our systems to perform context-aware navigation and answer complex, open-ended questions about the physical world.

