Computer Vision Researcher (VLM)

Niantic Spatial · London, UK

Full-time · Remote · 1mo ago

Computer Vision · LLMs · Machine Learning · Mentoring · Move · NLP

Responsibilities

Architect Semantic Grounding: Lead research into cross-modal grounding that connects 3D spatial features with language embeddings, enabling the LGM to "understand" object relationships and environmental context.
Scale "Understand" Capabilities: Develop and deploy algorithms for continuous semantics, allowing our 3D maps to evolve and improve their situational awareness as new ground-level and aerial data is ingested.
Agentic Frameworks: Build the "spatial brain" for Embodied AI, enabling robots, drones, and other machines to move beyond simple navigation to mission-level reasoning.
Multimodal Benchmarking: Define the standards for measuring "spatial common sense" in LLMs, creating evaluations that test a model's ability to interpret and operate within complex 3D scenes.
Technical Mentorship: Serve as the technical anchor for the London R&D hub, resolving architectural disagreements and mentoring the next generation of researchers in the fusion of 3D CV and NLP.
Collaborative Innovation: Partner with Product leads to ensure the "Understand" API delivers high business value for enterprise customers in robotics, logistics, and field operations.
Required Qualifications:
Education: PhD (or equivalent) in Computer Vision, Machine Learning, or Robotics with a focus on Multimodal/Semantic understanding.
Years of Experience: 4+ years of experience in ML research, with a proven track record of shipping models that bridge 3D Vision and Language.
Technical Depth: Expert knowledge of 3D Geometry (SfM, SLAM, VPS) and Transformer-based architectures (VLMs).
Research Impact: Multiple first-author publications at top-tier venues (CVPR, NeurIPS, ICLR) focusing on VLMs, scene understanding, or semantic segmentation.
Implementation Mastery: Ability to write production-quality research code in PyTorch or JAX and manage large-scale data pipelines.
Required In-Office Days: 3 days per week
Plus If:
Experience with Gaussian Splatting or NeRFs for semantic scene representation.
Background in robotics (ROS) or building agentic systems that interact with physical environments.
Experience with "open-set" recognition and Zero-Shot learning.
Candidate Privacy Policy
I understand that by submitting my job application, the information I provide as part of that application will be used in accordance with Niantic Spatial's Privacy Notice for Job Applicants and Candidates.

Benefits

Health insurance · Vision insurance

Additional Information

At Niantic Spatial, we're building the future of geospatial AI. Powered by a proprietary database of over 30 billion posed images and a groundbreaking third-generation digital map, our mission is to develop spatial intelligence that helps both humans and machines better understand, navigate, and engage with the physical world. Our high-fidelity mapping technology unlocks a new dimension of interaction, laying the foundation for AI to truly comprehend and operate within real-world environments. Join us as we build a living model of the world that people and machines can talk to.

As a Computer Vision Researcher with experience in Large Language Models (LLMs), you will bridge the gap between 3D computer vision and LLMs, creating a unified framework where machines can reason about their surroundings. By linking spatial geometry directly to language, you will enable our systems to perform context-aware navigation and answer complex, open-ended questions about the physical world.
