PhD Research Internship - Robotics Engineer (VLM / VLA Models)

Sensmore · Berlin, Germany

Full-time · On-site · 1mo ago

Deep Learning · Hugging Face · Machine Learning · Python · PyTorch · Robotics

Responsibilities

Depending on your expertise and project priorities, you will:
Research & Method Development
Design and develop novel approaches for Vision-Language-Action systems in real-world industrial settings
Explore scalable architectures for multi-modal reasoning and action generation
Contribute to advancing state-of-the-art methods in embodied AI and robotic autonomy
Multi-Modal Learning & Data Systems
Lead the design and analysis of large-scale multi-modal datasets (video, radar, lidar, sensor fusion)
Develop self-supervised or weakly supervised dataset generation pipelines for VLA training
Investigate data-centric approaches to improve robustness and generalization
Model Development & Optimization
Build, adapt, and extend cutting-edge GenAI models (e.g., VLMs, VLA frameworks)
Apply advanced fine-tuning strategies (e.g., parameter-efficient tuning, alignment methods)
Explore prompt optimization, reasoning augmentation, and action grounding techniques
Training, Evaluation & Benchmarking
Design rigorous evaluation protocols for embodied AI systems in industrial contexts
Run large-scale experiments, analyze performance, and iterate systematically
Benchmark models against state-of-the-art approaches and internal baselines
Deployment & Systems Integration
Collaborate with engineering teams to transition research prototypes into production-ready systems
Optimize models for real-time inference, robustness, and safety in heavy-industry environments
Scientific Contribution
Document findings and contribute to research publications, technical reports, or patents
Present results internally and potentially at leading conferences
Required Qualifications
Current enrollment in a PhD program in Robotics, Computer Science, Machine Learning, Electrical Engineering, or a related field
Strong programming skills in Python and deep learning frameworks (e.g., PyTorch)
Solid understanding of machine learning, deep learning, and multi-modal models
Proven ability to conduct independent research and drive projects from idea to results
Strong analytical thinking and problem-solving skills
Preferred Skills & Experience
Experience with Vision-Language Models, embodied AI, or robotics learning systems
Familiarity with modern GenAI tooling (e.g., Hugging Face ecosystem, Gemini, Unsloth, or similar)
Experience with multi-modal data (vision + sensor fusion)
Background in robotics, control systems, or real-world deployment
Track record of research output (publications, preprints, or significant research projects)
Experience with large-scale training, distributed systems, or model optimization
Research Environment & Outlook
Opportunity to work on high-impact, real-world robotics problems at the intersection of AI and industrial automation
Collaboration with a multidisciplinary team spanning AI research and robotics engineering
Potential to publish and contribute to the scientific community
Opportunity to shape long-term research directions and transition work into real-world deployment

Benefits

Build physical AI for the world's largest off-highway machinery - making it intelligent, safe, and ready for every tough task
Join the pioneer in intelligent robotics backed by Point Nine & other Tier 1 investors
Combine cutting-edge robotics research in end-to-end learning & Vision-Language-Action models with real-world heavy mobile equipment
Tailor your own career path, whether you would like to become a technical specialist or a technical team lead
Experience a great team culture, beverages, and an amazing office environment
Attractive compensation package and stock options
Vision insurance

Additional Information

sensmore automates the world's largest machines with unprecedented intelligence. Our proprietary Physical AI enables heavy machines such as wheel loaders to instantly adapt to dynamic environments and execute new tasks without prior training. We integrate cutting-edge robotics into a platform powering intelligence and automation products - transforming productivity and safety for customers in mining, construction, and adjacent industries today. Join us and play a pivotal role in transforming the automation landscape in heavy industries.

Role Overview

We are seeking a highly motivated PhD candidate to join our team as a Research Intern specializing in General Purpose AI, with a focus on Vision-Language Models and Vision-Language-Action systems. This role sits at the frontier of industrial robotics: developing scalable, general-purpose VLA systems that enable robots to perceive, reason, and act autonomously in complex heavy-industry environments. You will contribute to bridging multi-modal perception (e.g., video, radar, lidar) with robust real-world execution, while advancing state-of-the-art methods in embodied AI. Beyond engineering, this position has a strong research component, with opportunities to contribute to novel methods, publish findings, and shape the future of industrial autonomy.
