Senior Applied Scientist - Multimodal

Flawless · London, UK

Full-time · Remote · 3d ago

Computer Vision · Deep Learning · OpenCV · Python · PyTorch

Responsibilities

Model Development & Training
Develop repeatable, scalable audio/video dataset curation pipelines and lip sync model training workflows across multiple datasets
Train, fine-tune, and manage audio/video and lip sync model variants as model dependencies, data, and architectures evolve
Incorporate new datasets and model updates as they become available
Evaluation, Metrics & Quality
Design, automate, and maintain audio/video datasets and lip sync metric testing pipelines
Generate new quantitative and qualitative metrics to evaluate audio/video and lip sync quality
Produce comparisons, visualizations, and analyses to inform research and product decisions
Collaboration & Support
Partner closely with audio/video and lip sync researchers to support ongoing and future research initiatives
Validate audio/video and lip sync quality to improve out-of-the-box approval rates and reduce downstream cost and iteration time
Collaborate with Science, Engineering, and Product teams to align research outputs with company goals

Requirements

MSc or PhD plus industry experience in audio processing, 3D computer vision, speech synthesis, computer graphics, or other multimodal fields such as text/audio or audio/visual.
Proficiency in Python, with a strong foundation in computer science and problem-solving.
Expertise with deep learning frameworks (PyTorch) and vision tools (OpenCV).
A strong product mindset - motivated by building systems that deliver tangible value to users, not just technical novelty.
Comfortable working at both the algorithmic and implementation levels, from model design and optimisation to large-scale data processing and integration in production systems.
High degree of proficiency in math and statistical methods for signal processing
Experience with audio-visual learning, multimodal fusion, and/or audio-driven face animation
Experience with speech processing and detection, such as dialog/speaker detection, speaker separation, and speech synthesis with deep neural networks
Outstanding communication skills for collaboration with scientists, research/ML engineers, and VFX artists
Bonus points for:
Demonstrable research experience with a strong publication record in major 3D Computer Vision, Speech Processing, and Computer Graphics venues and journals (e.g., CVPR, SIGGRAPH, NeurIPS)
Experience developing multi-modal systems that integrate audio, text, and visual inputs.
Experience working with cross-functional teams
Experience with generative and cross-domain attention models for audio/visual-based speech applications
Interview Process:
At Flawless, our interview process is designed to help you show your best self. We'll dive into past projects and simulate working together.
Our interview process has three rounds, with some casual Zoom (or in-person) coffee chats in between so we can get to know each other:
Recruiting Screen: A 30-45 minute call with our recruiting team. We want to discuss your interests and motivations, cover the practical details, and make sure that Flawless would be a good fit for you.
Hiring Manager Screen: 45-60 minutes
Skills Interviews: A take-home task to assess your coding ability and design decisions, followed by a conversation to discuss your work and how it could be improved.
Team Interview: A 2-hour onsite interview where you will meet a variety of your potential future colleagues. We will review your coding solution and discuss relevant papers and their application.

Benefits

Vision insurance · Performance bonus

Additional Information

"The AI company that's revolutionizing Hollywood"

Flawless is transforming Hollywood with assistive AI. Our tools empower filmmakers to edit, localize, and refine performances while preserving artistic intent. Designed to support, not replace, artists, our technology expands what is possible on screen and gives creators freedom to tell stories with greater impact and reach audiences in new ways. From enabling seamless multilingual releases to eliminating the need for costly reshoots, Flawless solves critical challenges that slow down productions and limit distribution.

We are also setting the standard for ethical AI in entertainment. Our Artistic Rights Treasury (A.R.T.) is a rights management solution that protects artists and rights holders, ensuring that innovation moves forward with transparency and respect for creative ownership.

Reports to: Akin Caliskan
