Senior Research Associate in Multimodal Learning
About the role
A strong and vibrant research team with a steady stream of publications in high-calibre venues is looking for a postdoctoral researcher (36-month fixed-term) to develop novel approaches to multi-modal audio-visual understanding in conversational settings. The position is funded by the EU project WeHear, a collaboration led by the Technical University of Denmark (DTU). The project will focus on audio-visual understanding for smart hearing aids, with low latency (real-time and predictive), on smart glasses.

You will be working closely with Dima on her active research. Check Dima's research interests and projects at: http://dimadamen.github.io/

Prior expertise in audio-visual perception and deep learning methods with a strong publication track record is expected, including first-author publications in CVPR/ICCV/ECCV/ICASSP/PAMI/IJCV/NeurIPS/ICLR.

What will you be doing?
Over the period of 36 months, you will be:
- Conducting novel research in multimodal audio-visual understanding: designing, training and evaluating audio-visual understanding in conversational settings. This will include hands-on research using the latest deep learning approaches.
- Preparing API packages with low latency that will be integrated with partner demonstrations on a quarterly basis.
- Presenting your work in regular meetings, taking feedback and integrating the goals of the project into your individual research directions.
- Publishing in top-tier venues (conferences and journals).
- Communicating your work to the best possible audience.
- Collaborating with other researchers (postdocs and faculty) in the WeHear project.
- Co-advising junior PGR students.

You should apply if you have:
- A PhD (near submission, submitted or graduated) in Multimodal Understanding, preferably with expertise in audio understanding, video understanding or multimodal visual models.
- A prior degree in computer science, engineering or mathematics.
- Detailed knowledge of the video understanding state of the art, approaches, datasets and problems, preferably with expertise in egocentric datasets.
- Prior knowledge of egocentric audio-visual devices that work in real time, like Meta Aria Glasses (Gen1 or Gen2) and Apple Vision Pro.
- Experience in handling audio-video data, for learning and inference.
- Experience in modelling deep learning approaches.
- Experience and evidence of publishing at high-calibre conferences and journals (at least one first-author paper in a major venue, CVPR/ICCV/ECCV/ICASSP/NeurIPS/PAMI/IJCV/ICLR, in the past 3 years).
- Excellent programming skills (Python).
- Proficiency in deep learning frameworks (PyTorch).

Additional information
For informal queries please contact: Gozde Burger, Senior Research Administrator
Email: gozde.burger@bristol.ac.uk

To find out more about what it's like to work in the Faculty of Engineering, and how the Faculty supports people to achieve their potential, please see our staff blog: https://engineering.blogs.bristol.ac.uk/category/engineering-includes-me/

Please familiarise yourself with the redeployment process guidance before submitting an application, particularly around contract types (e.g. fixed-term appointments).

Contract type: Open ended with fixed funding until 31/08/2029
This advert will close at 23:59 UK time on 08/07/2026
The interviews are anticipated to take place on 16/07/26
Salary: £43,482 to £50,253 per annum, Grade: J/Pathway 2

Our strategy and mission
We recently launched our strategy to 2030 tying together our mission, vision and values.