AI Engineer (Evaluation)

External
Nanyang Technological University · Singapore
S$60K–S$84K/yr · Contract · Unknown · 3d ago
Deep Learning · Documentation · Git · LLMs · Python · PyTorch

Requirements

  • A degree in computer science, AI, data science or a related field (or equivalent practical experience).
  • Deep understanding of LLM evaluation experimental design and the advantages/limitations of various LLM evaluation methods.
  • Experience in LLM inference frameworks such as vLLM and AI/Deep Learning frameworks such as PyTorch.
  • Experience writing production-level code in Python and using version control systems such as Git.
  • Strong written and verbal communication skills.
  • Independent learner who is capable of reading and understanding research papers.
  • Fluent in English and in one Southeast Asian language, to support building quality evaluations from a multilingual and multicultural perspective.
  • We regret that only shortlisted candidates will be notified.

Additional Information

AI Singapore (AISG) is Singapore's national programme in artificial intelligence, launched by the National Research Foundation (NRF) to anchor deep national capabilities in AI. Hosted at Nanyang Technological University (NTU), AI Singapore brings together Singapore-based research institutions and the vibrant ecosystem of AI start-ups and companies to perform use-inspired research, grow knowledge, create tools, and develop the talent to power Singapore's AI efforts. Since our inception in 2017, we have established a culture of respect, continuous learning, experimentation and curiosity, centred around innovation.

The candidate will join a team of AI scientists, apprentices, and data and software engineers. With the team, he or she will be responsible for building evaluations that test the limits of AI models, especially in terms of their multilingual, multicultural and multimodal capabilities.

Duties and Responsibilities

  • Develop and maintain evaluation frameworks and pipelines to measure the capabilities of Large Language Models (LLMs).
  • Keep up to date and experiment with the latest research in multilingual, multicultural, and multimodal LLM evaluations, such as the LLM-as-a-Judge paradigm.
  • Work with partners to collect, translate and verify evaluation datasets.
  • Perform the necessary data preparation and analysis, AI modelling, coding, testing, validation and deployment to ensure reliable and scalable AI solutions.
  • Collaborate with cross-functional teams within AI Products to design and resolve issues.
  • Maintain code repository and documentation standards.
  • Contribute to community engagement activities, such as sharing sessions at technical meet-ups, article write-ups, and participation in discussion forums.


Interested in this role?

Apply on the company's website.