AI/ML Research Engineer, LLM Post-Training & Evaluation
Responsibilities
As an AI/ML Research Engineer, LLM Training & Evaluation, you will design and implement the pipelines and tooling that connect data, evaluation, and post-training. You will help customers and internal teams move from evaluation findings to measurable model improvements.
You will also contribute to Innodata's internal R&D efforts, including benchmark datasets, evaluation frameworks, and reusable infrastructure for model assessment and post-training experimentation.
Additional responsibilities include (but are not limited to):
- Lead or co-lead technically complex ML engineering projects from initial customer discussions through implementation and delivery
- Design, build, and improve LLM training and post-training pipelines, including data ingestion, preprocessing, fine-tuning, evaluation, and experiment tracking
- Implement and optimize evaluation systems for LLMs and multimodal models, including offline benchmarks and task-specific test harnesses
- Integrate human-in-the-loop and AI-augmented evaluation signals into model development workflows
- Build robust infrastructure and tooling for reproducible experimentation, metrics logging, and regression monitoring
- Diagnose model behavior and pipeline failures, including data issues, training instability, metric inconsistencies, and evaluation drift
- Collaborate with Language Data Scientists and Applied Research Scientists to translate evaluation frameworks into executable systems
- Work closely with customer technical stakeholders to understand goals, constraints, and success criteria; propose and implement technically sound solutions
- Contribute to internal research and platform development, including benchmark frameworks, evaluation tooling, and post-training workflow improvements
- Contribute to best practices and standards for LLM training, evaluation, and quality assurance across projects
- Mentor junior engineers and contribute to technical design reviews, documentation, and engineering rigor across the team
You'll Thrive in This Role If You Have:
- BS/MS/PhD in Computer Science, Machine Learning, AI, Applied Mathematics, or a related quantitative technical field (MS/PhD preferred)
- 2-3 years of relevant industry or research engineering experience in ML/AI systems
- Hands-on experience with LLM training / fine-tuning / post-training, including at least one of:
  - supervised fine-tuning (SFT)
  - preference optimization (e.g., DPO or related methods)
  - RLHF / RLAIF-style workflows
  - task- or domain-adaptation of foundation models
- Strong programming skills in Python and experience building production-quality ML code
- Experience with modern ML frameworks (e.g., PyTorch, JAX, TensorFlow) and model libraries/tooling (e.g., Hugging Face ecosystem, vLLM, distributed training stacks)
- Experience designing and implementing evaluation pipelines for LLM/ML systems, including metrics computation, dataset handling, and experiment comparisons
- Strong understanding of data pipelines and ML systems engineering, including reproduc
Additional Information
Innodata (Nasdaq: INOD) is a global data engineering company. We believe that data and Artificial Intelligence (AI) are inextricably linked. Our mission is to enable the responsible advancement of artificial intelligence by providing the data, evaluation frameworks, and human expertise required to build AI systems that can be trusted at scale. We provide a range of transferable solutions, platforms, and services for Generative AI / AI builders and adopters. In every relationship, we honor our 36+ year legacy delivering the highest quality data and outstanding outcomes for our customers.
Scope of the Role:
Innodata is expanding its team of technical experts in LLM training, post-training, and evaluation systems. As an AI/ML Research Engineer, LLM Training & Evaluation, you will build and optimize the technical foundations that power model improvement for foundation model builders and leading labs. This role is ideal for someone who has hands-on experience fine-tuning and evaluating large language models (and ideally multimodal models), and who can bridge research and engineering in real-world customer environments.
You will work closely with Language Data Scientists, Applied Research Scientists, data engineers, and client technical stakeholders to design and implement robust training/evaluation pipelines using both human-in-the-loop and AI-augmented methods. The ideal candidate brings a strong computer science / machine learning engineering background, experience with modern LLM post-training workflows, and the ability to engage credibly with technical counterparts at leading AI organizations.