Applied Research Intern

External

Labelbox · San Francisco Bay Area

Full-time · On-site · 9mo ago

Deep Learning · LLMs · Machine Learning · Move · Python · PyTorch

Responsibilities

Build and own evaluation and benchmark suites for reasoning, code, agents, long‑context, and V/LLMs.
Create post‑training datasets at scale: design preference/critique pipelines (human + synthetic), and target hard failures surfaced by evals.
Experiment with and prototype RLHF/RLAIF/RLVR/RM/DPO‑style training loops to improve real-world task and agent performance.
Land research in product: ship improvements into Labelbox workflows, services, and customer‑facing evaluation/quality features; quantify impact with customer and internal metrics.
Engage with customer research teams: run pilots, co‑design benchmarks, and share practical findings through internal research reports, blog posts, talks, and published papers.
What You Bring
A strong foundation in AI and machine learning, backed by a Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or a related field (in-progress degrees are acceptable for intern positions).
A deep understanding of frontier autoregressive and diffusion multimodal models, along with the human and synthetic data strategies needed to optimize them.
Passion for, and experience in, LLM evaluation and benchmarking.
Expertise in constructing, measuring, and refining the quality of training data.
The ability to bridge research and application by interpreting new findings and translating them into functional prototypes.
A track record of publishing in top-tier AI/ML conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL) and contributing to the broader research community.
Proficiency in Python and experience with deep learning frameworks like PyTorch, JAX, or TensorFlow.
Exceptional communication and collaboration skills.
Applied Research at Labelbox
Annual base salary range
$35 - $45 USD
Life at Labelbox
Location: Join our dedicated tech hubs in San Francisco or

Benefits

Vision insurance · Flexible schedule · Equity / stock options

Additional Information

Shape the Future of AI

At Labelbox, we're building the critical infrastructure that powers breakthrough AI models at leading research labs and enterprises. Since 2018, we've been pioneering data-centric approaches that are fundamental to AI development, and our work becomes even more essential as AI capabilities expand exponentially.

About Labelbox

We're the only company offering three integrated solutions for frontier AI development:

Enterprise Platform & Tools: Advanced annotation tools, workflow automation, and quality control systems that enable teams to produce high-quality training data at scale
Frontier Data Labeling Service: Specialized data labeling through Alignerr, leveraging subject matter experts for next-generation AI models
Expert Marketplace: Connecting AI teams with highly skilled annotators and domain experts for flexible scaling

Why Join Us

High-Impact Environment: We operate like an early-stage startup, focusing on impact over process. You'll take on expanded responsibilities quickly, with career growth directly tied to your contributions.
Technical Excellence: Work at the cutting edge of AI development, collaborating with industry leaders and shaping the future of artificial intelligence.
Innovation at Speed: We celebrate those who take ownership, move fast, and deliver impact. Our environment rewards high agency and rapid execution.
Continuous Growth: Every role requires continuous learning and evolution. You'll be surrounded by curious minds solving complex problems at the frontier of AI.
Clear Ownership: You'll know exactly what you're responsible for and have the autonomy to execute. We empower people to drive results through clear ownership and metrics.

Role Overview

As an Applied Research intern at Labelbox, you will design, build, and productionize evaluation and post‑training systems for frontier LLMs and multimodal models.
You'll own continuous, high-quality evals and benchmarks (reasoning, code, agent/tool‑use, long‑context, vision‑language, etc.), create and curate post‑training datasets (human + synthetic), and prototype RLHF/RLAIF/RLVR/RM/DPO‑style training loops to measure and improve real‑world task and agent performance.
