Senior AI Researcher - Pre-training Data
About the role
As a Senior AI Researcher for Pre-training Data, you will shape and improve the scientific methodology behind our pre-training corpora while co-engineering the software and systems that support it. Working with engineers and other researchers to build scalable pipelines, you will focus on the theoretical and empirical research needed to understand which data makes models perform best on our targeted capabilities. This role is for you if you have a strong background in large-scale language modeling and the scientific drive to answer complex questions about data scaling laws, synthetic data generation, and curriculum learning. In your day-to-day, you will design targeted ablations across various scales, derive and test hypotheses from training dynamics, develop novel algorithms for estimating data quality and performing data curation, and contribute to a range of engineering tasks that facilitate these research directions. Together with a collaborative team of engineers and researchers, you will have a direct impact on the fundamental knowledge and capabilities of the models we ship. You will also contribute to or lead the writing of technical reports for internal and external readers, and present at and contribute to technical meetings and conferences as needed.