NLP Data Scientist

Quantexa · London, UK
Full-time · On-site · 4d ago
Classification, Hugging Face, LLMs, Machine Learning, NLP, NumPy

Requirements

  • Degree in computer science, computational linguistics, artificial intelligence, mathematics, physics, engineering, or related field
  • Familiarity with NLP and machine learning proven through personal, university or work projects
  • Experience building or adapting machine learning models end-to-end, with the ability to explain the decisions made
  • Proficiency in Python, the primary programming language used, or in a similarly complex programming language
  • Willingness to learn about unfamiliar languages and scripts
  • Master's degree in relevant field
  • Familiarity with the scientific and ML-related Python stack: NumPy, SciPy, scikit-learn, TensorFlow, PyTorch, PySpark, Hugging Face Transformers, spaCy
  • Familiarity with LLMs, LLM APIs, and agentic systems
  • Experience in working with multiple languages and scripts
  • Experience with integrating the results of research into user-facing products

About Quantexa

We're made up of people from 47 nationalities who speak over 20 languages, and nearly half of our colleagues come from an ethnic or religious minority background.

If this sounds like a place you'd want to work, we'd love to hear from you!

Our perks and quirks

What makes you Q will help you to realize your full potential, flourish and enjoy what you do, while being recognized and rewarded with our broad range of benefits.

Benefits

  • Competitive salary & Company bonus
  • 25 days annual leave (with the option of buying up to 5 days, and rolling over up to 10), plus national holidays + your birthday off!
  • Pension scheme with a company contribution of 6% (when you contribute 3%)
  • Private Healthcare, including dental & optic cover
  • Life Insurance and Income Protection
  • Regularly bench-marked salary rates
  • Enhanced Maternity, Paternity, Adoption, or Shared Parental Leave
  • Well-being days
  • Volunteer Day off
  • Work from Home Equipment
  • Commuter, Tech and cycle to work schemes
  • Free Subscription to a meditation, relaxation and sleep app
  • Continuous Training and Development
  • Spend up to 2 months working outside of your country of employment over a rolling 12-month period with our 'Work from Anywhere' policy
  • Team Social Budget & Company-wide Socials

Our mission.

It's all about you.

We want you to feel welcome, valued, and respected - because it's your individuality and passion

Health insurance, Dental insurance, Vision insurance, Performance bonus, Parental leave

Additional Information

We're hiring an NLP Data Scientist to work on challenging NLP problems that underpin Quantexa's entity resolution product: messy and multilingual text, creating training data from scratch, and careful evaluation.

The role sits in our NLP Centre of Excellence, which builds NLP capabilities for Quantexa's products - an increasingly important area as the platform is applied to unstructured text. Our work ranges from shipping production models for tasks like named entity recognition, relation extraction, and text classification, through to more exploratory research that drives improvements in our products and the wider field. We work on the full spectrum of model complexity, from fast, explainable models, to fine-tuned small language models, to agentic LLM-based systems.

As an NLP Data Scientist, you'll develop these NLP components end-to-end: framing the problem, building training and testing datasets, training and evaluating models, handing over for production, and improving accuracy based on feedback. Near-term, this is focused on building lightweight, fast NLP models where careful design of training data and evaluation benchmarks is critical.

You'll be involved in:

  • Translating underspecified business problems into well-scoped ML tasks
  • Training, evaluating, and improving NLP models end-to-end through reproducible pipelines
  • Sourcing, building, and curating training and evaluation datasets from heterogeneous, often imperfect inputs
  • Working across languages and scripts, such as Arabic, Japanese, Chinese, and others
  • Developing and using LLM-assisted workflows for dataset generation, error analysis, and model improvement iterations
  • Collaborating with engineering and product teams to turn ideas into integrated product features; presenting findings to stakeholders

