Data Scientist

External
Forto · Berlin, Germany
Full-time · On-site · 3d ago
API Design · Classification · Forecasting · LLMs · Machine Learning · Move

About the role

What if your work could drive change in a globally established industry, shaping processes that touch every corner of the world? At Forto, we are at the forefront of change, harnessing the power of AI to revolutionise logistics. We want to reinvent digital supply chains to be transparent, frictionless and sustainable. From day one, our mission has been to simplify global trade - creating a seamless and efficient logistics process.

Your Role & Mission

As a Data Scientist at Forto, you will take ownership of production ML systems that extract structured intelligence from unstructured logistics data. You will work closely with the Engineering Manager across three core workstreams - document data extraction (FlashDoc), vocabulary mapping, and classical ML. Your immediate priority is ensuring continuity of existing production systems and setting up evaluation pipelines, but equally important is driving step-change improvements in accuracy through disruptive methods and new technologies when the opportunity arises.

Beyond document automation, the team's roadmap extends into traditional data science territory - demand forecasting, churn prediction, route optimization, and predictive analytics for logistics operations. You will bring both the ML engineering depth to maintain and innovate on current systems and the classical data science foundation to tackle these broader challenges as the team grows.

Responsibilities

  • Design, build, and maintain end-to-end ML pipelines for document extraction, classification, and data enrichment in production.
  • Build prompt evaluation frameworks and feedback-based optimization loops to systematically improve extraction accuracy.
  • Train custom in-house models using human-in-the-loop (HITL) data to move from assisted to fully automated extraction.
  • Build and maintain semantic similarity models that map free text to standardized TMS vocabulary across ports, terminals, container types, legal entities, and line items.
  • Improve pipeline reliability through redesign, testing, monitoring, and alerting for non-deterministic ML systems.
  • Evaluate and introduce disruptive approaches (new model architectures, fine-tuning strategies, novel evaluation methods) to achieve step-change accuracy improvements when incremental optimization plateaus.
  • Partner with Product Managers to identify where DS can solve real user pain points, proactively surface opportunities from the data, and shape product roadmaps with a data-informed perspective.
  • Collaborate closely with Engineering teams on integration, infrastructure, and API design to ensure DS outputs are consumed reliably by downstream systems.
  • Manage stakeholder expectations: communicate what is feasible given capacity, set realistic timelines, flag risks early, and negotiate prioritization trade-offs across teams.

Required Skills and Experience

  • 2+ years of professional experience in data science or machine learning engineering
  • Ability to design, deploy, and maintain ML systems in production. This goes beyond model development: it includes pipeline architecture, monitoring, reliability, and handling non-deterministic outputs at scale.
  • Ability to get up to speed quickly with new tools, technologies, and problem spaces
  • Proficient use of agentic coding tools
  • Strong proficiency in Python
  • Hands-on experience with LLMs (prompting, fine-tuning, evaluation) and understanding of their limitations in production environments.
  • Strong foundation in classical data science and statistics: regression, classification, time series analysis, data leakage, experimental design, and hypothesis testing.
  • Strong analytical and problem-solving skills with the ability to work independently on ambiguous, research-oriented problems.
  • Demonstrated ability to evaluate when existing approaches are insufficient and propose disruptive alternatives - not just incremental tuning.
  • Strong stakeholder management skills: ability to identify problems and opportunities proactively, manage expectations on timelines and feasibility, and negotiate prioritization across competing demands.
  • Ability to commit fully to a direction after healthy debate, even when it wasn't your preferred approach

Why work with us?

Benefits

Health insurance
