Data Engineer (Big Data) - Core Data Platform Team

Teads · Montpellier

Full-time · On-site · 3w ago

Accessibility · Airflow · Apache · AWS · BigQuery · Core Data

Requirements

Production experience with a lakehouse table format (Apache Iceberg or Delta Lake). The specific format matters less than the underlying concepts: you understand how the lake stores data on object storage, how a ta

Benefits

Health insurance

Additional Information

About Teads

Teads is a leading omnichannel advertising platform focused on driving outcomes for brand and performance advertisers across screens. With a focus on meaningful business outcomes for branding and performance objectives, Teads drives value by leveraging predictive AI technology to connect quality media, beautiful brand creative, and context-driven addressability and measurement. Teads is directly partnered with more than 10,000 publishers and 20,000 advertisers globally. The company is headquartered in New York, New York with a global team of around 1,700 people in 30+ countries. For more information, visit www.teads.com.

Our main Engineering challenges at Teads

Build efficient and easy-to-use web products used by thousands of users working for the world's most premium publishers, advertisers, and agencies.

Maintain a rich and diverse tech stack and system architecture optimized for performance, scalability, resiliency, and cost efficiency. We use mostly Scala and TypeScript, among others.

Work in a very high-traffic environment (2.2 billion users per month, 100 billion events per day) with low latency and high availability constraints (2 million requests per second, responses in less than 150 milliseconds).

Manage large datasets with millisecond-scale access times, used to compute a complex auction resolution algorithm in near real time (18 million predictions per second).

Operate in a fast-changing environment where we continuously collaborate with Product teams and constantly adapt our Cloud infrastructure for new features and products.

Bring a wide diversity of profiles to the same level of quality and knowledge.

Work in an international environment with offices located in Israel, Slovenia and France.

Our Core Data Platform team

We're a leading force in the ad tech industry, revolutionizing how brands connect with their audiences. Our platform processes billions of ad impressions daily, generating massive datasets that drive our core business.
We thrive on innovation and seek a Data Engineer to help us build and scale the data infrastructure that powers our insights and analytics. This is a unique opportunity to work with cutting-edge technologies and make a direct impact on our products.

What will you do?

As a Data Engineer, you'll be a key part of our data platform team, responsible for designing, building, and maintaining robust and scalable data pipelines. You'll work closely with data scientists, analysts, and server-side engineers to ensure our data is reliable, accessible, and ready for analysis. Your expertise will be crucial in expanding our data warehouse and data lake capabilities, enabling us to deliver next-generation ad tech solutions.

Your mission will be to:

Develop and Optimize Data Pipelines: Design, build, and maintain ETL/ELT pipelines using Apache Spark to ingest, process, and transform large-scale datasets from various sources.

Manage Cloud Infrastructure: Architect and manage our data infrastructure primarily on Google Cloud Platform (GCP) or Amazon Web Services (AWS). This includes services like BigQuery, S3, GCS, EMR, and Airflow.

Enhance Data Storage: Improve and manage our data warehouse and data lake solutions, ensuring data quality, consistency, and accessibility for business intelligence and machine learning applications.

Collaborate and Innovate: Partner with cross-functional teams to understand data needs and implement solutions that support new product features and business initiatives.

Ensure Data Integrity: Implement monitoring, alerting, and logging systems to maintain data pipeline health and ensure data accuracy.

What will you bring to the team?

5+ years of data engineering experience, building and operating production data pipelines at scale (TB+ datasets, hourly/daily batch or streaming workloads).

Hands-on production experience with Apache Spark (batch, streaming, or both).
You should be able to walk through a non-trivial Spark job you wrote, explain partitioning and shuffle behavior, and describe how you tuned it. Language is not a filter: Scala, Python, or Java are all fine. What matters is that you can debug and ship production Spark code, not which language you write it in.

Production experience on GCP or AWS for at least one full project lifecycle (design, build, deploy, operate). Dataproc/EMR for compute, GCS/S3 for storage, BigQuery/Redshift/... for warehousing. GCP is preferred given our current stack, but strong AWS candidates are welcome and will ramp on GCP.

Production experience with Kafka or a Kafka-compatible streaming platform. Our pipelines rely on streaming, so this is a core requirement. You have debugged a real production incident involving consumer lag, rebalancing, or data loss.
