Senior Data Engineer
External | $140K–$160K/yr | Full-time | On-site | 2d ago
Airflow, Apache, CI/CD, Compliance, Git, MLOps
Responsibilities
- Design and implement lakehouse architecture using Delta Lake, including medallion pipeline patterns (Bronze/Silver/Gold), schema enforcement, and time travel
- Build and operate batch and real-time ingestion pipelines leveraging Databricks Auto Loader, Structured Streaming, and Change Data Capture patterns
- Implement data governance and security using Unity Catalog, RBAC, and compliance-driven practices for sensitive environments
- Optimize performance and manage costs through FinOps strategies, including cluster sizing, workload tagging, Spark tuning, and Photon acceleration
- Design, implement, and maintain CI/CD pipelines and orchestration workflows using Databricks Workflows, Delta Live Tables, and tools such as Airflow
- Collaborate with Data Science teams on ML workflows, including MLflow, feature store integration, and model lifecycle management
- Ensure data quality, observability, and lineage across media-specific datasets such as streaming logs, ad impressions, and audience metrics
- Provide technical mentorship through code reviews, pairing, and knowledge sharing
Requirements
- Bachelor's degree in Computer Science, Data Engineering, or equivalent practical experience
- 5+ years of experience building production-grade data pipelines in cloud environments using Spark-based platforms (e.g., Databricks, EMR, Dataproc, open-source Spark)
- Expertise in PySpark, SQL, and Spark-based data processing, with experience operating pipelines at scale in production
- Hands-on experience building batch or streaming production data pipelines using distributed processing frameworks (e.g., Spark, Flink) and query engines such as Presto
- Proficiency with orchestration tools such as Apache Airflow or Dagster, with hands-on experience in CI/CD, monitoring, alerting, and data quality for production systems
- Experience working with modern data architectures, including event-driven and distributed systems
- Proficiency with Git and collaborative development workflows
- Experience building and operating batch and real-time ingestion pipelines using Spark-based batch and streaming patterns (e.g., Structured Streaming, CDC) on Databricks or comparable platforms
- Solid understanding of infrastructure, networking, and data security fundamentals
- Experience building Lakehouse platforms and medallion pipelines in Databricks
- Familiarity with Unity Catalog, data governance, and compliance frameworks (e.g., PCI)
- Hands-on experience with CI/CD pipelines, orchestration tools, and infrastructure-as-code
- Experience with Lakeflow Spark Declarative Pipelines (SDP), MLflow, feature stores, and MLOps practices
- Background in media and entertainment data (e.g., video metadata, ad tech, audience analytics)
- Experience building data platforms within the media industry, with a strong understanding of audience analytics
- Experience working with large-scale analytical datasets (e.g., event logs, clickstream data, audience metrics)
- Comfortable using AI-assisted development tools (e.g., ChatGPT)
- Databricks or cloud certifications (e.g., Databricks Certified Data Engineer)
What Sets You Apart
- Passion for clean, reliable, and scalable data systems
- Strong communication and collaboration skills
- Ability to balance near-term delivery with thoughtful technical design
- Curiosity and a growth mindset
- A team-first attitude that supports and uplifts others
Additional Information
- Hybrid: This position has been designated as hybrid, generally working a minimum of three days per week from our NYC office.
- Salary: $140,000 - $160,000. This position is eligible for company-sponsored benefits, including medical, dental, and vision insurance, 401(k), paid leave, tuition reimbursement, and a variety of other benefits and perks.
- If you are a qualified individual with a disability or a disabled veteran and require support throughout the application and/or recrui
Benefits
Dental insurance, Vision insurance, 401(k)
Additional Information
The Data Engineering team is seeking a Senior Data Engineer to help design, build, and scale the modern data platform that powers analytics, data science, and data products across Versant brands. In this role, you'll collaborate closely with Data Product, Data Science, Analytics, and Engineering teams to deliver reliable, high-impact data solutions used by hundreds of internal users. We're looking for engineers who enjoy hands-on development, take ownership of production systems, and influence implementation through collaboration and technical expertise.