Senior Data Engineer

Zócalo Health · Worldwide

$160K–$180K/yr · Full-time · Remote · Today

Airflow · AWS · dbt · Documentation · Observability · PySpark

About the role

Zócalo Health is a tech-enabled, community-oriented primary care organization serving people who have historically been underserved by the one-size-fits-all healthcare system. We partner with health plans, providers, and community organizations to deliver culturally competent primary care, behavioral health, and social care. Our model is built for populations with high medical and social complexity, where fragmented care drives poor outcomes and unnecessary cost. We combine local, community-based teams with virtual care and modern technology to deliver coordinated, whole-person care where members live and receive support.

Founded in 2021, Zócalo Health is backed by leading healthcare and mission-aligned investors and is scaling rapidly across states and populations. We are building a durable care platform designed to perform in constrained healthcare environments and to lead the shift toward accountable, value-based care.

Role Description

The Senior Data Engineer will join Zócalo Health as we build the data platform that powers analytics, product measurement, and operational visibility across the company. This is a hands-on building role at a foundational stage: you will design and ship the pipelines, ingestion frameworks, and data models that the rest of the company depends on.

The primary focus of this role is establishing a scalable, durable data platform. This includes laying the groundwork for longer-term initiatives such as the longitudinal patient record, population-level analytics, and product instrumentation. You will partner closely with Engineering and Product to ensure the data platform supports roadmap priorities and outcome measurement as the company grows. This position reports to the Principal Data Engineer.

In your first 12 months, you will:

Build and operate production-grade ingestion pipelines from core clinical, operational, and third-party systems into our Databricks lakehouse
Develop and maintain dbt models that turn raw data into clean, well-documented, analytics-ready datasets
Establish data quality, testing, and monitoring practices that make pipelines reliable and trustworthy
Help shape ingestion patterns and architecture standards alongside the Principal Data Engineer
Enable company-wide metrics for care outcomes and operations
Collaborate with cross-functional leads to develop and iterate on a suite of core operational dashboards, ensuring teams have the self-service tools they need to track company metrics and outcomes

The Senior Data Engineer will contribute in the following ways:

Design, build, and operate production data pipelines across clinical, operational, and third-party systems using API-based ingestion, Change Data Capture (CDC), and event- or webhook-driven patterns
Build and maintain transformation layers in dbt, including tests, documentation, and reusable models
Develop and refine core analytical and longitudinal data models used across the company
Implement testing, monitoring, and observability to ensure data quality, pipeline reliability, and system performance
Apply strong engineering fundamentals to improve the scalability, performance, and cost-efficiency of data systems on AWS and Databricks
Partner with Product to support metric definitions, outcome measurement, and reporting needs
Contribute to engineering standards, code review, and a culture of knowledge sharing and continuous improvement
Partner with business, product, and engineering stakeholders to design and build intuitive data visualizations and dashboards that drive actionable insights and program visibility

Core Technologies (current and planned)

Cloud: AWS
Lakehouse / data platform: Databricks
Transformations: dbt
Languages: SQL and Python (primary languages for ingestion and transformation)
Ingestion patterns: API-based ingestion, Change Data Capture (CDC), and event- or webhook-driven pipelines, including frameworks such as PySpark and Spark Structured Streaming on Databricks
Orchestration: workflow orchestration (e.g., Databricks Workflows or Airflow)

Requirements

5+ years of experience in data or backend engineering roles with significant data platform responsibility
Hands-on experience building and operating production-grade data pipelines and ingestion frameworks
Strong proficiency in SQL and Python for data ingestion, processing, and transformation
Experience with a cloud data platform; experience with AWS and Databricks (or a comparable Spark-based lakehouse) strongly preferred
Experience building SQL-based transformation workflows; hands-on experience with dbt preferred
Strong computer science fundamentals, including comfort reasoning about distributed systems and data processing at scale
Ability to diagnose and resolve performance, reliability, and data quality issues in complex sys

Benefits

Health insurance
Remote work options

Additional Information

Senior Data Engineer at Zócalo Health · Remote (Full Time) · Compensation: $160,000 - $180,000 (per year)

Interested in this role?

Apply on the company's website.
