Senior Engineer - Ingestion & Streaming Frameworks

Datavant · Remote
Full-time · Remote · Posted 1w ago
Airflow · AWS · Azure · CI/CD · GitHub · IAM

Responsibilities

  • Design, build, and operate the ingestion frameworks that pull data from operational databases, vendor APIs, document streams, and third-party feeds into Snowflake, Iceberg, and Databricks
  • Own and evolve the ingestion stack (AWS DMS, MWAA / Airflow, Fivetran, and the homegrown tooling on top) and design new patterns for API sources that don't fit a managed connector
  • Build self-service tooling so product engineers can onboard new sources without becoming experts in our infrastructure
  • Write and review the Terraform behind our ingestion infrastructure: AWS networking, IAM, compute, and data services
  • Partner with product, data, and analytics teams to pick the right ingestion pattern for each source (CDC, batch, API, streaming) and stand it up end-to-end
  • Lead production troubleshooting and incident response, and turn each incident into a durable platform fix
  • Raise the bar on engineering quality, observability, cost discipline, and security in everything the team ships
  • Mentor mid-career engineers and pull peers along through code review, pairing, and design feedback

Requirements

  • 6+ years in data engineering, platform engineering, or data-focused software engineering
  • 3+ years of hands-on AWS with real strength in networking (VPC, subnets, routing, PrivateLink, security groups), IAM (roles, policies, permission boundaries), and the data services this role touches, plus the judgment to know when to reach for what
  • 2+ years writing production Terraform or equivalent IaC, with experience owning modules, reasoning about state and blast radius, and shipping infrastructure changes safely
  • 1+ years building self-service tooling, internal platforms, or paved-path frameworks consumed by other engineers
  • Strong SQL skills and the ability to reason about how data physically lives in a warehouse or lake
  • Production experience with Snowflake (or an equivalent cloud data warehouse) and a workflow orchestrator (Airflow / MWAA preferred)
  • Hands-on experience with at least one ingestion approach: CDC tooling (e.g., DMS, Debezium), managed connectors (e.g., Fivetran, Airbyte), or rolling your own pipelines for API sources
  • Solid CI/CD discipline in GitHub or equivalent: branching, code review, automated checks, repeatable deployment
  • AI-native working style: daily use of Claude Code, Cursor, Copilot, or equivalent, with views on how they make a team faster
  • Working knowledge of Python is expected; mastery isn't the bar
  • Clear written and verbal communication, especially in async, remote settings

What Helps You Stand Out

  • Direct production experience with Iceberg or another open table format, especially bridging Snowflake and Databricks
  • Hands-on Databricks or Spark
  • Kubernetes experience
  • Snowflake certification(s)
  • Azure experience (we're primarily AWS, but our customers and acquisitions aren't always)
  • In-depth experience integrating data systems with managed identity platforms, particularly via SCIM (SailPoint a plus)
  • P

Benefits

  • Health insurance
  • Remote work options

Additional Information

Datavant is the data collaboration platform trusted for healthcare. Guided by our mission to make the world's health data secure, accessible and actionable, we provide critical data solutions for organizations across the healthcare ecosystem, including providers, health plans, researchers, and life sciences companies. From fulfilling a single patient's request for their medical records to powering the AI revolution in healthcare, Datavanters are building the future of how data is connected and used to improve health. By joining Datavant today, you're stepping onto a driven and highly collaborative team that is passionate about creating transformative change in healthcare.

The Ingestion & Streaming team sits in our Data & Machine Learning Platform organization and owns the movement layer of Datavant's data platform: batch and streaming pipelines, change data capture, document intake, and the self-service frameworks product teams use to land new sources into Snowflake, our Iceberg-backed lakehouse, and Databricks. Most data moves into the platform; some moves back out. Our job is to make both safe, fast, observable, and boring.

We are looking for a Senior Engineer who thinks like a platform builder first. We are shifting from a service-oriented posture ("we'll build the pipeline for you") to a platform-oriented one ("here is the paved path to build it yourself, safely"), though the team still builds pipelines directly when the situation calls for it. You will be central to that shift, designing the frameworks, tooling, and guardrails that scale how Datavant onboards new sources, and rolling up your sleeves for hands-on ingestion work when there isn't yet a paved path.

AI fluency is a baseline expectation here. You should already be using Claude Code, Cursor, Copilot, or equivalent tools as a core part of your daily engineering workflow, have opinions about how they make a team faster, and know how to apply them responsibly when PHI and other sensitive data are in scope.

