Senior Bioinformatics Data Engineer (Consultant)

ProPharma Group · US

Full-time · Remote · Today

Airflow · Apache · AWS · CI/CD · CloudFormation · Compliance

Responsibilities

Build and maintain Dagster-orchestrated ingestion pipelines for genomics vendors (Caris, Predicine, Tempus, Olink, CellCarta), including IO managers, Iceberg writers, and row-level accounting.
Develop and harden dbt Silver-to-Gold transformations: real-data test coverage, store-failures patterns, staging/intermediate/mart models, and macro consolidation.

Requirements

Education: Bachelor's or master's degree in Computer Science, Data Engineering, Bioinformatics, or a related field.

Experience: 5+ years of professional experience in data engineering with shipped production pipelines on AWS (S3, ECS/Fargate, Redshift or equivalent MPP).

Strong proficiency in Python and SQL with working knowledge of modern data engineering libraries.

Advanced proficiency with dbt and a workflow orchestration tool (Dagster, Airflow, or Prefect).

Data quality instinct: track record of catching silent failures, questioning data correctness assumptions, and noticing lossy joins or incomplete deliveries.

Solid understanding of lakehouse architecture patterns, ETL processes, and schema design for complex multi-modal datasets.

Ability to handle PHI-adjacent clinical data under Incyte's contractor policy (background check, compliance training, VPN access).

Willingness to work within legacy codebases (R, PySpark) to extract business rules and validate new implementations.

Excellent communication skills and ability to work in an embedded pair model with tight feedback loops.

Direct experience with Apache Iceberg, AWS Glue Catalog, or lakehouse table formats.

Comfort reading genomic data (VAF, HGVS nomenclature, VCFs, CNV/fusion semantics) or demonstrated ability to ramp on unfamiliar scientific domains quickly.

Familiarity with clinical data standards including SDTM, ADaM, and CDISC.

Pharma, clinical research, or life sciences background.

Experience with containerization (Docker/ECS) and infrastructure-as-code (CloudFormation).

Proficiency in R for interoperability with bioinformatics teams.

All applications to roles at ProPharma are personally reviewed by a member of our recruitment team. We do not rely on AI screening tools to support our hiring process. You will always receive an outcome to your application so that you have an answer from us - whether you're successful or not.

Whilst ProPharma supports remote working, we also recognise the value that comes from in person collaboration. As such, we encourage any new hires that are based within a reasonably short commute of one of our offices to work on a hybrid basis and spend some time working from t
