Data Pipeline Engineer
About the role
- Develop 15 data ingestion pipelines from heterogeneous sources (REST APIs, SFTP file drops, database extracts) → S3 → Glue ETL → Lake Formation
- Implement ETL transformations per R2 data mapping specifications
- Build cross-agency data sharing patterns: Agency B data → Central Platform (Lake Formation cross-account grants, resource links)
- Implement data lineage tagging using OpenLineage / AWS-native lineage metadata for the governance audit trail
- Configure data quality checks for multi-source ingestion: handle schema drift, late-arriving data, and source unavailability
- Write and maintain IaC (CDK/Terraform) for R2 pipeline resources
- Execute unit testing, integration testing, and cross-agency data access validation
- Support UAT with Agency B data owners: validate data accuracy, timeliness, and access controls
- Document pipeline configurations, source connectivity patterns, and data flow diagrams
- Participate in daily stand-ups, sprint demos, and code reviews

Required skills: AWS Glue, Lake Formation, S3, Athena, Python/PySpark, multi-source integration (APIs, SFTP, DB extracts), IaC, SQL
Company: ONEBYZERO PTE. LTD.