Data Engineer (Mid‑Level), Global

External

Vantagedc · London, UK

Full-time · Hybrid · Today

Azure · CI/CD · Documentation · ETL · Git · GitHub

Responsibilities

Develop and maintain PySpark notebooks and jobs to ingest, transform, and curate data within the enterprise data platform.
Build and modify Azure Data Factory pipelines for batch and incremental data ingestion.
Implement Spark‑based transformations that write curated datasets to Azure Data Lake Storage Gen2 using established folder structures and naming conventions.
Create and maintain SQL views and tables in Azure Synapse to support analytics and reporting use cases.
Respond to pipeline failures, data validation issues, and operational alerts.
Perform basic performance tuning of Spark jobs (e.g., partitioning, filtering, incremental logic) within established architectural patterns and standards.
Validate data outputs with business partners and address data defects or discrepancies.
Commit code using Git, follow branching standards, and participate in pull request reviews.
Update documentation for pipelines, datasets, and operational runbooks as changes are made.
Execute assigned backlog items within sprint timelines and raise risks or blockers early.
Additional duties as assigned by management.
Job Requirements
Education & Experience
Bachelor's degree in Engineering, Computer Science, Data Analytics, or a related field, or equivalent experience.
3 to 5 years of experience in data engineering or analytics engineering.
Proficiency in Python for building and maintaining data pipelines, automation, and data processing workflows, including use of PySpark.
Proficiency in SQL for querying, transformation, and analytical data processing.
Solid understanding of ETL/ELT pipelines, data transformation patterns, and data integration concepts.
Experience analyzing enterprise data sources to identify data relationships, transformations, and business rules.
Experience building solutions on the Microsoft Azure platform with exposure to services such as Azure Data Factory, Azure Synapse, Azure Data Lake Storage Gen2, and related analytics services.
Experience working with source control and CI/CD workflows using tools such as GitHub or Azure DevOps.
Working

Benefits

Flexible schedule

Additional Information

About Vantage Data Centers

Vantage Data Centers powers, cools, protects and connects the technology of the world's well-known hyperscalers, cloud providers and large enterprises. Developing and operating across North America, EMEA and Asia Pacific, Vantage has evolved data center design in innovative ways to deliver dramatic gains in reliability, efficiency and sustainability in flexible environments that can scale as quickly as the market demands.

Position Overview

This position will be based at our office in London in alignment with our flexible work policy (3 days on site, 2 days from home).

Vantage Data Centers is seeking a Mid‑Level Data Engineer to help build, operate, and scale our enterprise data platform. This role is designed for an engineer who can operate independently, execute reliably in a fast‑paced environment, and take ownership of data pipelines and datasets with minimal ramp‑up.

As part of the Data Engineering & Business Intelligence team, you will be responsible for delivering production‑ready data solutions that support analytics, reporting, and emerging AI‑enabled use cases. You will work closely with senior data engineers and business partners, but this role assumes a self‑starter mindset with the ability to move from requirements to implementation without constant oversight. Success in this position requires comfort with ambiguity, strong execution discipline, and accountability for results.

Essential Job Functions

Design, build, and maintain reliable, scalable data pipelines using Python and PySpark on the Microsoft Azure data platform.
Develop and operate batch and incremental data pipelines leveraging Azure Data Factory for orchestration and Azure Data Lake Storage Gen2 as the primary data store.
Independently implement SQL- and Spark‑based transformations to produce curated datasets that support enterprise reporting, analytics, and downstream consumption.
Take ownership of assigned data pipelines and datasets, including monitoring, troubleshooting, and performance optimization in production environments.
Work with Azure Synapse (dedicated or serverless where applicable) to support analytical workloads and data consumption patterns.
Collaborate with business analysts and cross‑functional stakeholders to translate data requirements into practical, working data solutions.
Prepare and structure data to support advanced analytics and AI‑enabled use cases by ensuring data quality, consistency, and documentation.
Apply established data governance, security, and engineering standards to ensure compliant, maintainable, and scalable solutions.
Participate in code reviews, technical discussions, and platform improvement initiatives as an active contributor.
Proactively identify data quality issues, pipeline risks, and improvement opportunities, and communicate them clearly in a fast‑paced environment.
