Design, build, test, and deploy Big Data solutions at scale, including data lakes, data warehouses, and real-time analytics.
Extract, clean, transform, and analyze vast amounts of raw data from various data sources.
Build robust data pipelines and API integrations with various internal systems.
Work across data operations (BAU), data ingestion, storage, and processing.
Implement best practices in data governance, security, and compliance.
Estimate effort, identify risks, and plan execution effectively for the assigned tasks.
Proactively monitor, identify, and escalate issues or root causes of systemic issues.
Collaborate across functions (Data Science, Data Operations, Product Development) to deliver data solutions.
Identify technical risks and ensure timely delivery of assigned tasks.
Organise your own work and tasks within the scope of the assigned workstream to deliver high-quality solutions on time.
Requirements
Essential:
Bachelor's degree in Computer Science, Engineering, Statistics, or a related field.
6+ years of data engineering experience, with at least 3 years on enterprise-scale projects.
4+ years of experience in Big Data technologies (e.g., Spark, Hive, Hadoop, Databricks).
Strong experience designing and implementing data pipelines.
Excellent knowledge of data engineering concepts and best practices.
Ability to collaborate with peers and provide guidance to junior engineers within the assigned workstream.
Strong attention to detail and adherence to best practices.
Good communication and collaboration skills.
Demonstrate excellent analytical and problem-solving skills.
Advanced proficiency with Apache Spark, including PySpark and Spark SQL; tuning and performance optimisation experience is essential.
Proficiency in Python and Pandas (Scala/Java knowledge is desirable).
Working knowledge of Apache Hive.
Strong SQL knowledge and experience (T-SQL, working with SQL Server, SSMS).
Expertise in designing and implementing scalable data pipelines and ETL processes using the GCP data stack, including BigQuery, Dataflow, Pub/Sub, Cloud Storage, Cloud Composer, Cloud Functions, and Dataproc (Spark).
Experience building and managing ETL workflows using Apache Airflow, including DAG creation, scheduling, and error handling.
Experience in designing solutions using batch data processing methods, real-time streams, ETL processes, and business intelligence tools.
Experience designing logical and physical data models.
Knowledge of Delta Lake concepts, common data formats, and Lakehouse architecture.
Unit testing, integration testing, and test-driven development (TDD).
Performance optimization and scalability considerations.
Desirable:
Experience with streaming services such as Kafka is a plus.
R and sparklyr experience is a plus.
Relevant certifications (e.g., Google Cloud Professional Data Engineer).
TransUnion - a place to grow:
We know that it is unrealistic to expect candidates to have every one of the essential and desirable skills listed above. If there is something you can't tick off right now, good: you can learn it here!
Impact you will make:
Enable decision-making across the organization by fostering a data-driven culture.
What's In It for You?
At TransUnion you will be joining a friendly, forward-thinking global business.
As well as a competitive salary & bonus scheme, our benefits package includes up to 26 days' annual leave (plus bank holidays) a
Benefits
Remote work options
Performance bonus
Additional Information
TransUnion's Job Applicant Privacy Notice
What We'll Bring:
About TransUnion:
TransUnion is a global information and insights company which provides solutions that help create economic opportunity, great experiences, and personal empowerment for hundreds of millions of people in more than 30 countries. We call this Information for Good®.
TransUnion is a leading credit reference agency, and we offer specialist services in fraud, identity, and risk management, automated decisioning and demographics. We support organizations across a wide variety of sectors including finance, retail, telecommunications, utilities, gaming, government, and insurance.