Senior Data Engineer
About the role
Senior Data Engineer with strong expertise across traditional big-data platforms (Hadoop ecosystem) and modern cloud-native architectures (AWS). The role is responsible for building scalable, secure, and high-performance data pipelines that span Hadoop clusters and AWS cloud services. It draws on deep knowledge of distributed systems, Spark optimization, cloud automation, and big-data management to support analytics, BI, ML, and AI use cases across the enterprise, and ensures reliability, governance, cost-efficiency, and operational excellence across hybrid data platforms. The associate should be self-driven, able to work with minimal guidance, and able to guide the team technically.
Core Responsibilities
- Design, build, and maintain high-volume ETL/ELT pipelines across Hadoop (HDFS, Hive, Spark, Kafka) and AWS (Glue, EMR, Lambda, Step Functions, Redshift).
- Develop distributed data processing solutions using PySpark, Spark SQL, and scalable cloud serverless patterns.
- Implement reusable data ingestion frameworks for batch (Sqoop, Hive, Spark) and streaming (Kafka, Kinesis).
- Optimize data workflows using partitioning, bucketing, compression, and file formats (Parquet/ORC).
- Understand hybrid data lake architectures using S3 + HDFS, ensuring governance consistency (Atlas, Ranger, Lake Formation).
- Understand reporting requirements, perform data profiling, and create the corresponding design.
- Create data flow diagrams and perform data modelling.
- Orchestrate jobs using Airflow, Control-M, Step Functions, or event-driven triggers.
- Understand auto-scaling, capacity planning, and performance tuning on EMR and Spark clusters.
- Ensure data is protected and compliant with regulatory standards.
- Work closely with business stakeholders to enable high-quality datasets.
- Provide technical leadership in architecture decisions, code reviews, and best-practice adoption, and provide technical guidance to peers and junior team members.
- Improve reliability, scalability, and performance through automation, autoscaling, and capacity planning.
- Own deployment, incident response, and post-incident reviews for production environments, including troubleshooting Spark performance issues, job failures, and cluster bottlenecks.
- Understand security best practices (IAM, KMS, security groups, WAF, parameter/secret management).
- Optimize cost and usage of AWS resources and recommend architecture improvements.
- Collaborate closely with developers, QA, and product teams to streamline release processes.
Requirements
- Technical Skills
- 5-8 years of strong experience with the Hadoop ecosystem (HDFS, Hive, Spark, YARN, Kafka).
- Strong hands-on expertise in Scala, PySpark, Spark optimization techniques, HiveQL, and distributed computing.
- Good working experience with SQL in Hive and Impala.
- Good understanding of AWS data stack (S3, Glue, EMR, Lambda, Kinesis, Redshift, Step Functions).
- Proficiency in at least one scripting/programming language: Python, Shell scripting.
- Strong experience with CI/CD, GitHub, and Git commands.
- Expertise in ETL and Data Warehousing and cloud concepts.
- Good understanding of data modelling (star/snowflake), partitioning strategies, and schema evolution.
- Expertise in data profiling and decision making.
- Able to understand, design, and create data flow diagrams and perform data modelling (knowledge of Miro is an added advantage).
- Able to understand the architecture and design end-to-end data flow.
- Hands-on experience with Airflow, Control-M, or other orchestrators.
- Able to monitor and support BAU and year-end activities, if needed.
- Well versed in security and compliance aspects of the cloud.
- Good understanding of AWS networking (VPC, subnets, routing, SGs, NACLs).
- Familiarity with serverless patterns and containerization (Docker, ECS/EKS).
- Experience with monitoring/logging tools and incident management practices.
- Other Requirements
- Strong logical, analytical, problem-solving, and communication skills.
- Communicate effectively and c
Additional Information
Our story
At Alight, we believe a company's success starts with its people. At our core, we Champion People, help our colleagues Grow with Purpose, and, true to our name, encourage colleagues to "Be Alight."
Our Values:
- Champion People - be empathetic and help create a place where everyone belongs.
- Grow with Purpose - be inspired by our higher calling of improving lives.
- Be Alight - act with integrity, be real and empower others.
It's why we're so driven to connect passion with purpose. Alight helps clients gain a benefits advantage while building a healthy and financially secure workforce by unifying the benefits ecosystem across health, wealth, wellbeing, absence management and navigation. With a comprehensive total rewards package, continuing education and training, and tremendous potential with a growing global organization, Alight is the perfect place to put your passion to work. Join our team if you Champion People, want to Grow with Purpose through acting with integrity and if you embody the meaning of Be Alight.
Role: Senior Data Engineer