Data Engineer - AWS + Hadoop

External

Synechron · Bengaluru - Bellandur (gtp)

Full-time · On-site · Today

Airflow, AWS, CI/CD, Cross-functional Collaboration, Data Modeling, Docker

Requirements

Required
Bachelor's degree in Computer Science, Engineering, Information Systems, Mathematics, or related field, or equivalent practical experience
Preferred
AWS or data engineering certifications
Ongoing learning in cloud data platforms, governance, and automation
Professional Competencies
Strong analytical and problem-solving skills
Clear communication and cross-functional collaboration
Effective time and priority management
Adaptability to evolving technologies and requirements
Focus on reliability, data quality, and continuous improvement
SYNECHRON'S DIVERSITY & INCLUSION STATEMENT
All employment decisions at Synechron are based on business needs, job requirements, and individual qualifications, without regard to the applicant's gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Benefits

Health insurance, Flexible schedule, Equity / stock options

Additional Information

Job Summary
Synechron is seeking a Data Engineer - AWS + Hadoop to build and optimize scalable data pipelines, data lake solutions, and distributed data platforms. This role supports analytics, machine learning, and reporting by delivering reliable, secure, and cost-efficient data solutions.

Software Requirements
Required
AWS: S3, Glue, EMR, Athena, Lambda, Redshift, IAM, CloudWatch
Hadoop ecosystem: HDFS, Hive, Spark, Kafka, Oozie and/or Airflow
Spark with PySpark and/or Scala
SQL, Python or Scala, Shell scripting
Kafka and/or Kinesis
Airflow and/or AWS Step Functions
Git, Docker
CI/CD using Jenkins or GitHub Actions
Experience with data modeling, partitioning, metadata, and data quality checks
Knowledge of security and governance including IAM, encryption, RBAC, and PII handling
Preferred
Lake Formation
Curated data APIs or analytics views
Cost optimization and advanced observability practices

Overall Responsibilities
Design and implement ETL/ELT pipelines for batch and streaming workloads
Build ingestion frameworks using Kafka/Kinesis and Spark
Develop and optimize AWS-based data lakes and warehouses
Manage Hadoop ecosystem tools and job orchestration
Implement data quality, governance, and access controls
Monitor pipelines and improve cost, performance, and reliability
Collaborate with analytics, ML, and BI teams to deliver curated datasets
Participate in code reviews, documentation, and engineering standards

Technical Skills (By Category)
Programming Languages
Essential: SQL, Python and/or Scala, Shell scripting
Preferred: Advanced PySpark optimization
Databases / Data Management
Essential: Data modeling, schema design, partitioning, metadata management, Redshift, Hive
Preferred: Curated data services and advanced cataloging
Cloud Technologies
Essential: AWS data services including S3, Glue, EMR, Athena, Lambda, Redshift, IAM, CloudWatch
Preferred: Lake Formation and cost optimization strategies
Frameworks and Libraries
Essential: Spark, Kafka/Kinesis, Hadoop ecosystem tools
Preferred: Structured Streaming and reusable ingestion frameworks
Development Tools and Methodologies
Essential: Git, Docker, CI/CD, Airflow or Step Functions, code reviews, monitoring
Preferred: Automated testing for data pipelines
Security Protocols
Essential: IAM, encryption, RBAC, PII handling, secure access controls
Preferred: Fine-grained governance and audit readiness practices

Experience Requirements
7+ years in Data Engineering or related roles
Experience with large-scale distributed data systems
Strong hands-on background in AWS data services and Hadoop ecosystem tools
Experience with batch and streaming pipelines, SQL tuning, and production support
Equivalent related experience will also be considered

Day-to-Day Activities
Build and maintain batch and streaming pipelines
Optimize Spark jobs, SQL queries, and storage patterns
Monitor job health, logs, metrics, and data quality
Troubleshoot issues and implement preventive fixes
Work with analytics, ML, BI, and engineering teams
Join planning, design reviews, code reviews, and release activities
