Site Reliability Engineer (SRE) - AWS + Docker

Synechron · Bengaluru - Bellandur (gtp)

Full-time · On-site · Today

AWS · Azure · Bash · Capacity Planning · CI/CD · CloudFormation

Requirements

Required
Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field, or equivalent practical experience
Preferred
AWS, Kubernetes, Terraform, or cloud operations certifications
Ongoing learning in reliability engineering, security, and performance optimization
Professional Competencies
Strong analytical and problem-solving skills
Clear communication and effective documentation
Collaboration across engineering, QA, and security teams
Ability to prioritize operational work and planned improvements
Adaptability in production and incident-driven environments
Focus on reliability, efficiency, and continuous improvement
SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Benefits

Health insurance · Equity / stock options

Additional Information

Job Summary
Synechron is seeking a Site Reliability Engineer (SRE) to improve the reliability, scalability, and performance of cloud-native systems. This role supports production operations through AWS infrastructure management, containerized workload operations, CI/CD enablement, observability, and incident response. The position contributes to business goals by improving availability, reducing operational risk, and supporting cost-efficient system performance.

Software Requirements
Required
AWS: strong hands-on experience with EC2, ECS/EKS, IAM, VPC, ALB/NLB, Route 53, S3, CloudWatch
Docker
Container orchestration using EKS/Kubernetes or ECS
CI/CD using GitHub Actions, Jenkins, or Azure DevOps
IaC using Terraform or CloudFormation
Observability tools: CloudWatch, Prometheus/Grafana, ELK/OpenSearch, X-Ray
Automation using Python and/or Bash
Linux system administration and troubleshooting
Networking knowledge covering DNS, TCP/IP, TLS, security groups, NACLs
Preferred
Experience with CloudFront, RDS, ElastiCache, ASG
Blue/green and canary deployment strategies
Artifact management and release approval workflows
Vulnerability scanning and secrets management tools

Overall Responsibilities
Define and maintain SLOs, SLIs, SLAs, and error budgets
Build and manage AWS infrastructure for scalable, highly available systems
Operate containerized services using Docker and ECS/EKS/Kubernetes
Implement and optimize CI/CD pipelines and deployment strategies
Establish observability through metrics, logs, and traces
Automate infrastructure and operations using IaC and scripting
Manage incident response, runbooks, root-cause analysis, and remediation
Drive performance tuning, capacity planning, and cost optimization
Implement security best practices across infrastructure and deployments
Partner with development teams to improve reliability by design

Technical Skills (By Category)
Programming Languages
Essential: Python, Bash
Preferred: Scripting for operational automation and diagnostics
Databases / Data Management
Essential: Operational familiarity with RDS and ElastiCache in production environments
Preferred: Performance tuning and availability planning for managed data services
Cloud Technologies
Essential: AWS including EC2, ECS/EKS, IAM, VPC, ALB/NLB, Route 53, S3, CloudWatch
Preferred: CloudFront, Auto Scaling Groups, advanced cost optimization practices
Frameworks and Libraries
Essential: Docker, Kubernetes/EKS or ECS
Preferred: Reliability patterns such as circuit breakers, retries, backoff, health checks
Development Tools and Methodologies
Essential: CI/CD, Terraform or CloudFormation, monitoring and alerting, incident response, Linux troubleshooting
Preferred: Blue/green and canary deployments, release engineering improvements
Security Protocols
Essential: Least-privilege IAM, SSL/TLS, secrets handling, vulnerability awareness
Preferred: Automated scanning, policy enforcement, and remediation workflows

Experience Requirements
7+ years of experience in SRE, DevOps, or Cloud Operations
Experience owning production infrastructure and reliability outcomes
Strong experience with AWS, Docker, orchestration, CI/CD, IaC, and incident response
Experience improving MTTR, availability, and operational efficiency
Equivalent experience in related production engineering roles will also be considered

Day-to-Day Activities
Maintain AWS environments and containerized services
Monitor system health, alerts, logs, and traces
Improve deployment pipelines and release reliability
Participate in incident response, troubleshooting, and postmortems
Update runbooks, dashboards, and automation scripts
Work with Dev, QA, and Security teams on resilience and operational readiness
Join standups, planning sessions, reviews, and reliability discussions

Interested in this role?

Apply on the company's website.