As our Site Reliability Engineer, you'll contribute to the reliability, monitoring and operational excellence of cloud-native platforms.
You'll work closely with senior engineers to support production systems, implement SRE practices, and ensure services are observable, scalable and resilient. You'll also participate in the 24/7 support and on-call rotation, gaining experience in incident response and platform operations.
You'll also be:
Supporting the operation of AWS-based Kubernetes platforms (EKS)
Contributing to monitoring, alerting and observability implementations using tools like Grafana and Prometheus
Assisting in incident management, troubleshooting and root cause analysis
Participating in on-call rotations and production support activities
Implementing infrastructure changes using Terraform and GitOps workflows
Supporting CI/CD pipelines (GitLab, Argo CD) and deployment processes
Helping improve system reliability through automation and operational improvements
Following SRE practices such as runbooks, documentation and post-incident reviews
Working with DevOps and engineering teams to improve system performance and stability
Ensuring solutions align with security, compliance and operational standards
The skills you'll need
We're looking for an engineer with solid foundational experience in cloud platforms and a keen interest in reliability engineering and production operations.
You'll also need:
Experience working with AWS and Kubernetes (EKS) in a production or pre-production environment
Familiarity with monitoring and observability tools such as Grafana and Prometheus
Understanding of CI/CD pipelines and Git-based workflows (GitLab preferred)
Exposure to Terraform or infrastructure-as-code concepts
Basic understanding of SRE practices and production support models
Experience troubleshooting applications or infrastructure issues
Awareness of networking and security fundamentals in cloud environments
Willingness to participate in on-call rotations and incident response
Strong problem-solving mindset and eagerness to learn
Good communication and collaboration skills
Hours
45
Job Posting Closing Date:
16/06/2026
Additional Information
Join us as a Site Reliability Engineer
In this key role, you'll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services
You'll enjoy significant stakeholder interaction, working in collaboration with engineers to ensure a principled approach to delivering change in a safe and secure way
This is a chance to join an inclusive team with a collaborative ethos and a commitment to innovation and professional development
We're offering this role at associate level