Site Reliability Engineer (SRE)

HelloKindred · Milton Keynes, UK

Contract · On-site · 3mo ago

API Gateway · AWS · Azure · Bash · CI/CD · Compliance

Responsibilities

Operate and enhance Kubernetes platforms across AWS, Azure, and on-premise environments.
Lead incident response, problem management, and root cause analysis activities.
Deliver cluster lifecycle management including upgrades, patching, node pool management, CNI and CSI configuration, ingress management, and Rancher operations.
Own observability strategy including dashboards, alerting, monitoring, and definition of SLOs and SLIs.
Implement GitOps practices using Fleet and reduce operational toil through automation and governance.
Apply secure API gateway and Web Application Firewall (WAF) patterns.
Design and support distributed systems including event brokers and asynchronous messaging architectures.
Maintain platform security posture including CVE remediation, GRC controls, and security scanning pipelines.
Provision and manage infrastructure using Terraform and Crossplane as orchestration layers.
Implement and maintain CI/CD pipelines using Concourse, GitHub Actions, and Azure DevOps.
Ensure compliance with PCI DSS and GDPR security patterns.

Qualifications

Deep expertise in Kubernetes, Rancher, GitOps, Linux, and cloud networking.
Strong experience operating in hybrid cloud environments across AWS, Azure, and on-premise platforms.
Strong automation and scripting skills in Python, Go, Bash, PowerShell, or .NET.
Proven experience with Infrastructure as Code using Terraform and Crossplane.
Experience implementing and managing observability tooling including Grafana, Prometheus, Jaeger or Tempo, CloudWatch, Loki, and OpenTelemetry.
Strong understanding of API gateway and Web Application Firewall patterns.
Experience working with distributed systems and event-driven architectures.
Experience operating within regulated environments including PCI DSS and GDPR.
Knowledge of service mesh technologies such as Istio or Kuma is desirable.
AWS operational experience is advantageous.
Experience within payments or other regulated industries is beneficial.
All your information will be kept confidential according to EEO guidelines.
Candidates must be legally authorized to live and work in the country where the position is based, without requiring employer sponsorship.
HelloKindred is committed to fair, transparent, and inclusive hiring practices. We assess candidates based on skills, experience, and role-related requirements.
We appreciate your interest in this opportunity. While we review every application carefully, only candidates selected for an interview will be contacted.

Benefits

Vision insurance

Additional Information

Anticipated Contract End Date/Length: August 28, 2026
Work Set Up: Hybrid (must be eligible for BPSS)

Our client in the Information Technology and Services industry is looking for a Site Reliability Engineer (SRE) to support and enhance a complex, multi-cloud Kubernetes platform environment. This role is focused on driving platform reliability, automation, observability, and security across AWS, Azure, and on-premise infrastructure.

The successful candidate will play a key role in improving uptime, reducing operational toil through GitOps and automation, strengthening platform security posture, and enabling scalable onboarding of new tenants and workloads. This is a hands-on engineering role operating within regulated environments and modern cloud-native architectures.