Site Reliability Engineer

Snapp · Tehran, Tehrān, Iran, Islamic Republic Of

Full-time · On-site · 6mo ago

Python · Kubernetes · Redis

About the role

In this role, you will strengthen the SRE Platform team's mission by advancing the foundational platforms that automate manual workflows and elevate system reliability. Your work will ensure our staging environments remain stable and production-like, empowering QA and development teams to test, validate, and deploy their applications with confidence. You will also contribute to operational excellence through active participation in the weekly on-call rotation, supporting consistent and dependable infrastructure performance.

Responsibilities

Automate and optimize operational processes
Enhance and maintain the observability stack
Oversee test/staging environments management
Develop and support critical production components
Handle and resolve production incidents
Participate in the on-call rotation

Requirements

Strong teamwork and collaboration skills
Solid understanding of SRE concepts, including SLIs, SLOs, SLAs, and Error Budgets
Proficiency in Python or another scripting language
Strong grasp of software engineering principles
Hands-on experience with observability and monitoring tools such as Prometheus and Grafana
Familiarity with logging stacks (e.g., ELK, Loki) and tracing systems (e.g., Jaeger, Tempo)
Understanding of RDBMS and Redis
Experience working with Kubernetes and related tooling (e.g., Helm)
