Staff Site Reliability Engineer
About the role
We're looking for a Staff Site Reliability Engineer to help define and build the reliability foundation for Thrive Market's platform. You'll work with a first-class group of engineers to establish our SRE practice from the ground up: defining SLOs, SLIs, and error budgets, building observability into everything we do, and creating the frameworks that ensure our systems scale reliably during our company's rapid growth. This is a high-impact role at an exciting inflection point. We've recently containerized our entire platform on Kubernetes, and we're evaluating a potential migration to a next-generation ecommerce platform. You'll balance hands-on reliability work with the strategic thinking needed to build systems that self-heal and get better over time. If you've read books like The Google SRE Handbook, The Phoenix Project, Accelerate, or The DevOps Handbook, this is the right place for you!