Principal Site Reliability Engineer

Okta · Bengaluru, India
Full-time · On-site · 1d ago
Leadership · Move · Observability · Python · Terraform


Responsibilities

Reliability Strategy & Architecture

  • Define and drive the reliability strategy for critical product and platform services.
  • Establish standards for availability, resilience, observability, incident management, and operational readiness.
  • Lead architecture reviews for critical services and platform initiatives.
  • Partner with engineering leaders to ensure reliability objectives align with business priorities and customer expectations.
  • Create frameworks, standards, and operational guardrails that enable engineering teams to operate safely at scale.
  • Guide service architecture toward simplicity, scalability, resilience, and operational excellence.
  • Drive major initiatives that improve platform maturity and long-term sustainability.

Product & Platform Leadership

  • Own reliability architecture and operational excellence for the Spera / ISPM product area.
  • Collaborate closely with engineering leadership to establish reliability objectives and technical roadmaps.
  • Lead large-scale scalability, resiliency, and performance initiatives.
  • Partner with platform and product engineering teams to build self-service operational capabilities that improve developer productivity while strengthening reliability and security.
  • Influence technical direction through data-driven recommendations, engineering expertise, and collaborative leadership.
  • Support highly available, large-scale cloud environments as part of an on-call rotation.

Engineering & Automation

  • Design, build, and operate large-scale cloud infrastructure and production services.
  • Develop software, automation, and infrastructure using Go, Python, Terraform, and related technologies.
  • Eliminate operational toil through automation, tooling, and platform engineering.
  • Improve deployment safety, operational workflows, and platform consistency through GitOps and Infrastructure-as-Code practices.
  • Collaborate on modernizing existing workloads and aligning them with evolving platform capabilities.
  • Lead complex engineering initiatives from conception through production rollout and long-term operational ownership.

Technical Leadership

  • Mentor Staff and Senior engineers across multiple teams and organizations.
  • Lead technical reviews, design reviews, and operational readiness assessments.
  • Build engineering consensus across teams with differing priorities and objectives.
  • Help develop the next generation of technical leaders within Okta.
  • Drive adoption of reliability engineering best practices across

Benefits

Flexible schedule

Additional Information

Secure Every Identity, from AI to Human

Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence. This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk.

Get to know Okta

Okta is The World's Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our Workforce and Customer Identity Clouds enable secure yet flexible access, authentication, and automation that transform how people move through the digital world, putting Identity at the heart of business security and growth.

At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box; we're looking for lifelong learners and people who can make us better with their unique experiences. Join our team! We're building a world where Identity belongs to you.

The Engineering Opportunity

We are seeking a Principal Site Reliability Engineer to serve as a technical leader for reliability engineering within Okta's Emerging Products Group (EPG). This role extends beyond operating production systems. You will define technical strategy, influence platform architecture, establish reliability standards, and lead transformational initiatives that improve scalability, resilience, security, and operational excellence for one of Okta's fastest-growing product areas.

Initially, you will partner closely with the Spera / Identity Security Posture Management (ISPM) engineering organization to establish reliability strategy, operational excellence, and platform maturity. Over time, you will help drive broader reliability initiatives across EPG and contribute to the evolution of reliability engineering practices across multiple products, including Workflows, IGA, PAM, and ISPM.

You will work closely with engineering leadership, product leadership, architects, and Staff engineers to shape the future of Okta's cloud infrastructure and reliability practices. The ideal candidate combines deep technical expertise with strong organizational influence and has a proven track record of leading large-scale engineering initiatives that drive measurable business outcomes.


