Staff Infrastructure Engineer - Kubernetes platform
About the role
SentinelOne is a company at the intersection of AI and security, pioneering a new operating model for cybersecurity. Our AI-native platform unifies protection across endpoint, cloud, identity, data, and AI systems to deliver autonomous detection and response with clarity and speed. By combining real-time analytics, intelligent automation, and a unified data foundation, we reduce noise, simplify complexity, and empower security teams to focus on what truly matters. Our teams are builders, problem-solvers, and innovators committed to shaping the future of security. If you are excited to solve hard problems alongside talented, mission-driven people, we invite you to help us build a safer future for humanity.

What Are We Looking For?

We're looking for people who are relentlessly curious and committed to continuous learning. AI is reshaping every function across our business, and we enable every team member, regardless of role or level, to build fluency in AI tools and concepts. Those who thrive here actively seek out new solutions, experiment thoughtfully, and apply what they learn to drive better, faster, smarter outcomes.

As a Staff (tech-lead-level) Infrastructure Engineer, you will lead the design, implementation, and evolution of the cloud infrastructure platforms that power SentinelOne's products at scale. You will drive complex, cross‑functional initiatives end‑to‑end, set technical direction for Kubernetes and cloud infrastructure (70+ K8s clusters, tens of thousands of nodes, multi‑region AWS (EKS) and GCP (GKE), multi‑account setup), and raise the bar for reliability, performance, and security. You will act as a go‑to expert and mentor for other engineers, helping them develop, deliver, and operate services efficiently on top of our platform.

What Will You Do?
Primary responsibilities include:

Own, design, build, and evolve systems and large, impactful initiatives (complex, ambiguous problems) across Kubernetes and cloud infrastructure that allow our engineers to deliver features safely, quickly, and reliably - such as the cloud infrastructure platform we're building.
Maintain and enhance large‑scale Kubernetes infrastructure, and troubleshoot and resolve complex issues in the platform and related cloud services, including performance, scalability, and reliability, as well as high availability, observability, security, and cost-efficiency challenges.
Ensure our Kubernetes infrastructure and surrounding ecosystem are fully automated using IaC (Terraform, including writing, maintaining, and improving modules and tooling) and GitOps (ArgoCD), and adhere to best practices in security and compliance. You will also define and evolve standards and patterns across teams and guide others in adopting these practices.
Support Engineering teams as a trusted partner, helping them adopt and use the platform effectively (repeatable, self‑service infrastructure), and influence roadmaps with infrastructure and reliability considerations.
Own and continuously improve key metrics around performance, throughput, reliability, and failure rates, including capacity planning and deployment strategies.
Contribute to and sometimes lead incident response efforts, post‑incident reviews, and systemic improvements to prevent recurrence.
As a Staff Engineer, you will be expected to mentor and coach other engineers, fostering a culture of knowledge sharing, continuous learning, and high‑quality engineering practices.

What Skills and Knowledge Will You Bring?

Ideal candidates will have:

Multiple years of hands-on Kubernetes administration experience, including maintenance, configuration, and operations in large-scale production environments - such as dozens of clusters, hundreds of nodes, multi‑region, or multi‑account setups.
Experience with managed offerings like EKS and/or GKE. kOps and others are a plus.
Strong experience configuring and troubleshooting ingress controllers (like Traefik or Gloo) and production experience maintaining standard add-ons (like cert‑manager, Karpenter, KEDA, Cilium, CoreDNS, etc.).
Deep, practical experience with GitOps tooling and workflows, ideally ArgoCD and/or Flux, and with CI/CD pipelines, ideally GitHub Actions, including designing and operating workflows for infrastructure and application delivery.
Strong experience with Terraform or Terragrunt, including designing and maintaining reusable modules and shared infrastructure components.
Ability to design automation and tooling to streamline day-t
Additional Information
Our Purpose

At SentinelOne, we are driven by a clear purpose: to give the advantage to those who secure our future. As AI reshapes how organizations build, operate, and innovate, the responsibility to protect them becomes more critical than ever. When you join SentinelOne, your work helps protect global enterprises, critical infrastructure, and the technologies shaping tomorrow. If you are motivated by meaningful challenges and want your impact to be real, measurable, and global, you will find purpose here.