Principal Site Reliability Engineer

DigiCert · Lehi, UT
Full-time · On-site · Today
AWS · Azure · Bash · CI/CD · Compliance · DNS

About the role

DigiCert is a global leader in intelligent trust. We protect the digital world by ensuring the security, privacy, and authenticity of every interaction. Our AI-powered DigiCert ONE platform unifies PKI, DNS, and certificate lifecycle management to secure infrastructure, software, devices, messages, AI content, and agents. Learn why more than 100,000 organizations, including 90% of the Fortune 500, choose DigiCert to stop today's threats and prepare for a quantum-safe future at www.digicert.com.

Job summary

The Platform Ops team within CloudOps is responsible for the reliability, scalability, and modernization of DigiCert's cloud infrastructure. As a Principal SRE, you will own the intersection of software engineering and operations, driving automation-first practices, reducing toil, and accelerating our cloud transformation across AWS, Azure, and GCP environments. You will be a technical force multiplier: raising reliability standards across the organization, defining SLOs that matter, and building the internal platforms and tooling that enable product teams to ship with confidence.

Responsibilities

Reliability Engineering

  • Define, implement, and own SLIs, SLOs, and error budgets for critical platform services
  • Lead blameless post-mortems and drive systemic reliability improvements across the platform
  • Design and implement observability pipelines (metrics, logs, traces) using tools such as Splunk, Prometheus, Grafana, or OpenTelemetry
  • Participate in on-call rotation and serve as an incident commander for P0/P1 events

Cloud Modernization

  • Architect and execute migration strategies from legacy infrastructure to cloud-native patterns (containers, serverless, managed services)
  • Champion adoption of Kubernetes, service mesh, and managed cloud services (EKS, GKE, AKS)
  • Evaluate and introduce emerging cloud technologies that improve availability, cost efficiency, and developer experience
  • Partner with architecture and security teams to embed reliability and compliance into platform design

Automation & Platform Development

  • Build and maintain infrastructure-as-code using Terraform across multi-cloud environments
  • Develop internal tooling, self-service platforms, and golden-path templates that reduce operational burden for development teams
  • Automate operational workflows including provisioning, scaling, patching, and secret rotation
  • Contribute to and maintain CI/CD pipelines (GitHub Actions) to enable safe, frequent deployments

Engineering Leadership

  • Mentor mid-level engineers on SRE principles, distributed systems, and infrastructure best practices
  • Collaborate cross-functionally with product, security, and compliance teams to deliver on platform roadmap commitments
  • Document architectural decisions, runbooks, and platform standards; raise the engineering bar through code and design reviews

What you will have

  • 5+ years of experience in SRE, platform engineering, or infrastructure engineering roles
  • Deep proficiency in at least one major cloud provider (AWS, GCP, or Azure) with working knowledge of multi-cloud environments
  • Strong software engineering skills in Python, Go, or Bash; comfortable writing production-grade automation and tooling
  • Hands-on Kubernetes experience: cluster operations, workload management, networking (CNI/service mesh), and security (RBAC, pod security)
  • Infrastructure-as-code expertise with Terraform or equivalent; experience with GitOps workflows
  • Proven experience designing and operating observability systems and responding to production incidents at scale
  • Strong understanding of networking fundamentals: DNS, TLS/PKI, load balancing, and zero-trust networking concepts

Requirements

  • Experience in PKI, certificate lifecycle management, or security-adjacent infrastructure
  • Familiarity with compliance frameworks such as SOC 2, FedRAMP, or ISO 27001 in cloud environments
  • Prior experience driving cloud migration or modernization programs at scale
  • Contributions to open-source infrastructure or platform projects
  • Professional-level AWS, GCP, or Azure certifications, or Kubernetes certifications (e.g., AWS Solutions Architect Professional, CKA/CKS)

What success looks like

Working at DigiCert CloudOps

  • Greenfield modernization: we are actively migrating workloads and building new platform capabilities, so you'll shape the architecture, not just maintain it
  • Engineering-first culture with a strong bias toward automation, GitOps, and platform thinking
  • Cross-functional visibility: Platform Ops partners directly with product, security, and compliance, so your work has organization-wide impact
  • Competitive compensation, equ

Benefits

Vision insurance

Interested in this role?

Apply on the company's website.
