Principal Site Reliability Engineer

DigiCert · Lehi, UT

Full-time · On-site · Today

AWS · Azure · Bash · CI/CD · Compliance · DNS

About the role

DigiCert is a global leader in intelligent trust. We protect the digital world by ensuring the security, privacy, and authenticity of every interaction. Our AI-powered DigiCert ONE platform unifies PKI, DNS, and certificate lifecycle management to secure infrastructure, software, devices, messages, AI content, and agents. Learn why more than 100,000 organizations, including 90% of the Fortune 500, choose DigiCert to stop today's threats and prepare for a quantum-safe future at www.digicert.com.

Job summary

The Platform Ops team within CloudOps is responsible for the reliability, scalability, and modernization of DigiCert's cloud infrastructure. As a Principal SRE, you will own the intersection of software engineering and operations: driving automation-first practices, reducing toil, and accelerating our cloud transformation across AWS, Azure, and GCP environments. You will be a technical force multiplier: raising reliability standards across the organization, defining SLOs that matter, and building the internal platforms and tooling that enable product teams to ship with confidence.

Responsibilities

Reliability Engineering
Define, implement, and own SLIs, SLOs, and error budgets for critical platform services
Lead blameless post-mortems and drive systemic reliability improvements across the platform
Design and implement observability pipelines (metrics, logs, traces) using tools such as Splunk, Prometheus, Grafana, or OpenTelemetry
Participate in on-call rotation and serve as an incident commander for P0/P1 events

Cloud Modernization
Architect and execute migration strategies from legacy infrastructure to cloud-native patterns (containers, serverless, managed services)
Champion adoption of Kubernetes, service mesh, and managed cloud services (EKS, GKE, AKS)
Evaluate and introduce emerging cloud technologies that improve availability, cost efficiency, and developer experience
Partner with architecture and security teams to embed reliability and compliance into platform design

Automation & Platform Development
Build and maintain infrastructure-as-code using Terraform across multi-cloud environments
Develop internal tooling, self-service platforms, and golden-path templates that reduce operational burden for development teams
Automate operational workflows including provisioning, scaling, patching, and secret rotation
Contribute to and maintain CI/CD pipelines (GitHub Actions) to enable safe, frequent deployments

Engineering Leadership
Mentor mid-level engineers on SRE principles, distributed systems, and infrastructure best practices
Collaborate cross-functionally with product, security, and compliance teams to deliver on platform roadmap commitments
Document architectural decisions, runbooks, and platform standards; raise the engineering bar through code and design reviews

What you will have

5+ years of experience in SRE, platform engineering, or infrastructure engineering roles
Deep proficiency in at least one major cloud provider (AWS, GCP, or Azure) with working knowledge of multi-cloud environments
Strong software engineering skills in Python, Go, or Bash; comfortable writing production-grade automation and tooling
Hands-on Kubernetes experience: cluster operations, workload management, networking (CNI/service mesh), and security (RBAC, pod security)
Infrastructure-as-code expertise with Terraform or equivalent; experience with GitOps workflows
Proven experience designing and operating observability systems and responding to production incidents at scale
Strong understanding of networking fundamentals: DNS, TLS/PKI, load balancing, and zero-trust networking concepts

Requirements

Experience in PKI, certificate lifecycle management, or security-adjacent infrastructure
Familiarity with compliance frameworks such as SOC 2, FedRAMP, or ISO 27001 in cloud environments
Prior experience driving cloud migration or modernization programs at scale
Contributions to open-source infrastructure or platform projects
AWS/GCP/Azure professional-level certifications (e.g., AWS Solutions Architect Professional, CKA/CKS)

What success looks like

Working at DigiCert CloudOps

Greenfield modernization: we are actively migrating workloads and building new platform capabilities; you'll shape the architecture, not just maintain it
Engineering-first culture with a strong bias toward automation, GitOps, and platform thinking
Cross-functional visibility: Platform Ops partners directly with product, security, and compliance; your work has organization-wide impact
Competitive compensation, equ

Benefits

Vision insurance
