Staff Software Engineer, Billing

Docker · Seattle, WA

$170K–$276K/yr · Full-time · Remote · 1mo ago

AWS · CI/CD · Datadog · Docker · GitHub · GitHub Actions

Responsibilities

The Billing Platform Engineering team owns the systems that make Docker's commercial model real. You'll work on problems like:
How do we design infrastructure that makes AI-generated deployments safe to ship and easy to roll back?
How do we instrument billing systems so that failures - billing miscalculations, entitlement gaps, payment errors - are detected immediately and unambiguously?
How do we build infrastructure that scales with usage-based billing workloads without manual intervention?
How do we make the developer experience on this team faster and more reliable - local environments, CI/CD pipelines, deployment tooling?
Own and evolve the infrastructure supporting Billing Platform services: compute, storage, networking, CI/CD, and observability
Design and maintain IaC (Terraform) for billing system infrastructure on AWS; set module patterns and standards for the team
Build and own observability systems - metrics, logging, alerting - with a focus on billing accuracy and payment reliability
Define deployment patterns and runbooks that work well in an AI-agent-assisted development workflow: clear rollback procedures, safe promotion gates, automated validation
Partner with software engineers on service design - bringing infrastructure constraints and operational requirements into the conversation before code is written
Identify systemic risks and drive improvements that span team or organizational boundaries
Lead incident response for billing system issues. This role may require participation in an on-call rotation to provide support outside of standard business hours, including evenings, weekends, and holidays, as needed.
Mentor engineers across the team; your technical judgment should raise the floor for everyone
First 30 Days
You will ship code in your first week. We run an agent-first development workflow - infrastructure changes start with a plan, specifications are written before generation, and every change is reviewed before it merges -

Requirements

8+ years in platform, infrastructure, or SRE roles supporting production SaaS systems at scale
Deep AWS expertise: ECS or EKS, RDS (Postgres preferred), networking, IAM, cost management - you've operated these systems under real load and real incidents
Expert-level Terraform; you've designed reusable module patterns and set standards others follow
Experience building and owning observability stacks (Datadog, Grafana, or similar) at an organizational level - not just using them
Strong familiarity with CI/CD systems - Jenkins, GitHub Actions, or equivalent - including pipeline design and developer experience ownership
Kubernetes at an operational and architectural level
A track record of identifying systemic risks and driving improvements that span team or organizational boundaries
Security-first mindset: threat modeling, blast radius analysis, least-privilege by default, audit trails as a design requirement
Strong written English; at Staff level, written communication is how you scale your influence across teams
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
What sets you apart

Benefits

Remote work options

Additional Information

Docker has been one of the most loved brands in developer tooling, trusted by more than 20 million monthly users and over 20 billion container image pulls. From solo founders to the world's largest companies, developers rely on Docker to build, share, and run their applications across our suite of products including Docker Desktop, Docker Hub, and Docker Scout. We are a globally distributed, remote-first team building the tools that define how software gets built and delivered.

As AI agents redefine software development, Docker is at the center of that shift, providing the sandboxed environments, verified images, and secure infrastructure that make autonomous workflows trustworthy by default. We're building AI-native development practices into how this team works at a foundational level. That means infrastructure design needs to account for a new kind of collaborator: AI agents that generate, deploy, and operate software. The Staff Infrastructure Engineer on this team will keep systems running, define what safe, observable, AI-assisted infrastructure operations look like in practice, and set the standard that the broader engineering organization follows.
