AI DevOps & Reliability Engineer
Responsibilities
- Delivery & Release Engineering
- Design and expand deployment automation, advancing the org toward on-demand and continuous production releases.
- Establish release practices and standards: progressive delivery, rollback, release tracking, and a deployment inventory teams can trust.
- Extend automation deeper into production paths, reducing manual steps and release toil.
- Enable verification through automation: quality gates as code, with build engineering that supports these efforts.
- Pipelines & Guardrails
- Own CI/CD standards across teams: quality gates, automated checks, guardrails that catch problems before production.
- Build pipeline tooling that makes the safe path the easy path for engineers.
- Environments
- Design and build out dev, staging, and on-demand (ephemeral) environments that mirror production and spin up on request.
- Treat environment provisioning as a product: fast, reproducible, self-service.
- AI-Embedded Ops
- Bring AI tooling into operations: automated runbook generation, intelligent alerting, AI-assisted incident response, operational tooling.
- Help build an org-wide, AI-augmented ops practice and share patterns across teams.
- This is a core part of the role, aligned with Branch's broader AI direction.
- Infrastructure & GitOps
- Champion Infrastructure as Code (Terraform / CloudFormation) for provisioning, configuration, and lifecycle management.
- Drive GitOps-based delivery with Argo CD for secure, repeatable, scalable deployments across Kubernetes.
- Operational Reliability
- Bring a strong reliability foundation: alerting practices, on-call, runbooks, SLI/SLO definition, incident response.
- Partner with engineering teams on the operational practices that keep their services healthy at high volume.
- Operate and tune high-volume data infrastructure: streaming pipelines (Kafka) and SQL/NoSQL datastores under heavy production load.
- Strengthen team-level runbooks, operational readiness, and production hygiene; feed improvements back into the platform.
- Embedded Team Work
- Embed with an assigned engineering team day-to-day, working hands-on with them on infrastructure, deployment, and reliability work.
- Mentor team engineers on operational best practices, observability, and reliability.
- Help build the team's capability over time so good practices stick.
- Engineering Metrics
- Stand up DORA metrics (lead time, deployment frequency, change failure rate, MTTR) and use them to target real improvements.
- Make delivery and reliability health visible to teams and leadership.
- Leadership & Partnership
- Work with engineering leadership on the operations and delivery roadmap.
- Drive cross-team adoption of standards and tooling through collaboration and influence.
Requirements
- Hands-on experience adopting AI into DevOps and SRE practices (Claude Code, Cursor, agents
Additional Information
At Branch, we power every touchpoint with links that work and insights that prove it. From click to conversion, we make growth measurable. Our unparalleled attribution, backed by AI-enhanced linking, is trusted to deliver seamless experiences that increase ROI, decrease wasted spend, and eliminate siloed attribution.

We bring the same rigor to how we build our team, by empowering our people to move fast, own outcomes, and build something that matters. We take pride in making meaningful investments in our team's health, wealth, and growth so individuals can thrive as we scale. Our culture values smart, humble, and collaborative teammates who take accountability and drive results in an environment where their work truly moves the business forward. We are innovative, scaling with purpose, and led by seasoned leaders who know how to build enduring companies. Trusted by brands like Instacart, Western Union, NBCUniversal, ZocDoc, and Sephora, we're big enough to matter, small enough for you to make a real impact. If you're excited by the grit of building, rapid learning, and shaping the future of customer growth, you'll find your place here.

About The Group
We're hiring an AI DevOps & Reliability Engineer to own how software ships and runs at Branch. The role has two areas: half central platform and standards work, half embedded with an engineering team.

Centrally, you'll build and operate the delivery platform (CI/CD pipelines, deployment automation, environments) so teams can release safely, frequently, and on demand. Embedded, you'll work hands-on with an engineering team day-to-day on their infrastructure, deployment, and operational practices, mentoring them and building their capability over time.

You'll also lead the adoption of AI in DevOps and SRE work at Branch. Bringing modern AI tooling (Claude Code, agentic workflows) into runbook generation, alerting, incident response, and operational tooling is a core part of this role, not a side project. It's a strategic direction we're committed to. As a lead, you'll work directly with engineering leadership to shape the operations and delivery roadmap across multiple milestones.