Vice President, Site Reliability Engineer

External

BlackRock · Edinburgh, UK

Full-time · Hybrid · 1w ago

Observability · Risk Management · Site Reliability Engineering

Responsibilities

Act as a client-facing reliability partner, providing a clear point of coordination during incidents, escalations, onboarding, and major operational events
Assist with incident management, including technical coordination, issue narrative, stakeholder communication, and follow-through to resolution
Partner closely with Technology Client Experience, engineering, and platform teams to ensure reliability issues are understood, owned, and driven to closure end to end
Proactively support onboarding and operational readiness for top-tier clients by identifying systemic risks, validating supportability, and ensuring operational standards are met before scale
Translate recurring client pain points, escalation themes, and onboarding learnings into actionable systemic reliability improvements across products and platforms
Shift reliability left by engaging early in new client onboarding, change planning, and design discussions to proactively surface risks
Help navigate the organization to unblock remediation actions, align stakeholders, and accelerate resolution of high-priority client reliability issues
Improve engineering culture by reinforcing a deliberate, consistent, and non-reactive approach to client reliability partnership
Contribute to architectural, operational readiness, and observability discussions with a focus on client impact, resilience, and supportability
Design and improve monitoring, telemetry, and operational visibility for client-critical workflows and journeys
Drive detailed root cause investigations for significant client-impacting incidents, with strong focus on prevention and issue avoidance
Create and coordinate retrospectives for key incidents and onboarding events, ensuring learnings are captured and translated into concrete follow-up actions
Anticipate opportunities to strengthen the resiliency profile of systems and workflows most important to priority clients
Act as a culture carrier for SRE principles, helping teams connect engineering decisions to real client experience and trust
Skills/Qualifications

Requirements

B.S. or M.S. degree in Computer Science, Engineering, or a related discipline, with 5-8 years of experience
Strong experience in Site Reliability Engineering, production engineering, or a related reliability-focused role supporting critical systems
Demonstrated ability to manage complex incident escalations and coordinate effectively across engineering, product, operations, and stakeholder groups
Strong communication skills, including the ability to translate technical issues into clear, credible narratives for senior stakeholders and client-facing partners
Experience driving operational readiness, onboarding readiness, or production supportability reviews for high-scale systems or strategic initiatives
Strong troubleshooting and problem-solving skills, with the ability to identify both immediate remediation paths and underlying systemic issues
Passion for improving the reliability, resilience, and supportability of highly available systems
Experience with observability, monitoring, and telemetry tools used to detect, diagnose, and prevent incidents
Ability to build strong cross-functional relationships and influence outcomes without direct authority
Self-motivated, highly accountable, and comfortable operating in ambiguous, fast-moving environments
Knowledge of software development methodologies, release processes, and operational support models
Strong analytical thinking and a bias toward proactive risk

Additional Information

About this role

BlackRock Company Overview: BlackRock is a global leader in investment management, risk management, and advisory services for institutional and retail clients. We help clients achieve their goals and overcome challenges with a range of products, including separate accounts, mutual funds, iShares® (exchange-traded funds), and other pooled investment vehicles. We also offer risk management, advisory, and enterprise investment system services to a broad base of institutional investors through BlackRock Solutions®. Headquartered in New York City, as of February 5, 2025, we handle approximately $11.5 trillion in assets under management (AUM) and have around 19,000 employees in offices across 38 countries, with a significant presence in key global markets, including North and South America, Europe, Asia, Australia, the Middle East, and Africa.

Role Overview: We're seeking a Site Reliability Engineer (SRE) for a new Client Services-focused role that combines deep reliability engineering with strong client partnership. This role sits closely aligned with our Technology Client Experience team and complements our embedded SRE model by providing focused reliability engagement for priority clients. You will act as a client-facing reliability partner, helping manage escalations, improve onboarding readiness, surface systemic risks, and translate client pain points into durable engineering improvements.
