Vice President, Site Reliability Engineer

External
BlackRock · Edinburgh, UK
Full-time · Hybrid · 1w ago
Observability · Risk Management · Site Reliability Engineering

Responsibilities

  • Act as a client-facing reliability partner, providing a clear point of coordination during incidents, escalations, onboarding, and major operational events
  • Assist with incident management, including technical coordination, issue narrative, stakeholder communication, and follow-through to resolution
  • Partner closely with Technology Client Experience, engineering, and platform teams to ensure reliability issues are understood, owned, and driven to closure end to end
  • Proactively support onboarding and operational readiness for top-tier clients by identifying systemic risks, validating supportability, and ensuring operational standards are met before scale
  • Translate recurring client pain points, escalation themes, and onboarding learnings into actionable systemic reliability improvements across products and platforms
  • Shift reliability left by engaging early in new client onboarding, change planning, and design discussions to proactively surface risks
  • Help navigate the organization to unblock remediation actions, align stakeholders, and accelerate resolution of high-priority client reliability issues
  • Improve engineering culture by reinforcing a deliberate, consistent, and non-reactive approach to client reliability partnership
  • Contribute to architectural, operational readiness, and observability discussions with a focus on client impact, resilience, and supportability
  • Design and improve monitoring, telemetry, and operational visibility for client-critical workflows and journeys
  • Drive detailed root cause investigations for significant client-impacting incidents, with strong focus on prevention and issue avoidance
  • Create and coordinate retrospectives for key incidents and onboarding events, ensuring learnings are captured and translated into concrete follow-up actions
  • Anticipate opportunities to strengthen the resiliency profile of systems and workflows most important to priority clients
  • Act as a culture carrier for SRE principles, helping teams connect engineering decisions to real client experience and trust

Requirements

  • B.S. or M.S. degree in Computer Science, Engineering, or a related discipline, with 5-8 years of experience
  • Strong experience in Site Reliability Engineering, production engineering, or a related reliability-focused role supporting critical systems
  • Demonstrated ability to manage complex incident escalations and coordinate effectively across engineering, product, operations, and stakeholder groups
  • Strong communication skills, including the ability to translate technical issues into clear, credible narratives for senior stakeholders and client-facing partners
  • Experience driving operational readiness, onboarding readiness, or production supportability reviews for high-scale systems or strategic initiatives
  • Strong troubleshooting and problem-solving skills, with the ability to identify both immediate remediation paths and underlying systemic issues
  • Passion for improving the reliability, resilience, and supportability of highly available systems
  • Experience with observability, monitoring, and telemetry tools used to detect, diagnose, and prevent incidents
  • Ability to build strong cross-functional relationships and influence outcomes without direct authority
  • Self-motivated, highly accountable, and comfortable operating in ambiguous, fast-moving environments
  • Knowledge of software development methodologies, release processes, and operational support models
  • Strong analytical thinking and a bias toward proactive risk

Additional Information

About this role

BlackRock Company Overview: BlackRock is a global leader in investment management, risk management, and advisory services for institutional and retail clients. We help clients achieve their goals and overcome challenges with a range of products, including separate accounts, mutual funds, iShares® (exchange-traded funds), and other pooled investment vehicles. We also offer risk management, advisory, and enterprise investment system services to a broad base of institutional investors through BlackRock Solutions®. Headquartered in New York City, as of February 5, 2025, we handle approximately $11.5 trillion in assets under management (AUM) and have around 19,000 employees in offices across 38 countries, with a significant presence in key global markets, including North and South America, Europe, Asia, Australia, the Middle East, and Africa.

Role Overview: We're seeking a Site Reliability Engineer (SRE) for a new Client Services-focused role that combines deep reliability engineering with strong client partnership. This role sits closely aligned with our Technology Client Experience team and complements our embedded SRE model by providing focused reliability engagement for priority clients. You will act as a client-facing reliability partner, helping manage escalations, improve onboarding readiness, surface systemic risks, and translate client pain points into durable engineering improvements.


Interested in this role?

Apply on the company's website.