Disaster Recovery and Major Incident Response Manager
Responsibilities
Active and Backup Responder on Duty (AROD/BROD)
- Serve as Active or Backup Responder on Duty (AROD/BROD) on a scheduled rotation for major incidents, declared disasters, and extended outages.
- Function as the single point of command and escalation during DR and major incidents in coordination with the Executive on Duty (EOD).
- Assess incident severity and determine when to escalate to disaster recovery activation in collaboration with IT EOD and business leadership.
- Coordinate cross‑functional response efforts involving infrastructure, application teams, cybersecurity, vendors, and business operations.
- Lead real‑time incident coordination calls and ensure clear task assignment, escalation, and decision tracking.
- Ensure effective shift handoffs, documentation continuity, and leadership coverage during prolonged or multi‑day incidents.
Disaster Recovery Execution & Oversight
- Own the activation and execution of disaster recovery plans and runbooks during declared events.
- Coordinate technical recovery activities across infrastructure, platform, and application teams.
- Ensure application recovery is validated by appropriate application and business owners prior to declaring service restoration.
- Maintain operational oversight for prolonged recovery efforts, including shift coverage, resource planning, and vendor engagement.
- Ensure recovery actions are executed in accordance with approved DR standards, policies, and tiering requirements.
DR Program Governance & Readiness
- Partner with the DR/BC Governance function to maintain enterprise DR readiness across all application tiers.
- Own the creation, maintenance, and continuous improvement of disaster recovery and major incident playbooks to ensure they are:
  - Present for all in‑scope applications
  - Technically accurate and executable
  - Reviewed and validated on a defined cadence
- Partner with IT Operations, Infrastructure, Applications, and Cybersecurity teams to validate technical accuracy and operational effectiveness of disaster recovery and major incident playbooks.
- Support application tiering decisions and ensure recovery strategies align to business impact and risk tolerance.
- Lead the planning, execution, and facilitation of disaster recovery testing, tabletop exercises, and simulations; ensure findings are documented and tracked to closure.
- Ensure exercise outcomes, identified gaps, and remediation actions are documented, tracked, and resolved within defined timeframes.
- Ensure DR processes align with internal policies, regulatory requirements, and audit expectations.
Post‑Incident Review & Continuous Improvement
- Lead the creation of Root Cause Analysis (RCA) documents and/or postmortem reviews following major incidents and disaster recovery events.
- Ensure lessons learned, control gaps, and process improvements are documented and assigned to accountable owners.
- Track remediation actions through completion and provide status updates to leadership and governance committees.
- Identify recurring incident patterns or recovery risks and recommend corrective actions.
- Develop and presen
Benefits
Additional Information
How you move is why we're here.® Now more than ever. Get back to what you need and love to do. The possibilities are endless...
Now more than ever, our guiding principles are helping us in our search for exceptional talent - candidates who align with our unique workplace culture and who want to maximize the abundant opportunities for growth and success. If this describes you then let's talk!
HSS is consistently among the top-ranked hospitals for orthopedics and rheumatology by U.S. News & World Report. As a recipient of the Magnet Award for Nursing Excellence, HSS was the first hospital in New York City to receive the distinguished designation. Whether you are early in your career or an expert in your field, you will find HSS an innovative, supportive and inclusive environment. Working with colleagues who love what they do and are deeply committed to our Mission, you too can be part of our transformation across the enterprise.
Emp Status: Regular Full time
Work Shift:
Compensation Range: The base pay scale for this position is $128,500.00 - $196,375.00. In addition, this position will be eligible for additional benefits consistent with the role. The salary of the finalist selected for this role will be determined based on various factors, including but not limited to: scope of role, level of experience, education, accomplishments, internal equity, budget, and subject to Fair Market Value evaluation. The hiring range listed is a good faith determination of potential compensation at the time of this job advertisement and may be modified in the future.