Site Reliability Engineer - Disaster Recovery & Business Continuity
Responsibilities
- IT Service Continuity Documentation: Maintain and continuously improve IT service continuity and DR documentation, including service recovery plans, application recovery procedures, and dependencies.
- DR Runbooks & Recovery Sequencing: Partner with infrastructure and application teams to document, review, and standardize DR runbooks (recovery steps, prerequisites, validation checks, and recovery sequencing).
- Backup/Restore & Data Recovery Readiness: Coordinate evidence and periodic validation activities related to backups, restores, and data recovery procedures in collaboration with platform owners.
- Testing & Exercises (Technical and Tabletop): Plan and coordinate DR tests and resilience exercises, including scope, schedules, participant communications, success criteria, evidence collection, and after-action reporting.
- Issue Tracking & Continuous Improvement: Maintain remediation logs, drive follow-through on action items, and report status and risks to stakeholders; incorporate lessons learned into updated runbooks and plans.
- Operational Resilience Governance & Reporting: Track plan review cadence, test completion, and key metrics; support audit-ready evidence collection and risk/compliance requests related to IT resilience.
- Third-Party & Technology Dependency Coordination: Help assess IT resilience considerations for key vendors and dependencies (e.g., SaaS, telecom, data centers) and document contingency approaches with service owners.
- Incident Support & Recovery Communications: During disruptions, support coordination of technical recovery status updates and stakeholder communications in partnership with IT incident management and leadership.
- Business Continuity Coordination (Limited Scope): Support business impact analysis inputs (critical processes, contacts, workarounds) and coordinate periodic awareness/training for non-IT plan owners as needed.
Relevant Skills & Experience
- Experience supporting IT service continuity and/or disaster recovery (DR) for enterprise services and applications, including runbook maintenance and coordinating technical exercises
- Working knowledge of resilience concepts and metrics (RTO, RPO), incident/change management, and IT service management practices
- Ability to coordinate cross-functional IT stakeholders (infrastructure, cloud, network, security, applications) and manage timelines, communications, and documentation across multiple workstreams
- Strong documentation and analytical skills; able to produce clear runbooks, test plans, after-action reports, and remediation tracking
- Familiarity with core IT platforms and enterprise services (identity, networking, virtualization, backups, Windows/Microsoft 365, cloud/SaaS) to understand recovery dependencies
- Comfort working with risk, compliance, and audit stakeholders; experience collecting evidence and supporting control attestations is a plus
- Exposure to business continuity activities (BIA inputs, plan owner coordinatio
Additional Information
About Charles River Associates
For over 50 years, Charles River Associates has been a premier consulting firm that offers employees a place to learn from a diverse group of consultants, industry experts, and academics. At CRA you will be exposed to leading minds who use economic, financial, and business analysis to solve complex world problems for an impressive roster of clients, including major law firms, Fortune 100 companies, and government agencies. Through a collegial environment, formal and informal training opportunities, and a broad array of professional development resources, your experience at CRA will open doors for you throughout your career.
The Information Technology (ITS) department at Charles River Associates is currently a team of more than 40 professionals dedicated to enhancing, maintaining, and developing the firm's technology infrastructure and security. The team is comprised of four functions:
- Service Delivery & Telecom
- Enterprise Application Solutions
- Infrastructure, Networking and Cloud Solutions
- Information Security
Information Technology staff are based in the Boston, Chicago, London, Munich, New York, Oakland, San Francisco, College Station and Washington, DC offices. Mainly a Microsoft house, CRA is looking to maximize the performance of our on-premise systems and hybrid infrastructure, meaning experience with cloud technologies is essential for this role.
Position Overview
The IT Business Continuity Coordinator supports the firm's operational resilience with a primary focus on IT service continuity and disaster recovery (DR) readiness (approximately 80%), and a secondary focus on coordinating business continuity planning with non-IT stakeholders (approximately 20%). This role partners closely with infrastructure, security, application, and service delivery teams to maintain actionable recovery documentation, validate recovery capabilities through testing, and drive ongoing improvements through remediation tracking and lessons learned.