Lead the end-to-end management of Major IT Incidents, acting as the primary coordinator from initial detection through resolution and closure.
Establish and manage incident bridges, ensuring appropriate technical ownership, accountability, and progress tracking throughout the incident lifecycle.
Drive disciplined incident execution aligned to ITIL Incident Management standards, including escalation, communications, and closure requirements.
Business Impact & Decision Support
Engage key decision makers to rapidly assess and articulate business impact, enabling prioritization, escalation decisions, and leadership engagement.
Provide concise, accurate, and executive-ready updates to IT leadership and business stakeholders during high-impact events.
Monitoring, Detection & Escalation
Maintain situational awareness across enterprise monitoring tools, NOC alerts, and collaboration channels to identify potential major incidents early.
Ensure unaddressed or emerging alerts are escalated appropriately and assigned to the correct resolver teams.
Proactively initiate early notifications when conditions indicate a potential escalation to Major Incident status.
Cross-Functional & Vendor Coordination
Coordinate internal IT teams and external service partners, ensuring the right technical resources are engaged at the right time.
Support effective collaboration between internal technical leaders and vendor-based resolver teams during critical incidents.
Communications & Stakeholder Engagement
Deliver timely, consistent, and accurate communications to impacted users, business partners, and leadership.
Support incident notification and communications processes using approved enterprise tools and channels.
Ensure communications remain aligned with verified technical status and approved messaging.
Continuous Improvement & Governance
Ensure incidents are fully documented, including timelines, decisions, actions, and outcomes.
Participate in or facilitate post-incident reviews, identifying root causes, lessons learned, and improvement opportunities.
Contribute to ongoing refinement of incident management processes, tools, and training materials.
Innovation & Automation
Identify opportunities to leverage automation, analytics, and AI-enabled capabilities to improve incident detection, response, and reporting.
Collaborate with relevant teams to recommend enhancements that increase operational efficiency and resilience.
Required:
Bachelor's Degree with 6 years' experience; Master's Degree with 5 years' experience; PhD with 0 years' experience.
Experience in IT Incident Management, Major Incident Management, or an equivalent operational leadership role.
Strong understanding of ITIL-based service management practices (ITIL certification preferred).
Proven experience coordinating and leading major incident bridge calls, including presenting summaries, ensuring identified topics are addressed, and supporting cross-team collaboration to identify, document, and communicate key business details to leaders.
Ability to follow technical discussions, capture key details, and ensure impact and scope assessments are updated and communicated appropriately.
Good understanding of IT infrastructure: servers, networks, databases, operating systems, and cloud environments.
Familiarity with monitoring, alerting, and event management tools.
Experience with ITSM tools such as ServiceNow, Jira Service Management, or similar platforms.
Working knowledge of root cause analysis (RCA) methods and post-incident review practices.
Understanding of cybersecurity basics, incident escalation, and service continuity.
Ability to read and interpret technical logs, alerts, dashboards, and system metrics.
Knowledge of application support, infrastructure dependencies, and production environments.
Familiarity with communication and collaboration tools used during incident bridges and war rooms.
Demonstrated ability to lead during high-pressure, time-sensitive situations.
Excellent written and verbal communication skills, with experience providing executive-
Additional Information
The Incident Duty Manager (IDM) is responsible for leading the enterprise response to high-severity IT incidents, ensuring rapid service restoration, minimal business impact, and clear, timely communication to stakeholders and leadership. Operating within the Incident Command Center (ICC), this role provides centralized coordination across technical teams, external service partners, monitoring functions, and SWATS activities.
The Incident Duty Manager plays a critical role in incident detection, escalation, command-and-control, and post-incident improvement, working in alignment with ITIL-based Incident Management practices. This position requires strong leadership under pressure, excellent communication skills, and the ability to operate effectively in a 24x7, high-availability environment.