Incident Response Analyst II
External · Full-time · On-site · Today
Airflow · Compliance · Documentation
Responsibilities
- Incident and Event Management
- Investigate, acknowledge, and respond to alarms and abnormal operating conditions.
- Act as the first line of defense for facility events using monitoring and automation platforms.
- Assess the severity and operational impact of incidents and determine appropriate escalation paths.
- Coordinate incident bridges and communication during critical events.
- Serve as incident coordinator during major facility incidents.
- Maintain incident records and update ticketing systems with detailed actions and event timelines.
- Facilitate communications with site operations teams, vendors, engineering teams, and management stakeholders.
- Conduct preliminary root cause analysis (RCA) and identify recurring issues.
- Support operational improvement initiatives and lessons learned activities.
- Ensure compliance with SOPs, MOPs, EOPs, Runbooks, and Playbooks.
- This position works rotating shifts in a 24x7 operation.
- Facilities Monitoring and Alarm Operations
- The Facilities Operations Center (FOC) continuously monitors critical facility infrastructure to ensure uptime, reliability, and operational stability of the data center environment.
- Monitor Building Management Systems (BMS), Data Center Infrastructure Management (DCIM), and Electrical Power Monitoring Systems (EPMS).
- Identify, classify, and acknowledge alarms.
- Evaluate incident criticality and operational impact.
- Escalate issues to on-site technicians, facilities engineers, or management in accordance with escalation procedures.
- Track incidents through resolution and maintain communication with stakeholders.
- Ensure all alarm activities are documented accurately within ticketing systems.
- Perform duties in accordance with SOPs, MOPs, EOPs, Runbooks, and Playbooks.
- Monitor Closed-Circuit Television (CCTV). Familiarity with systems such as Lenel, Genetec, and Avigilon is preferred.
- Review camera footage to validate incidents and support investigations.
- Maintain incident reports and event logs.
- Follow SOPs, MOPs, EOPs, Runbooks, and Playbooks.
- Critical Event and Emergency Response
- FOC Analysts support the management of emergency situations and critical infrastructure events affecting data center operations.
- Coordinate response activities during utility outages, equipment failures, environmental alarms, and emergency conditions.
- Maintain communication bridges and provide status updates to stakeholders.
- Support emergency procedures during fire alarms, generator operations, cooling failures, and site evacuation events.
- Coordinate with vendors, facilities engineers, and local site teams to expedite restoration efforts.
- Document event timelines, response actions, and lessons learned.
- Participate in emergency drills and business continuity exercises.
- Ensure adherence to Emergency Operating Procedures (EOPs).
- Reporting and Continuous Improvement
- FOC Analysts contribute to operational excellence by maintaining accurate records and supporting process improvements.
- Produce incident reports and shift summaries.
- Maintain accurate documentation of alarms, escalations, and corrective actions.
- Support trend analysis and recurring issue identification.
- Participate in Root Cause Analysis (RCA) and post-incident reviews.
- Recommend improvements to procedures, runbooks, and escalation paths.
- Assist with KPI and SLA reporting.
- Support continuous improvement initiatives and operational excellence programs.
Requirements
- Required Qualifications
- 2+ years of experience in a Facilities Operations Center (FOC), Network Operations Center (NOC), Command Center, Critical Environment, or similar 24x7 operational environment.
- Experience supporting mission-critical facilities or data center operations.
- Technical Knowledge
- Working knowledge of:
- Building Management Systems (BMS)
- Data Center Infrastructure Management (DCIM)
- Electrical Power Monitoring Systems (EPMS)
- Critical power and cooling infrastructure
- Fire detection and suppression systems
- Environmental monitoring systems
- CCTV and Access Control Systems
- Incident management and ticketing systems
- Soft Skills
- Strong analytical and problem-solving skills.
- Ability to prioritize and manage multiple concurrent incidents.
- Excellent written and verbal communication skills.
- Abi
Benefits
Vision insurance
Additional Information
Knowledge, Skills & Abilities: