ITSM Incident & Problem Manager

External
Convera · Santa Ana
Full-time · On-site · 2w ago
Datadog · Documentation · Grafana · Leadership · Move · Observability

Responsibilities

  • Serve as the Incident Manager / Major Incident Manager for high-severity and business-impacting incidents, organizing incident bridges and war rooms and driving rapid triage, clear ownership, and timely decision-making
  • Ensure incidents are properly classified, prioritized, and escalated based on impact and urgency

ITSM Process Ownership & Governance:

  • Enforce ITIL-aligned Incident and Problem Management practices
  • Ensure accurate and complete documentation within ServiceNow, including impact and affected services, incident timelines, and root cause summaries and follow-ups
  • Serve as Problem Manager: identify recurring issues and systemic risks, and ensure RCAs are completed with actionable outcomes
  • Act as a process authority during incidents, ensuring teams adhere to defined ITSM standards

Service Availability, Reliability & KPIs:

  • Own operational oversight of service availability and reliability
  • Monitor and manage key service health indicators, including service availability and uptime, incident volumes and severity trends, MTTR and MTTD, and SLA and OLA adherence
  • Use observability data to proactively identify service degradation and emerging risks
  • Escalate systemic availability or reliability concerns to leadership with data-backed insights

Observability & Operational Intelligence:

  • Actively leverage observability platforms (e.g., Grafana, Datadog)
  • Partner with engineering and SRE teams to improve monitoring coverage, alert quality, and signal-to-noise ratio
  • Ensure alerting and escalation via PagerDuty aligns with service criticality

Communication & Executive Engagement:

  • Serve as the primary communication lead during incidents
  • Deliver concise, executive-level updates that articulate business impact, current status, mitigation steps, and next milestones
  • Translate complex technical details into clear business language
  • Maintain confidence and composure while engaging senior leaders during high-pressure events

Post-Incident & Continuous Improvement:

  • Facilitate or support post-incident reviews
  • Identify trends, gaps, and opportunities for process improvement, tooling enhancement, and better operational readiness
  • Contribute to the evolution of Command Center playbooks, runbooks, and response standards

Requirements

  • 3-6 years of experience in Incident Management, Major Incident / Command Center operations, and production operations or site reliability support
  • Proven experience managing high-severity incidents in 24×7 environments
  • Demonstrated ownership of service reliability and operational KPIs

ITSM & Process Expertise:

  • Strong working knowledge of ITIL / ITSM frameworks
  • Deep hands-on experience with Major Incident workflows and Problem Management
  • Experience enforcing ITSM discipline across distributed technology teams

Skills & Competencies:

  • Exceptional communication and facilitation skills
  • Strong analytical mindset with comfort using metrics and dashboards
  • Ability to operate decisively in high-pressure situations
  • Ability to influence outcomes without formal authority
  • Comfortable interfacing with executive leadership
  • Experience in regulated or customer-critical environments (FinTech, Payments, SaaS)
  • Exposure to ITSM tools such as ServiceNow and PagerDuty
  • Exposure to monitoring tools such as Datadog, Grafana, and Dynatrace

About Convera

Our teams care deeply about the value we bring to our customers, making Convera a rewarding workplace. This is an exciting time for our organization as we build our team with growth-minded, results-oriented people who are looking to move fast in an innovative environment.

As a truly global company with employees in over 20 countries, we are passionate about diversity; we seek and celebrate people from different backgrounds, lifestyles, and unique points of view. We want to work with the best people and ensure we foster a culture of inclusion and belonging.

We offer an abundance of competitive perks and benefits including:

  • Market competitive salary.
  • Great career growth and development opportunities in a global organization.
  • Hybrid schedule with 2 days in the office per week.
  • Generous insurance (health, disability, life).
  • Paid holidays, time-off, and leave policies for life events (maternity, paternity, adoption).
  • Paid volunteering opportunities (5 days per year).

This position follows a shift roster defined by the company. As we operate in a 24/7 environme

Benefits

Health insurance

Additional Information

Incident & Major Incident management

