Incident Response Engineer II
Requirements
- 3-5+ years of experience in Incident Management, Production Support, Technical Operations, NOC, SRE, or related operational roles.
- Strong understanding of Incident Management, Major Incident Management, Problem Management, and Root Cause Analysis processes.
- Experience working in SaaS, Cloud, or Enterprise Software environments.
- Experience with monitoring and observability platforms such as API, Grafana, Kibana, or similar tools.
- Experience with Jira, Confluence, ServiceNow, PagerDuty, or similar incident management platforms.
- Strong troubleshooting, analytical, and problem-solving skills.
- Excellent verbal and written communication skills with the ability to manage technical and business stakeholders during critical incidents.
- Good understanding of APIs, Microservices Architecture, Databases, Distributed Systems, and Cloud technologies.
- Good knowledge of AWS, Azure, or GCP environments.
- Ability to work in a fast-paced, high-pressure environment and manage multiple priorities simultaneously.
- Experience in customer communication, stakeholder management, and operational reporting is preferred.
- Lead a new category of enterprise software that we call Unified-CXM.
- Empower companies to deliver next generation, unified engagement journeys that reimagine the customer experience.
- Create a culture of customer obsession, with trust, teamwork, and accountability.
Benefits
Additional Information
Sprinklr is the definitive, AI-native platform for Unified Customer Experience Management (Unified-CXM), empowering brands to deliver extraordinary experiences at scale - across every customer touchpoint. By combining human instinct with the speed and efficiency of AI, Sprinklr helps brands earn trust and loyalty through personalized, seamless, and efficient customer interactions. Sprinklr's unified platform provides powerful solutions for every customer-facing team - spanning social media management, marketing, advertising, customer feedback, and omnichannel contact center management - enabling enterprises to unify data, break down silos, and act on real-time insights. Today, 1,900+ enterprises and 60% of the Fortune 100 rely on Sprinklr to help them deliver consistent, trusted customer experiences worldwide.
Job Description
- Analyze, troubleshoot, and resolve customer-impacting issues in a timely manner.
- Monitor production systems, service health dashboards, alerts, and operational tools to proactively identify customer-impacting incidents.
- Lead incident response activities and coordinate with Engineering, Product, Infrastructure, and Support teams to drive timely resolution.
- Perform incident triage, impact assessment, and severity classification for production incidents.
- Manage major incident bridges and ensure timely stakeholder communication throughout the incident lifecycle.
- Debug and analyze service disruptions, platform degradation, and customer-impacting issues, escalating when required.
- Collaborate with cross-functional teams to identify root causes and implement corrective and preventive actions.
- Drive Root Cause Analysis (RCA) reviews and ensure action items are tracked through closure.
- Review monitoring effectiveness and identify opportunities to improve alert coverage and operational visibility.
- Assist in developing and maintaining operational runbooks, incident documentation, and response procedures.
- Track incident trends, service reliability metrics, and operational performance to drive continuous improvement.
- Leverage incident insights and operational data to recommend reliability, monitoring, and process improvements.