Incident Manager

AssetMark · Charlotte, NC

$80K–$110K/yr · Full-time · Hybrid · 2w ago

AWS · Azure · Documentation · GCP · Jira · Leadership

Responsibilities

End-to-End Problem Management (Sev1-Sev5)
Own production issues from detection through full resolution
Quickly assess impact and assign severity (Sev1-Sev5)
Lead triage, investigation, and resolution efforts
Maintain clear ownership throughout the lifecycle, regardless of which teams are involved
Drive fast, effective restoration of service
Resolve More, Closer to the Team
Directly investigate and resolve issues whenever possible
Partner closely with operations and reliability teams to resolve issues without unnecessary escalation
Reduce dependency on engineering teams for repeat or well-understood problems
Build reusable knowledge and patterns to improve team self-sufficiency
Root Cause Analysis & Prevention
Perform and/or lead root cause analysis (RCA)
Identify recurring patterns and systemic weaknesses
Drive fixes that prevent entire classes of issues from recurring
Ensure issues are fully resolved, not just temporarily mitigated
Incident Leadership & Communication
Lead real-time response for high-impact production issues
Coordinate cross-functional teams with clarity and urgency
Communicate clearly with stakeholders, including leadership, during active incidents
Provide structured updates on impact, progress, and next steps
Process, Tooling & Continuous Improvement
Improve incident management processes, workflows, and operating models
Build and maintain runbooks and response procedures
Identify opportunities for automation and better monitoring
Ensure high-quality documentation and knowledge sharing
What You Bring
Required Experience & Skills
5+ years of experience in incident management, site reliability engineering (SRE), production operations, or similar roles
Proven ability to lead and resolve production issues under pressure
Strong technical breadth across systems, applications, and infrastructure
Ability to diagnose and troubleshoot issues directly, not just coordinate response
Excellent communication skills: clear, concise, and composed under pressure
Strong sense of ownership and accountability
Analytical mindset with strong problem-solving skills

Requirements

Experience in high-availability, large-scale production environments
Familiarity with tools such as ServiceNow, Jira Service Management, or PagerDuty
Experience with cloud platforms (AWS, Azure, or GCP)
Familiarity with monitoring and observability tools
Knowledge of ITIL frameworks (helpful, but not required)
How We Measure Success
Success in this role is defined by outcomes:
Faster time to restore service (MTTR)
More issues resolved directly within the incident management / operations function
Reduction in high-severity issues (Sev1 / Sev2)
Fewer recurring issues due to strong root cause resolution
Improved system reliability and stakeholder confidence
What Makes This Role Different
You are not a ticket router; you are a problem solver
You don't just respond to incidents; you prevent them from happening again
You work across the stack, not within a silo
Your work directly improves both system reliability and engineering productivity
Compensation: The base salary range for this position is $80,000 to $110,000.
Candidates must be legally authorized to

Additional Information

Job Description: AssetMark is a leading strategic provider of innovative investment and consulting solutions serving independent financial advisors. We provide investment, relationship, and practice management solutions that advisors use in helping clients achieve wealth, independence, and purpose.

The Job/What You'll Do: We are looking for an experienced Incident Manager to own the end-to-end lifecycle of production issues across our technology platforms and services. This role goes beyond traditional incident coordination. Incident Managers are hands-on operators responsible for driving rapid service restoration, resolving issues directly whenever possible, and eliminating recurring problems at their source. You will work across the full technology stack, partnering with engineering, infrastructure, and operations teams, to ensure reliable system performance and a high-quality user experience.

This is a high-visibility role that requires strong technical judgment, clear communication under pressure, and a bias toward action. You will play a critical role in improving system reliability while helping teams spend less time firefighting and more time building.

This role participates in a 24/7 operating model, including on-call responsibilities. We can only consider candidates for this position who are able to accommodate a hybrid work schedule and are close to our Charlotte, NC office.
