Senior AIOps Engineer, Incident Response [Remote-US]

Quanata · Worldwide

$215K–$280K/yr · Full-time · Remote · 3w ago

AWS · Confluence · Documentation · Incident Response · Jira · Leadership

About the role

Quanata is on a mission to help ensure a better world through context-based insurance solutions. We are an exceptional, customer-centered team with a passion for creating innovative technologies, digital products, and brands. We blend some of the best Silicon Valley talent and cutting-edge thinking with the long-term backing of leading insurer State Farm. Learn more about us and our work at quanata.com.

Our Team

Quanata, LLC is an insurance technology innovation company that engineers advanced risk prediction and prevention solutions, develops risk-focused acquisition capabilities, and builds and supports a full-stack, flexible, digital, and increasingly AI-native insurance platform. This helps our primary clients, State Farm and HiRoad Assurance Company, adapt to evolving market needs. Quanata, LLC is wholly owned and funded by State Farm.

As a company that prioritizes an inclusive and positive culture, we believe the core of our success is in hiring talented people, across disciplines, who want to help us make a quantifiable impact.

We're looking for an experienced production operations and reliability leader to help evolve Quanata's operational support model through AI-driven automation and intelligent agent workflows. This role will own production health, incident response, and operational reliability while partnering closely with engineering and AI orchestration teams to improve scalability, reduce operational toil, and accelerate issue resolution.

This is a highly collaborative role for someone who enjoys solving complex production problems, improving systems at scale, and helping modernize operations through AI-native tooling and automation.

Your Day-to-Day

- Own production health, reliability, and operational support processes across critical systems and services
- Lead incident response efforts, stakeholder communication, root cause analysis, and post-incident reviews
- Identify patterns in production issues and drive improvements to reduce recurring incidents and operational overhead
- Design and implement AI-driven agents and workflows that automate support and operational tasks
- Partner with engineering, product, and AI orchestration teams to improve system resilience and operational efficiency
- Build and maintain operational runbooks, documentation, and knowledge base content for both human and AI-assisted workflows
- Support observability, monitoring, and troubleshooting efforts across cloud-based production environments
- Participate in on-call rotations and continuously improve operational readiness and response processes

About You

- 6-8 years of experience in production operations, site reliability engineering, technical support engineering, or similar operational roles
- Strong background in incident management, root cause analysis, and production system troubleshooting
- Experience working within modern SDLC, DevOps, and change management environments
- Familiarity with operational tooling such as Jira, Confluence, and observability/monitoring platforms
- Strong analytical and problem-solving skills with the ability to identify trends and drive operational improvements
- Comfortable working cross-functionally with engineering, product, operations, and leadership teams
- Strong communication skills and ability to operate effectively in fast-moving technical environments
- Bachelor's degree in Computer Science, Engineering, or equivalent relevant experience

Bonus Points

- Experience building or working with AI/LLM-powered systems, intelligent agents, or workflow automation tools
- Familiarity with cloud platforms such as AWS and modern observability ecosystems
- Experience with event-driven architectures, orchestration frameworks, or operational automation platforms
- Background leading operational transformation or reliability improvement initiatives
- Passion for AI-native operations, automation, and improving developer/support experiences

Salary: $215,000 to $280,000*

*Please note that the final salary offered will be determined based on the selected candidate's skills and experience, as well as the internal salary structure at Quanata. Our aim is to offer a competitive and equitable compensation package that reflects the candidate's expertise and contributions to our organization.

Additional Details:

Benefits: We provide a wide variety of health, wellness and other benefits. These include medical, dental, vision, life insurance and supplemental income plans for you and your dependents, a Headspace app subscription, monthly wellness allowance and a 401(k) Plan with a co

Benefits

Health insurance · Dental insurance · Vision insurance · 401(k) · Flexible schedule · Performance bonus

Additional Information

To help keep everyone safe, we encourage all applicants to take care to protect themselves during their job search. When applying for a position online, you are at risk of being targeted by malicious actors looking for personal data. Please be aware that we will only reach out via email using the domain quanata.com. Any message that does not come from that domain should be ignored and considered a security risk.

Interested in this role?

Apply on the company's website.
