Operations Engineering Manager, Fleet Reliability
Responsibilities
- Build and lead a 24/7 team of process-oriented, reliability and observability-focused engineers.
- Lead the socialization and documentation of clear and consistent processes for provisioning, validating and troubleshooting nodes in our server fleet.
- Think critically about and advocate for process and automation improvements, with event-driven automated remediation as the end goal.
- Provide a 24/7 engineering support function for high-criticality, time-sensitive node delivery and maintenance.
- Drive and improve our program of onboarding, documentation, enablement, and performance management to help your team members achieve new heights of personal growth and capability.
- Drive the culture and tone for how your team keeps score both in how they communicate with and support each other and how they enable the rest of CoreWeave.
Requirements
- You have seven or more years of experience in software or infrastructure engineering, including at least two years in a leadership capacity.
- You have a background that includes the knowledge and practice of SRE fundamentals, incident management, blameless culture, observability, and change management.
- You believe in the value of automation and will champion practices that improve reliability and drive the adoption of cross-team processes and tooling.
- You love helping people on their journeys to become their best selves and are comfortable extending the range of your influence to partners, peers, and senior leadership.
Why CoreWeave?
- Be Curious at Your Core
- Act Like an Owner
- Empower Employees
- Deliver Best-in-Class Client Experiences
- Achieve More Together
To fulfill our obligation to protect client data, successful applicants offered employment with CoreWeave will be required to complete a basic criminal record check, conducted in compliance with GDPR. Employment offers are conditional upon receiving satisfactory check results.
Benefits
Additional Information
CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com. We're proud to be a Living Wage accredited Employer.