Senior Site Reliability Engineering Manager
About the role
Job Description
Building trusted markets - powered by our people
At Cboe Global Markets, we inspire our people to solve complex challenges together because what we do matters. We provide the financial infrastructure that powers the global economy. As a leading provider of market infrastructure and tradable products, Cboe delivers cutting-edge trading, clearing and investment solutions to market participants around the world. We're building meaningful ways to support professional and personal development while strengthening the trust we've earned as a global market leader. Our teams are empowered to share ideas, actively pursue them and bring on a challenge. As champions of internal mobility and access to opportunity, we encourage our people to "go for it" and equip our managers with the training to coach their teams to the next level. We strive to provide employees a safe space to network, share ideas and create opportunities. Sound like the place for you? Join us!
Role Overview
The Sr. Manager, Site Reliability Engineering (London) is an experienced leader responsible for overseeing a globally distributed team of SRE technologists with diverse skills ranging from software development to systems, network, application, and/or database management - with deep subject matter expertise in one or more of these disciplines. This role sits at the heart of Cboe's follow-the-sun support model for its US Global Trading Hours (GTH) markets. Based in London, the Sr. SRE Manager provides direct platform support for Cboe's European operations while also holding oversight responsibility for SRE staff across both the European and Asia-Pacific time zones, ensuring seamless, continuous coverage of Cboe's real-time low-latency trading platforms around the clock. The Sr. SRE Manager will play a key role supporting and providing guidance throughout the full project lifecycle to deliver operational requirements on schedule, drive strategy across multiple areas of the organization, and tackle complex problems that may lack clear or full strategic definition. This individual must be capable of applying their knowledge in a manner that builds stakeholder confidence and achieves desirable outcomes while preserving relationships.
Major Job Duties
Technical Leadership & System Availability: Provide technical leadership, support, and operational oversight to sustain resiliency and high availability of critical business operations across European and GTH market sessions. Monitor Cboe production, disaster recovery, and certification systems for issues. Troubleshoot and drive resolution of issues. Analyze and optimize performance of real-time trading platforms. Oversee daily system checks and ensure Cboe platforms and systems are operating as expected. Take direct action to resolve known issues as needed. Assist the build team to resolve build/deployment issues.
People Leadership & Team Development: Lead, mentor, and provide guidance to direct reports across the European and APAC time zones responsible for platform support. Delegate assignments to direct reports. Create and execute agile-based processes such as Kanban and Scrum to actively manage the workload of the team, ensuring task completion in support of business projects and internal customer timelines. Actively and intentionally connect direct reports to others within their team, department, and across the organization. Support training and development needs to create a best-in-class SRE team. Establish operational objectives, policies, and procedures. Interact regularly with management on matters concerning multiple functional areas, departments, and/or customers. Liaise with business associates, infrastructure engineers, software engineers, and Cboe management.
Platform Configuration Management & Project Oversight: Develop and manage operational initiatives to deliver tactical results. Translate functional plans into operational processes and guide execution, providing project management support for all updates applicable to platforms of responsibility. Provide for configuration management of new and existing trading platforms and support implementation of new features and functionality based on new business requirements. While the primary focus of this role involves support of bare-metal on-premises infrastructure, experience with cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes) is desirable. Monitor development activities, change management tickets, and evaluate their impact on Cboe Operations. Approve and execute daily change tickets assigned to Site Reliability Engineering. Organize testing of changes prior to deployment and work with software engineering to resolve systemic issues. Demonstrate knowledge of Compliance obligations impacting regulated platforms and work closely with Compliance staff to ensure incident triage, reporting, and remediation obligations are met.
Incident Response & Escalation Managemen