System Lead
External | S$108K–S$144K/yr | Contract | Unknown | Today
Information Technology
Responsibilities
- Team Leadership & People Management
- Lead and supervise a team, and oversee staffing for 24/7 operational coverage.
- Provide mentorship, coaching, and skill development for junior engineers.
- Conduct performance reviews, identify training gaps, and drive continuous improvement across the team.
- Act as the point-of-escalation for all operational matters.
- Data Centre Operations Management
- Ensure smooth day-to-day operation of all systems, networks, security tools, and DC facilities.
- Oversee all daily operations across data centre infrastructure, security, systems, and network domains to ensure efficiency, stability, and continuous service availability.
- Ensure compliance with Data Centre SOPs, operational policies, and security guidelines.
- Coordinate equipment movement, installation, decommissioning, and preventive maintenance activities.
- Review shift reports and incident logs, and ensure proper documentation for audits.
- Drive readiness, preventive maintenance, audit compliance, and operational excellence initiatives.
- Incident & Problem Management
- Act as the senior escalation point for critical incidents across systems, networks, and security technologies.
- Provide expert guidance on infrastructure issues including Windows/Linux servers, virtualization, storage, security appliances, and network technologies.
- Review RCA reports, lead problem management, and drive long-term remediation plans.
- Oversee complex change requests, service requests, patching cycles, and integration activities.
- Ensure alignment with ITIL processes and industry best practices.
- Oversee triaging and root cause analysis (RCA), and ensure timely closure of incidents and service requests.
- Ensure incidents, alarms and alerts are properly logged, categorised, prioritised, and tracked according to SLA.
- Liaise with customers, vendors, and internal stakeholders for critical incidents and troubleshooting.
- Systems & Network Operations Oversight
- Oversee monitoring and maintenance of servers, operating systems, and storage systems.
- Oversee network operations, including switches, routers, firewalls, VPN, and load balancers.
- Ensure routine patching, backup operations, and system health checks are carried out.
- Guide the team on best practices for authentication, authorization, encryption and configuration management.
- Vendor & Stakeholder Management
- Coordinate with external vendors for maintenance, replacement, and enhancement activities.
- Ensure timely follow-up on open tickets, service disruptions and preventive maintenance.
- Communicate operational updates, risks, and issues to management and customers.
- Compliance, Documentation & Reporting
- Ensure operational documentation (SOPs, checklists, incident reports, RCAs, inventory, access logs) is up to date and accurate.
- Drive audit readiness for ISO, IT security audits, and internal governance requirements.
- Prepare periodic operational reports and dashboards for management.
Job Requirements
- Education & Experience
- Diploma/Degree in Computer Science, IT, Engineering or related fields.
- 10 to 15 years of hands-on experience in IT and cloud infrastructure or data centre operations.
- Minimum 5 years' experience leading a technical team of Azure infrastructure engineers, an operations centre, or a shift-based environment.
- Technical Skills
- IT Infrastructure (Windows / Linux servers, virtualization, storage)
- Network technologies (L2/L3 concepts, routing, switching, firewalls)
- Azure Stack Hub / hybrid cloud environments
- Storage: S2D, SAN/NAS, Unity/VNX
- Windows HCI
- Kubernetes / SDN networking knowledge
- Security tools/products (e.g., BeyondTrust, SEPM, RSA, Palo Alto, Check Point, FortiGate, SafeNet)
- Data Centre operations (monitoring tools, backup operations, tape management, hardware handling)
- Backup operations / Tape library / DR procedures
- Incident management and ITIL framework
- Knowledge of authentication, encryption, access management concepts
- Key Competencies
- Strong leadership, communication, and stakeholder management skills.
- Proven ability to manage crisis situations, critical escalations, and high-severity incidents.
- Strong analytical and problem-solving mindset.
- Ability to develop team members, build processes, and drive operational excellence.
- Able to support 24×7 operations when required (escalation or major incidents).
Company: COMBUILDER PTE LTD