Define and own the multiyear service roadmap (landing zones, guardrails, automation, observability, DR, FinOps) and prioritize investments that advance reliability, security, and cost efficiency.
Set SLO/SLA policy and institutionalize SRE practices (error budgets, reliability guardrails, automated remediation) through operating mechanisms and dashboards.
Establish and approve reference architectures, templates, and IaC standards; ensure adoption across teams and vendors.
Direct cloud governance (AWS Organizations/SCPs; Azure Management Groups/Policy; tagging & cost allocation) and audit adherence.
Define the strategy for AI‑driven automation and observability (e.g., Bedrock, Azure AI Foundry) and sequence use‑cases that deliver measurable impact.
Service Delivery & Operations
Oversee onboarding of new client environments (landing zones, identity, networking, policy‑as‑code, baselines); approve readiness criteria and cutover.
Be accountable for day‑to‑day operations outcomes (patching, backups/restore, capacity, performance, monitoring/alerting) delivered by teams and vendors, with clear KPIs and review cadences.
Govern ITIL 4 processes (incident, change/CAB, problem, request) with targets for MTTR, change success rate, and RCA quality.
Ensure a 24/7 support model (runbooks/playbooks, on‑call rotations) is funded, staffed, tested, and reviewed via post‑incident governance; mandate chaos engineering and DR test schedules with executive readouts.
Champion integration of AI agents into operations; approve guardrails for automated remediation and anomaly detection.
Security & Compliance
Ensure defensible controls aligned to NIST 800‑53, ISO 27001, SOC 2, and (where applicable) FedRAMP, with continuous control monitoring and evidence management.
Oversee vulnerability management, detection/response, log aggregation, key/secret management, and PAM; set risk thresholds and remediation SLAs.
Direct continuous compliance via AWS Config, Azure Policy, CIS Benchmarks, Well‑Architected/CAF; review exceptions and drive closure.
Partner with Security/Compliance leadership on audits, findings, and corrective action plans.
Govern secure AI workload deployment and ensure adherence to emerging AI governance frameworks with NIST/ISO/FedRAMP alignment.
IaC Leadership
Make IaC the default operating model and hold teams accountable for compliant, automated delivery.
Set direction and approve architectures for Terraform/Bicep/Pulumi/CloudFormation/CDK; ensure secure automation and traceability.
Ensure CI/CD pipelines (GitHub Actions, Azure DevOps, AWS CodePipeline) implement policy‑as‑code, scanning, and segregation of duties.
Establish standards, reusable modules, and best practices; sponsor enablement and coaching programs for teams.
Champion AI‑assisted IaC validation and enforcement to accelerate secure deployments.
Observability
Be accountable for end‑to‑end observability (CloudWatch, Azure Monitor, Datadog/Prometheus/Grafana, APM/trace, SIEM) with tiered standards by criticality.
Approve golden signals and SLO dashboards; require auto‑remediation where feasible and review trend reports.
Direct AI‑driven observability strategy for predictive alerting and noise reduction; track alert fatigue and burn rate.
FinOps & Commercial Management
Own cost governance through budgets/forecasts, rightsizing, reservations/savings plans, anomaly detection, and showback/chargeback executed by FinOps and engineering.
Establish and enforce tagging & cost allocation policy; publish executive reports and drive optimization guardrails that protect reliability and security.
Track unit economics (cost‑to‑serve per workload/tenant) and improve over time through design standards and lifecycle management.
Stakeholder & Vendor Management
Build trusted relationships with executives, business leaders, application owners, and client stakeholders; run QBRs and roadmap reviews tied to outcomes.
Manage cloud provider and vendor partnerships; escalate issues and influence provider roadmaps and support in line with your strategy.
People Leadership
Lead and develop a multi‑disciplinary organization (ops, engineering, admins, PMO, security, ITSM); recruit, coach, set goals, and plan for succession.
Foster a culture of accountability, automation‑first, and co
Additional Information
As Director of Secure Cloud Hosting, you will own strategy, delivery, and continuous improvement for client-facing managed cloud services across AWS and Azure. You'll lead a global team to mature services in reliability, security/compliance, and cost efficiency, build trusted relationships with executives and application owners, and drive the transformation to Infrastructure as Code (IaC) as the default operating model through standards, reference patterns, and coaching.
You are responsible for ensuring that key outcomes (system reliability, security integrity, compliance readiness, client satisfaction, and cost optimization) are achieved through effective governance of teams and vendors.