Cloud Infrastructure Developer
Please refer to the How to Apply for a Job (for External Candidates) job aid for instructions on how to apply. If you are an active McGill employee (i.e., currently in an active contract or position at McGill University), do not apply through this Career Site. Log in to your McGill Workday account and apply to this posting using the Find Jobs report (type Find Jobs in the search bar).

Position Summary:

The Canadian Centre for Computational Genomics (C3G) at McGill University builds open-source Research Data Management (RDM) solutions that support every stage of the genomics and health data lifecycle. Our portals, APIs, databases and tools are the infrastructure behind national genomics and health data sharing in Canada. Our projects include:

- The Pan-Canadian Genome Library (PCGL)
- The Terry Fox Marathon of Hope Cancer Centre Network (MoHCCN)
- The International Human Epigenome Consortium (IHEC)
- The Quebec COVID-19 Biobank (BQC19)

We also provide bioinformatics analysis software and high-performance computing services to the life sciences research community, including widely used analysis pipelines.

A Cloud Infrastructure Developer will manage and evolve the Kubernetes infrastructure powering the Pan-Canadian Genome Library (https://genomelibrary.ca/), Canada's national genomics data platform. The PCGL platform encompasses a Research Portal, a Data Access Committee (DACO) portal, a clinical and genomic data submission service, and supporting data infrastructure. All of this runs on an institutional cloud environment and serves researchers, clinicians, and data managers across Canada. This infrastructure handles sensitive genomic data where reliability, security, and compliance are non-negotiable. The incumbent will have significant ownership over the cluster's architecture and operations, working within a small, focused team rather than a large platform organization.
The incumbent will be expected to develop strong intuitions about what the infrastructure needs, proactively surface risks, and contribute meaningfully to infrastructure decisions in an environment where one's technical judgment shapes outcomes and close collaboration is the norm.

Under the supervision of the Data Team Lead, the Cloud Infrastructure Developer will deploy, manage, secure, and evolve the Kubernetes clusters supporting the PCGL platform on the SecureData4Health (SD4H) institutional cloud. This role works in close daily collaboration with the SD4H DevOps team, coordinating on infrastructure provisioning, network configuration, and cloud resource management. The Cloud Infrastructure Developer is responsible for ensuring high availability, reproducibility, and compliance with data governance requirements across all environments.

Primary Responsibilities:

- Manage day-to-day operations of production, staging, and development Kubernetes clusters, including upgrades, capacity planning, node management, and incident response.
- Design and implement infrastructure-as-code for cluster provisioning and configuration, ensuring reproducibility and auditability.
- Define and enforce network policies, RBAC, secrets management, and security hardening practices across cluster workloads.
- Manage storage solutions for genomic data, including persistent volume provisioning, backup and disaster recovery strategies, and data retention policies.
- Build and maintain CI/CD and GitOps pipelines for application deployment, ensuring smooth, low-downtime releases in collaboration with the development team.
- Monitor cluster health, set up alerting, and conduct post-mortems on incidents to continuously improve reliability.
- Maintain clear operational documentation and runbooks, and contribute to the team's knowledge of infrastructure best practices.
- Use an issue tracking system to document tasks, incidents, and their resolution status.
Other Qualifying Skills and/or Abilities

Hard skills:

- Demonstrated hands-on experience administering Kubernetes clusters in production environments is mandatory. Examples of past infrastructure work, either via a portfolio or references, are highly recommended.
- Proficiency with infrastructure-as-code and GitOps tooling (e.g. Terraform, Helm, Kustomize, ArgoCD, Flux, or equivalent).
- Experience with container runtimes and image management (Docker, Podman, or equivalent).
- Familiarity with networking fundamentals as they apply to Kubernetes (ingress controllers, CNI plugins, DNS, TLS).
- Experience with monitoring and observability stacks (e.g. Prometheus, Grafana, Loki, or equivalent).
- Undergraduate degree in computer science, engineering, systems administration, or a related field.

Soft skills:

- Ability to make sound infrastructure decisions collaboratively, contributing technical judgment clearly within a team context, while knowing when to escalate.
- Interest in developing and operating open-source solutions.
- Attention to detail, strong communication skills, and ability to work in a highly collaborative environment.
- Capable of managing