Senior AI Platform Engineer (Agentic IDP)
About the role
As a Senior Platform Engineer, you will sit at the intersection of infrastructure reliability and developer experience. You will not only maintain the scalability of our Kubernetes-based cloud environment but also work collaboratively on the shift toward a self-service platform model. In this role, you will apply a software engineering mindset to infrastructure, leveraging Agentic AI to automate complex workflows and building the "Golden Paths" that eliminate manual bottlenecks. You aren't just managing clusters; you are architecting the software that manages them, ensuring our systems are resilient, cost-optimized, and inherently self-healing. This position will be hybrid out of our Toronto office.
Responsibilities
- IDP Architecture & Development: Lead the design and implementation of an Internal Developer Platform that abstracts infrastructure complexity, providing developers with self-service capabilities for environment scaffolding and deployment.
- Collaborative Infrastructure: Work with the team to maintain and scale multi-region Azure/AKS environments using Terraform and ArgoCD.
- Reliability Partnership: Collaborate with application engineers to implement deep observability via Datadog and establish meaningful reliability targets (SLOs).
- AI-Driven Automation: Build Python-based AI agents and tools that reduce team "toil" and streamline common operational tasks.
- Full-Lifecycle Ownership: Participate in on-call rotations and lead collective Root Cause Analysis (RCA), turning every incident into a platform improvement or automated fix.
- Mentorship & Standards: Contribute to team code reviews, document best practices, and help establish standard patterns for deployment and security.
What You Have
- Bachelor's/Master's degree in computer science, a related technical field, or equivalent practical experience.
- 5+ years of professional Software Development experience with a deep understanding of the full SDLC, system design, and clean code principles.
- 7+ years of experience in Platform Engineering with a focus on running large-scale production environments.
- Proficiency in Python or Go for automation, tool building, and AI agent integration.
- Experience managing Kubernetes (AKS) and Infrastructure as Code (Terraform/Terragrunt) in a team-based environment.
Requirements
- Proven track record of building or contributing to an Internal Developer Platform (IDP) (e.g., Backstage, internal CLI tools, or custom portals).
- Experience collaborating with application teams to define SLIs/SLOs and improve service reliability.
- Demonstrated ability to mentor junior engineers and lead team-wide technical initiatives.
- Experience implementing GitOps (ArgoCD) and Policy-as-Code (Kyverno) to standardize team workflows.
- Familiarity with building Agentic AI tools to automate repetitive operational tasks (toil).
- Expertise in Datadog for observability, dashboarding, and incident response.