Sr Platform Engineer-1

External
Flexential · Denver Corp, CO
$150K–$165K/yr · Full-time · On-site · Today
Ansible · ArgoCD · AWS · Azure · Bootstrap · CI/CD

Benefits

Health insurance · Vision insurance

Additional Information

Job Description:

The Senior Platform Engineer is a hands-on engineering role on a platform development team responsible for building and operating Flexential's IT platforms, including observability, DevOps, ITSM incident and release management, and integration technologies. This role develops and manages critical platform subsystems for high availability, operational resiliency, security, and scalability, utilizing native-AI enablement for all outcomes. This is an individual contributor role with significant technical ownership and direct impact on the critical Flexential technology roadmap. You will work across infrastructure, automation, and application layers: deploying Kubernetes workloads, authoring Terraform modules, building Ansible playbooks, and building GitLab pipelines that other engineers depend on daily.

Key Responsibilities and Essential Job Functions:

- Design, develop, and operationally manage automated, resilient, highly available, self-healing, secure platforms with native-AI capabilities for IT needs, serving both internal and customer business capabilities.
- Develop and manage the Observability OpenTelemetry Central Backend Stack: Grafana Enterprise, Mimir, Loki, Tempo, and Alertmanager on Kubernetes/RKE2 via Helm and GitLab CI/CD.
- Build and manage IaC and CI/CD for automated provisioning and deployment, including Terraform modules for infrastructure/VM/storage provisioning, Ansible AWX playbooks for OS/app bootstrap, and ArgoCD and Helm for Kubernetes configuration.
- Develop and manage the OpenTelemetry Prometheus scrape profile library, including SNMP exporters, REST API exporters, and cloud provider exporters (CloudWatch, Azure Monitor, GCP) for multiple device classes.
- Develop AIOps capabilities on platforms, e.g. for observability use cases: anomaly detection integrations, event correlation rules in Alertmanager, and synthetic monitoring patterns to reduce alert noise.
- Configure and maintain Zabbix auto-discovery: network range scanning, device classification, and Prometheus service discovery integration.
- Build and harden Edge Stack deployments (Prometheus + OTel collector) per data center site using GitOps templates.
- Integrate Alertmanager with ServiceNow: webhook routing, ticket enrichment, auto-close logic, and escalation policy configuration.
- Maintain platform security: Conjur/CyberArk secret injection at runtime, mTLS between stack components, RBAC in Grafana Enterprise.
- Author and maintain Grafana dashboards in JSON/GitLab: facility overview, network health, RED metrics, application telemetry.
- Mentor mid-level engineers, lead code reviews, and establish engineering standards for the team.
- Represent platform engineering in cross-functional architecture reviews and executive-level program updates.
- Perform other duties as required and assigned.

Required Qualifications:

- DevOps / Automation - 5+ years in a production environment: Kubernetes (RKE2/k3s), Helm chart deployment, system services, Docker/container.
- LGTM Stack Development and Configuration - 4+ years: Grafana, Mimir, Loki, Tempo configuration, tuning, dashboarding, and production operations; Prometheus required.
- Senior-level Python / Scripting frameworks - 5+ years: automation scripts, exporter development, GitLab pipeline scripting, REST API integrations.
- GitOps / CI/CD - 5+ years: GitLab CI/CD pipeline authoring; Terraform and Ansible as primary IaC tools; ArgoCD or Flux preferred.
- AIOps / Observability Engineering - 2+ years: Alertmanager rule authoring, anomaly detection integration, event correlation, noise reduction techniques.
- Working infrastructure (Linux/VM) management knowledge - 5+ years: Linux administration, VMware vCenter/VCF experience, NetApp storage management, network fundamentals (SNMP, TCP/IP).
- Secrets Management - 2+ years: CyberArk/Conjur, HashiCorp Vault, or equivalent; runtime secret injection patterns.
- Minimal travel may be required.

Preferred Skills:

- Experience with and/or knowledge of ITSM processes and workflow automation, e.g. Incident & Response Management (IRM), release management, ServiceNow ITSM integration, alert routing, escalation policy design, SLA-driven on-call workflows.
- Hands-on experience or working knowledge of Boomi integration PaaS (iPaaS) technologies.
- Experience working with BAS/BMS systems in a data center/OT environment.
- Hands-on experience working with AWS products in a Well-Architected Framework and multi-account model to develop various compute, storage, and network IaaS and PaaS services for IT applications.

Base Pay Range: Annualized/Hourly salary range offered for this position is estimated to be $150,000 - $165,000. However, the actual pay range depends on each candidate's experience, location, and qualifications.

Not meeting every single requirement? No problem! We are looking for candidates who possess unique skills that set them apart from the rest. If you're enthusiastic about this role and believe you have the skills and abilities that would make you suc


Interested in this role?

Apply on the company's website.
