DevOps Engineer (AI Inference)

Gcore · Poland, Cyprus
Full-time · Remote · Today
Ansible · Bash · CI/CD · DNS · Git · Grafana

Responsibilities

  • Design, develop, and maintain infrastructure for AI inference workloads, including GPU scheduling, model deployment pipelines, and data access patterns in on-prem environments.
  • Build and manage monitoring and observability tools for AI inference platforms, including dashboards, alerts, and runbooks for model health and system performance.
  • Collaborate with ML engineers and platform teams to design system architecture for AI workloads, integrate inference runtimes, and test performance at scale.

Requirements

  • Strong understanding of Kubernetes architecture, including CNI, CSI, operators, ingress/gateway, and control plane components.
  • Hands-on experience operating and troubleshooting production Kubernetes clusters.
  • Strong Linux and networking troubleshooting skills, including DNS, routing, firewalling, TLS, MTU, connectivity and performance issues.
  • Ability to develop automation and operational tooling using Python, Go, or Bash.
  • Experience with Terraform, Ansible, or similar IaC/configuration management tools.
  • Experience with VictoriaMetrics/Grafana or similar monitoring, alerting, and troubleshooting tools.
  • Strong experience with Git-based workflows and CI/CD pipelines.
  • Familiarity with Cluster API or similar Kubernetes cluster lifecycle management technologies.
  • Hands-on operation or administration of Slurm clusters.
  • Knowledge of Argo CD, GitOps workflows, Helm, or Helmfile.
  • Background working with managed platforms, PaaS, or cloud services.
  • Exposure to bare metal, GPU, HPC, or other high-performance computing environments.
  • Familiarity with the NVIDIA GPU stack, RDMA/InfiniBand, or high-performance networking.
  • Knowledge of OpenStack or similar cloud infrastructure platforms.
  • Hands-on experience developing Kubernetes operators or controllers.

Benefits

At Gcore, we want you to do your best work and enjoy the journey. Our benefits are designed to support your growth, well-being, and life beyond work:

  • Competitive compensation
  • Flexible working hours and hybrid or remote options, depending on your role
  • Work from anywhere in the world for up to 45 days per year
  • Private medical insurance for you and your family*
  • Extra paid vacation and sick leave days*
  • Support for life's important moments and celebrations
  • Language courses to help you connect and grow
  • Modern, welcoming offices with snacks, drinks, and entertainment*
  • Team sports and social activities*

*Benefits may vary depending on your location.

Equal Opportunity Employer

We provide equal opportunity to all applicants without regard to race, color, religion, sex, sexual orientation, age, gender identity, gender expression, national origin, disability, or any other legally protected characteristics.

Health insurance · Paid time off · Remote work options · Flexible schedule

Additional Information

As a DevOps Engineer, you will be responsible for designing, deploying, and maintaining infrastructure and services that enable scalable and secure AI inference workloads on-premises.

