
Solution Architect - (AI Infrastructure & Hybrid Cloud)

Gruve · Pune, India
Full-time · On-site · Posted 1mo ago
Ansible · AWS · Azure · BGP · CI/CD · Compliance

Responsibilities

  • Architect and Design Enterprise OpenShift Solutions: Lead the high-level design (HLD) and low-level design (LLD) for multi-tenant Red Hat OpenShift and Kubernetes clusters across on-prem and hybrid cloud environments.
  • Define the technology stack, standards, and blueprints for deploying AI solutions across global, multi-region public clouds (AWS/Azure/GCP) and diverse on-premise hardware.
  • Oversee the successful end-to-end rollout of critical services including AI SOC, OpenShift AI, and AI-based Cybersecurity Log Optimization.
  • Drive Network DevOps Strategy: Define and standardize the automation roadmap using Ansible, Terraform, and Python to achieve "Zero-Touch" infrastructure provisioning and configuration.
  • Lead Customer & Stakeholder Engagement: Act as the primary technical consultant for global clients, leading design workshops, architecture validation, and executive-level technical reviews.
  • Optimize Hybrid Infrastructure: Oversee the seamless integration of OpenShift with physical networking (Cisco ACI, VXLAN) and virtualized platforms (RHEL-V, VMware ESXi).
  • GPU & Hardware Orchestration: Design and manage hardware acceleration using the NVIDIA GPU Operator and Node Feature Discovery (NFD). Implement Multi-Instance GPU (MIG) and time-slicing to optimize resource utilization across multi-tenant clusters.
  • Establish CI/CD Governance: Architect robust CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions) for infrastructure and application delivery, ensuring compliance and security are baked into the workflow.
  • Lead Observability & Reliability: Design comprehensive monitoring and logging architectures using Prometheus, Grafana, and the ELK stack to ensure 99.99% availability of cluster services.
  • Mentorship & Technical Leadership: Guide and mentor L2/L3 engineers, providing expert-level escalation support and establishing best practices for the DevOps and Network teams.
  • Innovation & R&D: Evaluate and introduce emerging technologies such as Advanced Cluster Management (ACM), Advanced Cluster Security (ACS), and Cloud-Native Networking (OVN-Kubernetes).

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 10+ years of progressive experience in systems architecture, infrastructure engineering, or network DevOps.
  • Expert/Architect-level proficiency in OpenShift (RHOS) and Kubernetes architecture in large-scale production environments.
  • Experience architecting hybrid-cloud OpenShift solutions involving AWS (ROSA), Azure (ARO), or Google Cloud.
  • Proven track record in Automation & IaC: Mastery of Ansible, Terraform, and Git-based workflows to manage complex infrastructures.
  • Deep understanding of Linux (RHEL) Internals: Mastery of kernel networking, storage drivers (CSI), and container runtimes (CRI-O).
  • Strong Network Background: In-depth knowledge of BGP, VXLAN, and EVPN as they apply to connecting containerized workloads to physical data center fabrics.
  • Experience in "Architecture as Code": Ability to develop and maintain compliance artifacts, design validation reports, and automated documentation.
  • Excellent Leadership Skills: Demonstrated ability to manage high-stakes customer relationships and lead cross-functional technical teams.
  • Certifications: Red Hat Certified Architect (RHCA), Red Hat Certifie

Benefits

Vision insurance

Additional Information

About Gruve

Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. We specialize in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs). Our mission is to help our customers use their data to make more intelligent business decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.

Position Summary

We are seeking a Solution Architect - (AI Infrastructure & Hybrid Cloud) to lead the strategic design, architecture, and deployment of large-scale, enterprise-grade Red Hat OpenShift and Kubernetes environments. As a technical authority at the L4 level, you will be responsible for defining the blueprint of our cloud-native infrastructure, ensuring it is secure, scalable, and highly automated.

The ideal candidate acts as the bridge between traditional infrastructure and modern DevOps, serving as the lead design authority for global clients. You will collaborate with Network and Firewall Architects to build a unified fabric where containerized workloads, legacy data centers, and hybrid cloud environments coexist seamlessly through advanced automation and Infrastructure-as-Code (IaC).

