Senior AI Infrastructure & Networking Engineer

Genesis Networks Pte Ltd · Kaki Bukit Avenue 1, Singapore
S$48K–S$55K/yr · Full-time · Posted today
Information Technology

Responsibilities

  • AI Fabric Architecture & Deployment: Design, build, and optimize high-throughput, ultra-low-latency East-West compute networks using NVIDIA Spectrum-X Ethernet platforms (Spectrum-4 ASICs) and/or NVIDIA Quantum-X800 InfiniBand switching.
  • Performance Tuning for Lossless Networking: Configure and fine-tune critical Layer 2/3 lossless transport mechanisms, including Remote Direct Memory Access over Converged Ethernet (RoCE v2), Priority Flow Control (PFC), Explicit Congestion Notification (ECN), and DCQCN.
  • Rail-Optimized Topologies: Implement and maintain non-blocking, multi-plane, full fat-tree network topologies mapped to 8-GPU server architectures to maximize collective communication performance via NCCL (NVIDIA Collective Communications Library).
  • SmartNIC & DPU Management: Deploy and manage high-speed compute network interfaces, including ConnectX-8 SuperNICs (800 Gb/s) and BlueField-3 DPUs for isolated infrastructure management, storage acceleration, and multi-tenant security.
  • Full-Stack Orchestration & Automation: Drive infrastructure-as-code deployments using Ansible and Terraform. Initialize and monitor the NVIDIA Network Operator within core Kubernetes orchestration layers.
  • Telemetry & Validation: Utilize deep network telemetry tools such as NVIDIA NetQ and "What Just Happened" (WJH) to stream real-time switch diagnostics. Conduct line-rate cluster benchmarking using ib_write_bw and ib_write_lat to eliminate physical layer bottlenecks.

Required Technical Skills & Qualifications

  • Education: Bachelor's or Master's degree in Computer Science, Network Engineering, Systems Engineering, or a related technical discipline.
  • AI Networking Expertise: Proven track record of configuring RoCE v2, adaptive routing, and traffic optimization specifically for machine learning/HPC workloads.
  • Hardware Familiarity: Deep understanding of high-density scale-up and scale-out systems (NVIDIA HGX/DGX architectures, PCIe switching, OSFP/QSFP112 optical and copper assemblies).
  • Software & Cluster Management: Experience with cluster deployment suites like NVIDIA Mission Control, Base Command Manager, Run:ai, or similar enterprise MLOps frameworks.
  • Routing Protocols: Strong proficiency with advanced datacenter networking protocols, particularly eBGP IPv6 unnumbered underlays and EVPN/VXLAN overlays for multi-tenant isolation.
  • Cabling & Layer 1 Validation: Experience managing complex structured fiber trunking (MPO-12/MPO-24 APC) and executing layer-1 diagnostics (ibdiagnet, iblinkinfo).

Preferred Certifications

  • NVIDIA Certified Professional - AI Networking (NCP-AIN) (Highly Preferred)
  • NVIDIA Certified Expert - Cloud End-to-End Fabric (NCE-CEF)
  • Advanced networking tracks from major vendors (e.g., CCIE, JNCIE, or Nokia Service Routing Architect) combined with proven data center fabric experience.

Benefits

  • Opportunity to work with first-of-its-kind, world-class AI supercomputing technologies (NVIDIA Blackwell Ultra).
  • High-impact role shaping the foundational architecture for enterprise generative AI and large-scale LLM initiatives.
  • Competitive salary, comprehensive benefits package, and continuous learning paths for advanced AI operations certifications.

Additional Information

We are seeking an expert Senior AI Infrastructure & Networking Engineer to lead the architecture, deployment, and optimization of our next-generation AI Factory. In this role, you will build and scale high-density GPU supercomputing clusters (512+ nodes) featuring NVIDIA Blackwell Ultra B300 systems. You will bridge the gap between heavy physical infrastructure (liquid cooling, busbar power) and advanced logical fabrics, ensuring predictable, line-rate, lossless transport for massive generative AI training and reasoning workloads.

