Senior Software Engineer, Observability

CoreWeave · New York, NY
$139K–$220K/yr · Full-time · On-site · 1w ago
Elasticsearch · Grafana · Helm · Kafka · Kubernetes · Microservices

Responsibilities

  • Join CoreWeave's Observability team, responsible for building the systems that give our customers and internal teams unparalleled visibility into complex AI workloads. Our team empowers engineers to understand, troubleshoot, and optimize high-performance infrastructure at massive scale.

Requirements

  • 5+ years of experience in software or infrastructure engineering with a focus on designing, building, and operating large-scale distributed systems in production.
  • Proficient in Go or Python with experience writing clean, testable, and resilient production code.
  • Hands-on experience with Kubernetes, containerization, and microservices architectures in production environments.
  • Proven ability to design and deliver scalable, robust systems with high-quality code, automated testing, and progressive release strategies.
  • Skilled in decomposing complex problems in distributed architectures into manageable, well-scoped work.
  • Familiar with Helm and YAML-based configurations for deploying and managing services, including templating, automation, and infrastructure-as-code practices.
  • Experience participating in on-call rotations for critical production systems.
  • Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, or related field.

Preferred

  • Experience designing, operating, or scaling logging, metrics, or tracing platforms (e.g., Loki, ClickHouse, Elasticsearch, Prometheus, VictoriaMetrics, Grafana, Thanos).
  • Familiarity with data streaming systems for observability pipelines (e.g., Kafka, Kafka Connect).
  • Experience automating infrastructure provisioning using tools like Terraform.
  • Knowledge of OpenTelemetry for unified telemetry collection and instrumentation.
  • Exposure to modern AI workloads and GPU-based infrastructure, including large-scale training and inference.
  • You love building systems that provide deep visibility into complex, high-scale environments.
  • You're curious about observability, telemetry, and platform performance at massive scale.
  • You're an expert in distributed systems and engineering resilient, scalable software.

Why CoreWeave?

  • Be Curious at Your Core
  • Act Like an Owner
  • Empower Employees
  • Deliver Best-in-Class Client Experiences
  • Achieve More Together

Benefits

The range we've posted represents the typical compensation range for this role. To determine actua…

  • Vision insurance
  • Equity / stock options
  • Performance bonus

Additional Information

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com.


Interested in this role?

Apply on the company's website.
