Senior Production Engineer

CoreWeave · Warsaw, Poland
Full-time · On-site · Posted 3w ago
Ansible · AWS · Bash · Compliance · GCP · GDPR

About the role

You will assist in incident response efforts by helping identify and resolve service disruptions quickly while documenting root cause analysis (RCA) and post-incident reviews (PIRs). You will monitor system performance and health using tools like Prometheus and Grafana to identify potential incidents and implement automation to reduce manual intervention. This position involves collaborating across teams to improve platform reliability and resilience while refining incident response playbooks. As you gain experience, you will take on more complex responsibilities in incident management and system reliability.

Requirements

  • 5+ years of experience in cloud operations, site reliability engineering (SRE), or related technical roles.
  • Strong understanding of cloud platforms (e.g., Kubernetes, AWS, GCP) and cloud infrastructure.
  • Expertise in scripting or using automation tools such as Python, Bash, Terraform, or Ansible.
  • Good familiarity with incident management practices and frameworks like ITIL or SRE best practices.
  • Experience with monitoring and alerting tools including Prometheus and Grafana.
  • Strong communication skills with the ability to work in a fast-paced, high-pressure environment.

Preferred

  • Experience working with Kubernetes, containerization, and distributed systems.
  • Knowledge of change management processes and post-incident analysis.
  • Experience with automated systems or self-healing infrastructure.

  • You love to maintain the reliability and stability of high-scale cloud infrastructure.
  • You're curious about automation and process improvements to enhance incident detection.
  • You're an expert in cloud operations and incident management frameworks.

Why CoreWeave?

  • Be Curious at Your Core
  • Act Like an Owner
  • Empower Employees
  • Deliver Best-in-Class Client Experiences
  • Achieve More Together

To fulfill our obligation to protect client data, successful applicants offered employment with CoreWeave will be required to complete a basic criminal record check, conducted in compliance with GDPR. Employment offers are conditional upon receiving satisfactory check results.

Benefits

In addition to a competitive salary, we offer a variety of benefits to support your needs, including:

  • Health insurance
  • Equity / stock options
  • Performance bonus

Additional Information

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com. We're proud to be a Living Wage accredited Employer.

