Staff Software Engineer in Hardware Infrastructure Observability
About the role
Nebius is looking for a Senior Software Engineer to join the Hardware Infrastructure Observability team. You're welcome to work from our office in Amsterdam. We build and run low-level monitoring for servers and data center engineering systems to ensure reliability at scale. We also design and operate maintenance and remediation systems that enable safe, predictable fleet-wide changes and keep the infrastructure healthy.
Responsibilities
- Design and develop services and agents that provide deep visibility into a large server fleet and DC engineering systems
- Evolve our metrics, aggregation, and alerting pipelines and improve signal quality
- Build maintenance workflows and automation that keep fleets healthy
- Investigate incidents hands-on (including on-host debugging) and drive root-cause fixes
- Collaborate with hardware, networking, and DC operations to improve reliability
We expect you to have:
- 5+ years of professional software engineering experience
- Excellent knowledge of Python and Golang, or readiness to switch to these languages quickly
- Strong Linux fundamentals
- Ability to write reliable code and dig into complex problems
- Working proficiency in English
It will be an added bonus if you have:
- Solid understanding of modern server architecture and its components
- Experience with Prometheus-compatible metrics, monitoring, and alerting stacks (such as VictoriaMetrics)
- Good knowledge of computer networks
- Experience designing, developing, and running high-load distributed systems
We conduct coding interviews as part of the process.
Benefits & Perks:
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and work-life balance
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams
What's it like to work at Nebius:
- Fast moving
- Bold thinking
- Constant growth
- Meaningful impact
- Trust and real ownership
- Opportunity to shape the future of AI
Equal Opportunity Statement:
- Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire.
- If you need accommodations during the application process, please let us know.
Additional Information
About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI. Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.