System Engineer (Token Factory)
About the role
Token Factory is a part of Nebius Cloud, one of the world's largest GPU clouds, running tens of thousands of GPUs. We are building an inference platform that makes every kind of foundation model - text, vision, audio, and emerging multimodal architectures - fast, reliable, and effortless to deploy at massive scale.
Responsibilities
- Develop and optimize low-level kernels and runtime components for AI inference
- Improve the performance of inference engines on GPU platforms
- Profile and debug system-level and hardware-level performance issues
- Integrate support for new hardware architectures (Hopper, Blackwell, Rubin)
- Collaborate with ML and backend teams to optimize end-to-end execution
Required Qualifications
- Strong proficiency in C++, or expertise in GPU programming, with a focus on low-level, high-performance coding and memory management
- Experience in GPU programming or systems-level software development, e.g. operating system internals, kernel modules, or device drivers
- Hands-on experience with profiling and debugging tools to identify performance issues on both CPUs and GPUs, and the ability to optimize code based on those findings
- Solid understanding of CPU/GPU architecture and memory hierarchy
Requirements
- Experience with GPU programming: CUDA, ROCm, CUTLASS, CuTe, ThunderKittens, Triton, Pallas, Mosaic GPU
- Familiarity with ML inference runtimes (e.g. TensorRT, TVM)
- Knowledge of Linux internals, drivers, or compiler toolchains
- Experience with tools like perf, VTune, Nsight, or ROCm profiler
- Familiarity with popular inference engines (e.g. vLLM, SGLang, TGI)
- We conduct coding interviews as part of the process.
Benefits & Perks
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and work-life balance
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams
What it's like to work at Nebius
- Fast moving
- Bold thinking
- Constant growth
- Meaningful impact
- Trust and real ownership
- Opportunity to shape the future of AI
Equal Opportunity Statement
- Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire.
- If you need accommodations during the application process, please let us know.
Additional Information
About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI. Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.