Infrastructure Engineer
About the role
Our mission is to make the world programmable. Sight is one of the key ways we understand the world, and soon this will be true for the software we use, too. We're building the tools, community, and resources needed to make the world programmable with artificial intelligence.
Roboflow simplifies building and using computer vision models. Today, over 1M developers, including those from half the Fortune 100, use Roboflow's open source and hosted machine learning tools. That includes counting cells to accelerate cancer research, improving construction site safety, digitizing floor plans, preserving coral reef populations, guiding drone flight, and much more.
Roboflow is supported by great customers and investors, having raised over $63 million from Y Combinator, Google Ventures, Craft Ventures, Sam Altman, and Lachy Groom, among other leading software investors.
Roboflowers love building great things with passionate teammates. We value ownership, accountability, and a bias toward action, whether it's a big initiative or a small fix. You're naturally curious, hands-on with new tech (maybe you even played with ChatGPT or other AI products early on), and prefer to show your work over talking about it. Many of us have founder mindsets and thrive in Roboflow's high-autonomy environment; some even started as side hustlers in school.
Responsibilities
- Skillset
- Production experience with Kubernetes: Building and managing containerized applications at scale.
- Infrastructure-as-Code (IaC): Using Terraform, Helm charts, bash scripting, and Python to automate everything.
- Scale & Site Reliability: Operating, monitoring, and scaling large-scale applications (especially in ML/AI) in AWS and/or GCP.
- Development Skills: Proficiency in Node.js and Python, with the ability to collaborate with full-stack developers on designing and operating SaaS applications.
- ML/Big Data Ops: Hands-on experience with the infrastructure required for machine learning at scale (GPUs, Docker, Kubernetes) and familiarity with libraries like PyTorch or TensorFlow.
- CI/CD Automation: Experience with tools like GitHub Actions or Spacelift to build and deploy code efficiently.
- Pragmatic Security: Awareness of security best practices for cloud operations and how they can be applied to startup environments.
- AI-Native Engineering: Leveraging LLMs and AI tools to accelerate the development lifecycle, from writing and refactoring code to identifying security vulnerabilities and optimizing infrastructure costs.
- A Glimpse of Your Work
- No two days will be the same. Your tasks will be a blend of strategic projects and hands-on implementation. Examples include:
- Running and optimizing a high-availability machine learning inference service.
- Collaborating with customer security teams to ensure secure integration.
- Developing creative IaC solutions to scale our plat
Requirements
- Primarily, you like to make great things with passionate colleagues. You are someone who likes to own outcomes, not only inputs. You're motivated by having responsibility and accountability. You're eager to 'do the work,' big and small.
- You're curious and enjoy learning about new technologies, perhaps as an early tinkerer with MLOps products. You show more than you tell.
- You're motivated by the question, "How can I improve this?" and have a track record of doing so, even in ways adjacent to your role. Much of our current team is made up of former founders who thrive in the level of autonomy at Roboflow. Maybe you had a side hustle in high school or college.
Benefits