Senior Site Reliability Engineer (SRE)

Tulip · Somerville, MA
$150K–$190K/yr · Full-time · On-site · 2w ago
Incident Response · Kubernetes · Observability · OpenTelemetry · Prometheus · TypeScript

Responsibilities

  • Mentor engineering teams and evangelize observability best practices, SLIs/SLOs, and reliability culture.
  • Contribute to and maintain Tulip's triage and remediation processes for both humans and AI.
  • Help architect our systems for growth and scale.
  • Implement internal tools to automate common developer tasks.
  • Perform incident response and debug production issues across the entire stack.
  • Design, build, and maintain the core infrastructure used by all of Tulip's engineering teams.
  • Work to automate detection and resolution of recurring issues.

Key Collaborators

  • Engineering team, Edge team, DevOps team, Hardware team

Working At Tulip

We know even great candidates experience imposter syndrome. Even if you don't match every requirement, applying gives you the opportunity to be considered.

We're building a strong, diverse team that values hard work, families, and personal well-being. Benefits of working with us include:

  • Direct impact on product and culture
  • Company equity
  • Competitive benefits package including Health, Dental, Vision, Short-term Disability, Long-term Disability, Life Insurance, AD&D Insurance, Flexible Spending Account (FSA), Commuter Benefits, Parental Leave, and 401(K)
  • Flexible work schedule and unlimited vacation policy
  • Virtual company events and happy hours
  • Fitness subsidies

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

Please note that we may use AI-based tools to support parts of our hiring process. All data processing is carried ou

Benefits

Health insurance · Dental insurance · Vision insurance · 401(k) · Paid time off · Flexible schedule · Equity / stock options · Parental leave

Additional Information

This role is located in Somerville, MA. We are a hybrid work environment and are in the office 3+ days per week.

Tulip, the leader in AI-native frontline operations, is helping companies around the world equip their workforce with composable, connected apps, leading to higher quality work, improved efficiency, and end-to-end traceability across operations. Tulip's cloud-native, no-code platform, powered by embedded AI, is driving the digital transformation of industrial environments through composable, human-centric solutions that go beyond disrupting the Manufacturing Execution System (MES) category.

A spinoff out of MIT, Tulip is headquartered in Somerville, MA, with offices in Germany, Hungary, Singapore, and Israel. Tulip has been recognized as a World Economic Forum Global Innovator, a 2024 Deloitte Technology Fast award winner, one of Energage's Top Workplaces USA, and one of Built In Boston's "Best Places to Work" and "Best Midsize Places to Work."

About You

  • You have experience building and maintaining stable infrastructure at scale.
  • You can reason about systems: their edge cases, failure modes, and life cycles.
  • You're excited about setting the technical agenda and coming up with novel, broad ideas.
  • You regularly keep up with the newest AI advancements in the realm of Observability & Monitoring, and experiment with emergent ways of working.
  • You can debug complex issues across the entire stack.
  • You're opinionated about the tools and frameworks that work best.
  • You enjoy building for other engineers as much as, if not more than, building for a customer.
  • You know what a good SLA looks like, and can teach others how to spot one.

What skills do I need?

  • You have 5+ years of experience working with open source Observability tools (e.g. LGTM stack).
  • You have hands-on experience instrumenting distributed systems using OpenTelemetry and managing metrics pipelines with Prometheus at scale.
  • You have hands-on experience developing and distributing Claude Skills, Gemini Gems, or any other generic AI processes, and are able to iterate on their efficacy.
  • You have experience working with time-series data, ideally using PromQL.
  • You can pick up new languages/frameworks with ease. We currently run Go and TypeScript services on Kubernetes.
  • You can communicate as well as you can code. You understand the value of discussion and work best in a team that champions clear and frequent communication.

