Staff Software Engineer, Infrastructure

Withwisdom · Worldwide

Full-time · Remote · 2d ago

AWS · Datadog · Documentation · GCP · HIPAA · Node.js

About the role

The roadmap isn't handed to you here. You'll help write it - and you'll be the reason it stays up. As a Staff Software Engineer focused on Infrastructure at Wisdom, you'll set the technical direction for reliability across the company - and own the systems behind the systems: the deploy pipeline, the observability, the capacity controls, and the failure-handling that decide whether our agentic billing infrastructure quietly does its job or pages someone at 2am.

This is a force-multiplier role on a small, high-trust team. Your job isn't just to fix what breaks; it's to make the whole organization operate at a higher reliability bar - to build the practices, the guardrails, and the instincts that mean fewer things break in the first place, and the team can handle the ones that do without you in the room.

Wisdom's stack is TypeScript, Node.js, React, Postgres, and AWS, with LLM-driven agents (Mastra, Anthropic) making high-stakes billing decisions in production. The problems we're solving - keeping inconsistent insurance integrations alive, making AI pipelines fail safe instead of failing loud, running HIPAA-compliant infrastructure that genuinely can't go down - are legitimately hard. We'd rather have someone energized by making things not break than someone who merely tolerates being paged when they do.

In your first year, you'll have defined what reliability means at Wisdom and built the function to deliver it: a real observability and SLO practice, an incident process that runs without heroics, agentic pipelines that degrade gracefully instead of taking prod down with them, and a team that's measurably better at operating production because of how you've raised the bar. This is a fully remote role reporting directly to the Head of Engineering.

Responsibilities

Set the reliability strategy for the platform - SLOs, error budgets, the operating standards for services that bill real money for real practices, and the technical roadmap to get us there
Own observability end-to-end - tracing, metrics, logging, and alerting (Datadog) that surface problems before users do, not after - and make it the default so any engineer can lead an incident, not just the person who wrote the code
Harden the integration surface with dental insurance carriers and practice management systems (Dentrix, Eaglesoft) - poorly documented, inconsistent, and the first thing to buckle under load
Own deploy and release engineering - fast, safe, reversible deploys; infrastructure as code (Terraform); and the unglamorous discipline that lets a Series A ship many times a day without breaking things
Build the incident practice, not just lead incidents - the on-call rotation, the runbooks, the blameless post-incident culture, and the follow-up discipline that turns outages into permanent fixes the whole team owns
Raise the bar through others - set technical standards via code review, architecture guidance, and documentation that actually gets used, and level up how the entire engineering team reasons about reliability
Take on the ambiguous, undefined, company-level reliability problems and drive them to resolution without waiting for permission or a perfect brief

Requirements

8+ years running production systems, with a track record of operating at staff/principal scope - you've owned reliability for systems where downtime had real consequences and left them measurably better
You've operated at scale under pressure - services that had to stay up, incidents you led to resolution, and reliability practices you established that outlived your tenure and changed how teams worked
You multiply the people around you - your impact shows up in what others ship reliably, not only in what you touch directly. You've set standards, mentored engineers, and driven technical decisions across teams without needing the authority to mandate them
Deep AWS (or GCP) experience - you've deployed, op

Benefits

Dental insurance
Remote work options

Additional Information

About Wisdom

Wisdom blends industry expertise with advanced technology to make dental practices work better for everyone involved. We believe dentistry is about people, and we exist to make the future of dentistry stronger and more sustainable for dentists, their teams, and the patients they serve. We match administrative teams with expert billers and custom-built technology to take on the heavy lifting of dental billing while maximizing dentists' time in-office and their bottom line. Coming off a fresh $21M Series A round of funding, we are looking for exceptional candidates to help us build a category-defining company. Wisdom has employees across the US.
