Site Reliability Engineer

External

Thinkbrg · Remote

$130K–$160K/yr · Full-time · Remote · Today

AWS · Azure · CI/CD · Datadog · GCP · GitHub

Responsibilities

Design, implement, and maintain scalable and reliable systems in cloud environments such as Azure Cloud Services.
Work with CI/CD platforms (GitHub Actions, GitLab CI).
Provide operational support for full-stack software applications.
Increase system resilience with expert-level coding, bulletproof release, and change management skills.
Develop service-level indicators and objectives to automate release validation.
Improve automation and increase the system's self-healing capability.
Collect operating system data and report performance metrics to stakeholders.
Ensure security best practices are followed in cloud infrastructure and application deployments.
Manage cloud and database system maintenance, debugging production issues as they arise.
Improve reliability, quality, and time-to-market of our suite of software solutions.
Partner with security and product teams to define and publish policies, processes, and playbooks to facilitate rapid and effective handling of alerts and incidents.
Lead incident management processes; respond to outages and service disruptions promptly.

Requirements

Bachelor's degree in computer science or similar field.
Five years' experience as a site reliability engineer or in a similar role.
Strong programming skills (Golang, Ruby, Python, or similar).
Proven ability to diagnose and monitor performance and reliability issues across the stack.
Expertise in Kubernetes.
Relevant industry certifications, such as the Site Reliability Engineering (SRE) Foundation certification.
Proven experience working with cloud-native infrastructure (Azure Cloud Services, AWS, or GCP).
Experience working with observability and incident management tools (Datadog, OpsGenie, PagerDuty).
Experience scripting operating system tasks with Infrastructure as Code.
Impeccable communication skills.
Ability to problem-solve in a fast-paced, high-stakes environment.
Candidates must be able to submit verification of their legal right to work in the United States without company sponsorship.
Salary: $130,000–$160,000

About BRG

BRG combines world-leading academic credentials with world-tested business expertise and purpose-built emerging technologies. Our culture centers on agility and connectivity, which sets us apart and gets you ahead.
At BRG, we don't just show you what's possible. We're built to help you make it happen.

Benefits

Health insurance

Additional Information

We do Consulting Differently

Second Sight Solutions, a subsidiary of Berkeley Research Group (BRG), is a health technology company, and our innovative technology reimagines how drug discount data is exchanged, establishing new connections and improving transparency for drug manufacturers and their customers. Our customers and partners trust us to deliver reliable, first-to-market solutions and safeguard the data we receive.

We trust our employees, and our culture gives them the freedom to create, collaborate, and grow. Our leaders are industry experts, creative, unafraid to challenge the status quo, and the pioneers of market-changing solutions.

We are seeking a Site Reliability Engineer to design, build, and maintain highly available systems and infrastructure. The SRE will work closely with software developers and operations teams to improve system reliability, automate processes, and minimize downtime.
