Senior Site Reliability Engineer

Latitude · Pittsburgh, PA
$179K–$269K/yr · Full-time · On-site · 1mo ago
AWS · CloudFormation · Elasticsearch · GCP · Incident Response · Kubernetes

Responsibilities

  • Build monitoring to ensure our platform is healthy and its reliability measurable
  • Build alerting and a set of runbooks to enable faster detection and remediation of platform issues
  • Debug complex issues that span multiple components of the stack, and ensure proper fixes are implemented so the issues do not recur
  • Participate in an on-call rotation and a culture of continuous improvement through blameless postmortems
  • Design and implement components of the platform to enable features that make the work of our customers possible, simpler and more efficient
  • Build Kubernetes controllers to automate operations

What you'll need to succeed

  • Bachelor's degree in Computer Engineering, Computer Science, Electrical Engineering, Robotics or a related field and 4+ years of relevant experience (or Master's degree and 2+ years of relevant experience, or PhD)
  • Fundamental understanding of Linux operating system internals, TCP/IP networking, and storage subsystems
  • Hands-on development in Go or Python to create robust software that can run reliably in production
  • Strong experience scaling and securing services in the cloud (AWS, GCP) or cloud native environments
  • Experience using infrastructure-as-code principles to automate the creation of infrastructure resources (e.g. Terraform, CloudFormation)
  • Experience authoring and maintaining Kubernetes Controllers in Go
  • Experience running Kubernetes and related core components in a large-scale, production environment
  • Experience with metrics (e.g. Prometheus), logging (e.g. Elasticsearch, Loki) and tracing (e.g. Jaeger, Tempo) systems
  • Understanding of engineering design limitations and ability to provide guidance to teams to scale their services to achieve desired performance within budget
  • A focus on increasing service reliability through defining and adhering to SLOs
  • Strong communication skills and the ability to work effectively in a diverse and distributed team

What we offer you

  • Competitive compensation packages
  • High-quality individual and family medical, dental, and vision insurance
  • Health savings account with available employer match
  • Employer-matched 401(k) retirement plan with immediate vesting
  • Employer-paid group term life insurance and the option to elect voluntary life insurance
  • Paid parental leave
  • Paid medical leave
  • Unlimited vacation
  • 15 paid holidays
  • Daily lunches, snacks, and beverages available in all office locations
  • Pre-tax spending accounts for healthcare and dependent care expenses
  • Pre-tax commuter benefits
  • Monthly wellness stipend
  • Adoption/Surrogacy support program
  • Backup child and elder care program
  • Professional development reimbursement
  • Employee assistance program
  • Discounted programs that include legal services, identity theft protection, pet insurance, and more
  • Company and team bonding outlets: employee resource groups, quarterly team activity stipend, and wellness initiatives

Learn more about Latitude's team, mission and career opportunities at lat.ai.

Candidates for positions with Latitude AI must be legally authorized to work in the United States on a permanent ba

Benefits

Health insurance · Dental insurance · Vision insurance · 401(k) · Paid time off · Equity / stock options · Performance bonus · Parental leave

Additional Information

Latitude AI (lat.ai) develops automated driving technologies, including L3, for Ford vehicles at scale. We're driven by the opportunity to reimagine what it's like to drive and make travel safer, less stressful, and more enjoyable for everyone. When you join the Latitude team, you'll work alongside leading experts across machine learning and robotics, cloud platforms, mapping, sensors and compute systems, test operations, systems and safety engineering, all dedicated to making a real, positive impact on the driving experience for millions of people.

As a Ford Motor Company subsidiary, we operate independently to develop automated driving technology at the speed of a technology startup. Latitude is headquartered in Pittsburgh with engineering centers in Dearborn, Mich., and Palo Alto, Calif.

Meet the team

As a Site Reliability Engineer on the team, you will be responsible for helping to build and run these mission-critical systems. Through the implementation of monitoring and automation, you will constantly ensure the health, reliability, scalability, and performance of the platforms. The Site Reliability team interacts with engineering teams including ingest/data processing, mapping, labeling, triage, machine learning (detection, prediction, tracking), motion planning/control, offline simulation, and release/deployment teams to provide uniform service observability and incident response.

