Sr. Software Dev Engineer, AWS OpenSearch Service

External
Full-time | On-site | Today
Python, Java, Go, Rust, AWS, Machine Learning

Requirements

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one software programming language
  • 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
  • Experience as a mentor, tech lead or leading an engineering team
  • 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build pr

Additional Information

Imagine running search and analytics at any scale without thinking about clusters, capacity, or version upgrades. Where your queries return in milliseconds whether you're indexing a thousand documents or a hundred billion. Where the platform scales, heals, and secures itself so your team can focus on the questions, not the infrastructure. That's what we build at Amazon OpenSearch Service.

Amazon OpenSearch Service is the fully managed and Serverless AWS service that lets customers deploy, operate, and scale OpenSearch for log analytics, full-text search, application monitoring, observability, and AI-powered retrieval. We serve hundreds of thousands of customers running mission-critical workloads across every AWS region, and we are part of the AWS Database and Analytics organization.

Come join the OpenSearch Control Plane team. We are responsible for a service that:

  • Reliably manages a large fleet of cloud-native OpenSearch clusters and collections, freeing customers from sizing, scaling, and patching decisions
  • Guarantees high availability and durability for mission-critical search and observability workloads at the scale of the most demanding internet businesses
  • Orchestrates and automates the complete lifecycle of an OpenSearch cluster or collection - from creation through scale-up, scale-out, upgrade, replication, and fail-over

The OpenSearch control plane is not just any distributed system. It orchestrates a fleet of clusters across every AWS region, detects and recovers from node failures in seconds, and serves workloads that span petabytes of customer data. Most recently, we launched **OpenSearch Serverless NextGen** - delivering ultra-fast collection provisioning and true scale-to-zero capacity, so customers pay only for what they use and get a working environment in seconds rather than minutes. We are not done; this is one of many investments raising the bar for what customers can expect from a managed search service.

OpenSearch itself is a community-driven, Apache 2.0-licensed open-source search and analytics suite built on Apache Lucene. Since launching in July 2021, the project has delivered multiple major and minor releases advancing core search, vector search, and analytics. Our next phase focuses on transforming OpenSearch into an intelligent retrieval engine - one that meets the evolving needs of customers running search, analytics, and AI-powered workloads across structured, unstructured, and multi-modal data.

We work in Java, Rust, Python, and Go. We contribute upstream to OpenSearch and Apache Lucene. We are customer-obsessed, operate with the entrepreneurial pace of a startup inside one of the world's largest cloud providers, and value strong intuition backed by metrics.

**Operations is core engineering on this team, not overhead.** Engineers operate what they build, and we treat customer-impacting incidents with the same engineering rigor we apply to feature work. Reliability and Static Stability is a forcing function for design, not a phase after launch.

Key job responsibilities

  • Drive architectural decisions in a large-scale distributed system - multi-region orchestration workflows, multi-tenant control surfaces, lifecycle automation, scale-to-zero, regional failure isolation, and graceful degradation under partial outage
  • Invent and Simplify in equal measure - drive simplification of legacy designs as actively as you propose new ones; identify and eliminate accidental complexity across the team's surfaces; mentor others on simplifying-by-design
  • Treat operations as core engineering - own the systems you build end-to-end, lead incident response and RCAs with customer-first urgency, and drive operational improvements that compound across the team
  • Lead design and development of complex services across cluster management, capacity orchestration, scaling policies, query routing, security, and machine learning
  • Build and apply AI fluency across the software development lifecycle - decompose problems for AI assistance, ground prompts in relevant context, validate AI-generated output, and use AI to accelerate both code and beyond-code work
  • Mentor and grow engineers across the team; raise the bar for software development, operational excellence, and security
  • Set technical direction and partner with the broader OpenSearch organization to align solutions with customer needs and market opportunities
  • Contribute upstream to OpenSearch and Apache Lucene


Interested in this role?

Apply on the company's website.