Staff Infrastructure Reliability Engineer - Database & Storage
About the role
You will help lead the database and storage strategy at Redfin, including architecture, management, and access patterns.
You will lead complex technical discussions with a variety of audiences, including software and systems engineers and business leaders.
You will architect and lead implementation of cloud database and storage systems with a focus on reliability, observability, scalability, and security.
You will support large-scale, high-volume databases, both self-managed and specialized AWS managed offerings, including management activities such as upgrade, backup, recovery, and migration.
You will use and evangelize approved AI code generation tools to document, architect, and create code.
You will plan and participate in high availability and disaster recovery planning and drills.
You will lead incident resolution, including performing root cause analyses.
You will use your systems knowledge to promote scaling and performance for services across Redfin and some partner companies.
You will participate in an on-call rotation for about one week per month.
About You
7+ years of experience managing systems in AWS or a similar cloud environment, including compute and storage, with an emphasis on solution development and execution.
5+ years of experience with at least one, but preferably more, of the following: PostgreSQL or similar RDBMS; AWS Aurora/RDS; AWS S3; ElastiCache; OpenSearch; DynamoDB.
You have a proven history of architecting, building, scaling, and supporting cloud infrastructure technologies, specializing in database and storage services, and can communicate the direct business impact of this work.
You have extensive experience with Linux administration and Linux scripting, including Python script development.
You are an experienced mentor of other engineers with the ability to guide a team of engineers to identify and implement solutions to difficult problems.
You're committed to best practices that set your team up for long-term success, including infrastructure as code, configuration management tooling, and security practices.
Deep knowledge and professional use of at least one AI code generation tool, such as Anthropic Claude Code, GitHub Copilot, Cursor, or similar, to implement key efficiencies for cloud infrastructure.
You have excellent communication skills that allow you to connect with and influence everyone from your immediate team up through senior leadership.
You understand and can implement core reliability principles, including monitoring, alerting, and incident management.
What you'll get
Our team members fuel our strategy, innovation, and growth, so we ensure the health and well-being of not just you, but your family, too! We go above and beyond to give you the support you need on an individual level and offer all sorts of ways to help you live your best life. We are proud to offer eligible team members perks and health benefits that will help you have peace of mind. Simply put: We've got your back. Check out our full list of Benefits and Perks.
Redfin is revolutionizing the $75 billion real estate industry. We use data, beautiful software, and innovative design to put customers first at every step in the home-buying and selling process. Get ready to dive headfirst into our award-winning website and mobile apps, solving complex business problems in a highly visible, customer-centric way. If you value doing great work in a collaborative environment, join our team!
This job description is an outline of the primary responsibilities of this position and may be modified at the discretion of the company at any time. Decisions related to employment are not based on race, color, religion, national origin, sex, physical or mental disability, sexual orientation, gender identity or expression, age, military or veteran status or any other charact