Manager, Infrastructure Engineering
About the role
We're looking for an Engineering Manager to lead our Infrastructure team, the group responsible for Upside's core cloud platform, networking, security, and foundational services. This team provides the compute, storage, networking, observability, and CI/CD capabilities that power every application and data product at Upside. You'll combine deep infrastructure experience with people-first leadership to drive reliability, security, and scalability across our platforms, while creating an environment where engineers can do the best work of their careers.

As an engineering leader at Upside, you gather the right information from those around you and take thoughtful risks so teams can ship high-quality, reliable software quickly. In this role, you will:

Lead with platform and company outcomes over local optimization. You lean in wherever critical infrastructure risk or opportunity exists, regardless of org boundaries, and prioritize the success of the broader platform and product ecosystem.
Create a safe, collaborative team environment. You name problems, invite open discussion, and help your team learn from incidents and change without blame.
Raise the bar. You identify and influence improvements in reliability, performance, security, and engineering standards across services and environments.
Provide clarity and empower autonomy. You ensure expectations, SLAs, and guardrails are clear so teams can move quickly and independently on a stable platform.

In addition to the leadership responsibilities above, you will:

Ensure our AWS environments are secure, cost-efficient, and production-ready, with strong foundations in IAM, networking, secrets management, encryption, and compliance-aware controls.
Lead the evolution of our CI/CD, observability, and platform tooling, making it easier and safer for product and data teams to ship, monitor, and operate services independently, and improving overall developer experience.
Drive operational excellence across reliability, scalability, and performance, including capacity planning, incident management, on-call health, and post-incident learning culture.
Define and track key platform health and cost metrics, such as availability, latency, error budgets, infrastructure cost per transaction, and change failure rates, using these insights to guide prioritization and continuous improvement.
Partner closely with Security, Data Platform, and Product Engineering teams to ensure infrastructure decisions enable reliable experimentation, user-centric product development, and scalable data and ML systems.
Oversee large-scale programs that span multiple teams, such as cloud modernization, multi-region or cell-based architecture, and foundational security or resilience upgrades.
Represent Infrastructure in cross-functional planning and technical forums, translating platform risks and opportunities into clear business tradeoffs and decisions.

Why You Should Apply

You aren't afraid to challenge the status quo when it makes the team and business better. You learn from those around you while utilizing data to advocate for informed change.
You thrive at the intersection of systems and storytelling, not only building robust solutions but also communicating their purpose, impact and rationale, so teams can experiment, iterate, and act confidently.
You care about building resilient systems that scale. You bring a mindset of continuous improvement, and know when to invest in observability, automation, or new infrastructure to reduce toil and improve outcomes for the team and end users.
You believe that pulling quality upstream starts with engineering. You champion best practices, encourage early testing and validation, and work closely with peers to build a culture of quality from the ground up.

Ideal Qualifications

Have 3+ years of experience managing infrastructure, SRE, or platform engineering teams, and at least 3 years of hands-on experience building and operating production systems in the cloud.
Are comfortable designing and reviewing solutions in AWS, including networking, container orchestration, storage, and IAM, and can make pragmatic tradeoffs between performance, reliability, and cost.
Are eager to integrate generative AI into engineering workflows to improve delivery speed, quality, and dev