Developing and sustaining advanced services for our Kubernetes-based PaaS (e.g., Coder workspaces, Kueue batch scheduling, Knative serverless, Crossplane control planes)
Managing Kubernetes for production on-premises and cloud environments (e.g., AWS, Azure) with end-to-end responsibility for deployment, upgrade, patching, performance tuning, capacity planning, and backups/DR
Frequent full security patching of all layers of Kubernetes infrastructure while maintaining very high uptime
Ownership and engineering responsibility of production AWS and Kubernetes services
Identifying and resolving full Kubernetes stack engineering problems independently
Ensuring successful real-time analysis of telemetry data from space launch partners, such as SpaceX, United Launch Alliance (ULA), and Blue Origin
Providing after-hours support for Kubernetes infrastructure troubleshooting during launch events
Supporting scientists and engineers running applications in Kubernetes
Providing Linux expertise and troubleshooting
Evaluating and testing new products and technologies
Using code to enhance and automate operations
What You Need to be Successful
Minimum Requirements for Engineering Specialist:
Bachelor's degree in STEM, Computer Science, or other related sciences/engineering discipline
8 or more years of relevant experience directly related to developing and delivering complex large-scale distributed software systems solutions and technical products
Minimum of 5 years of experience supporting highly available enterprise environments, including maintaining system uptime and service availability targets
At least 2 years of hands-on experience managing existing Kubernetes environments, with responsibilities of deployment, upgrade, patching, and backups
Full ownership and engineering responsibility of production Kubernetes services, both on-premises and with Cloud Service Providers such as AWS and Azure
Ability to identify and resolve engineering problems independently
Experience in Linux systems administration, including configuration, for an enterprise environment
Strong understanding of networking and storage fundamentals
Experience automating repetitive tasks with scripting or DevOps tools
This position requires the ability to obtain a TS/SCI security clearance and polygraph, which is issued by the U.S. government. U.S. citizenship is required to obtain a security clearance.
In addition to the above, the minimum requirements for Senior Engineering Specialist include:
12 or more years of relevant experience directly related to developing and delivering complex large-scale distributed software systems solutions and technical products
8 years of experience supporting a highly available enterprise environment
Experience architecting and deploying secure
Benefits
Vision insurance
Additional Information
The Aerospace Corporation is the trusted partner to the nation's space programs, solving the hardest problems and providing unmatched technical expertise. As the operator of a federally funded research and development center (FFRDC), we are broadly engaged across all aspects of space, delivering innovative solutions that span satellite, launch, ground, and cyber systems for defense, civil, and commercial customers. When you join our team, you'll be part of a special collection of problem solvers, thought leaders, and innovators. Join us and take your place in space.
The Digital Innovation Division (DID) is accountable for integrating strategies, providing governance, and managing internal investments that form the foundation of Aerospace's digital innovation and transformation. The DID Mission IT pillar supports engineering teams across Aerospace by delivering top-tier IT engineering and IT support services tailored to meet the unique needs of our engineering community.
Mission IT Operations is seeking a skilled Site Reliability Engineer with deep expertise in Kubernetes, Linux, programming, and automation. In this role, you will be responsible for developing and maintaining both on-premises and cloud-based Kubernetes clusters that form the core of an overall Platform as a Service (PaaS), providing essential support to our engineering team.
As part of a multidisciplinary platform and infrastructure team, you will manage multiple Kubernetes clusters used for technical analyses such as space launch telemetry analysis and modeling and simulation, as well as for Artificial Intelligence (AI) Large Language Model (LLM) training and inference services. Collaborating closely with rocket scientists and engineers, you will contribute to the development of innovative solutions to complex challenges within the space enterprise, supporting critical national space assets. This position requires a strong sense of shared responsibility and ownership, working alongside cross-functional team members to achieve our mission objectives.
Work Model: This is a full-time position based in El Segundo, CA, which requires 100% onsite work.