Senior Cloud Operations Engineer
About the role
The Data Insights Capability Team manages an end-to-end ETL pipeline that bridges the gap from on-premise Operational Technology and Industrial Control Systems to our advanced AWS cloud environment. The pipeline processes mission-critical data, feeding into cloud databases, data visualization apps, and GenAI-powered analytics.
As the Senior Cloud Operations Engineer, you will play a critical role in the operational excellence of our hybrid deployments. You will oversee edge and cloud operations as part of the Data Insights team to ensure our platform remains resilient, secure, and ready for the mission.
What Success Looks Like:
- 30 Days: Successfully onboard, gain a deep understanding of our hybrid edge-to-cloud architecture, and take over routine patching and maintenance of the existing Kubernetes clusters.
- 60 Days: Enhance our GitHub Actions and CloudFormation pipelines with improved automation, while configuring and standing up the infrastructure for your first new customer facility deployment.
- 90 Days: Establish and instrument our core pipeline Service Level Objectives in Splunk, proactively identifying and optimizing data latency or reliability bottlenecks across the hybrid ecosystem.
Responsibilities
- Maintain, optimize, and scale our hybrid Kubernetes (RKE2) edge clusters to ensure maximum uptime, smooth local data extraction, and seamless scaling across customer facilities.
- Remediate and manage container security and base image vulnerabilities to harden our edge-to-cloud infrastructure against security threats and maintain strict compliance standards.
- Define, implement, and monitor pipeline Service Level Indicators and Service Level Objectives within Splunk to proactively identify bottlenecks and guarantee the health, freshness, and availability of our hybrid data streams.
- Integrate and automate testing, security scanning, and deployment stages into GitHub Actions to increase deployment velocity, eliminate manual errors, and shorten feedback loops.
- Architect, build, and deploy robust AWS CloudFormation stacks to provide highly available, repeatable, and scalable cloud database and data ingestion infrastructure.
- Lead and execute the technical configuration and validation of on-premise infrastructure and logical replication pipelines to accelerate the onboarding timeline and ensure zero data loss when standing up new customer facilities.
Requirements
What You Bring
Required Qualifications:
- 5+ years of experience in a DevOps, SRE, or Systems Engineering role managing production-grade infrastructure.
- Technical Skills:
- Strong experience managing Kubernetes environments (bonus points for edge/on-prem distributions like RKE2 or Rancher) and configuration management using Kustomize.
- Deep familiarity with AWS services, specifically infrastructure provisioning via CloudFormation and managing data/compute resources (EC2, Lambda, RDS, DynamoDB, Firehose).
- Hands-on experience configuring log aggregation and metrics dashboards using Splunk, CloudWatch, or similar tools.
- Practical experience with container security scanning, patching, and maintaining a secure software supply chain.
- Proficiency in Python, Node.js, or Ansible to write automation scripts and troubleshoot pipeline issues.
- Understanding of networking concepts required to support hybrid environments (e.g., local APIs connecting to OT systems, data replication to the cloud).
- Domain Knowledge:
- Demonstrated understanding of how to manage hybrid edge-to-cloud architectures, including how to securely automate and monitor data pipelines that bridge physical Operational Technology (OT) facilities with scalable AWS cloud environments.
- Travel: Ability to travel for a full week quarterly for team planning and client Program Management Reviews.
- Must be a U.S. Citizen and able to obtain a DoD NIPR
Additional Information
Simplesense builds, deploys, and sustains the Installation Resilience Platform that enables mission operators to rapidly adapt and respond. The Platform protects critical infrastructure from cyber attack while unlocking previously siloed information to monitor, diagnose, and improve response times to incidents. Our adversaries rapidly adopt the latest technology: we help defense users respond in kind.
Simplesense is a non-traditional defense contractor and prime on the Air Force's Installation Resilience Operations Command and Control (IROC) program, which is now expanding to five additional Air Force, Space Force, and Army installations from the one prototype installation, Tyndall Air Force Base. Our team combines over 100 years of direct mission experience solving hard problems with 50 years of technical expertise deploying DevSecOps, cybersecurity, and cloud infrastructure, giving us a deep appreciation for our customers' mission and end users' priorities. We build for scale, architecting and prioritizing technical work for long-term sustainability.
Senior Cloud Operations Engineer
Location: Denver, CO