Developing and supporting critical monitoring systems.
Helping external teams instrument and improve observability.
Maintaining a large knowledge base and how-to guides for consumers of our platform.
Helping to set a direction for the company's monitoring strategy and reducing the operational burden for our team and company.
Influencing the team's roadmap and quarterly plans, balancing platform needs, requests from users, and long-term company goals.
Requirements
Must be a US Citizen as the team has responsibilities in FedRAMP environments.
Bachelor's degree + 12 years of related experience, or Master's degree + 10 years of related experience, or PhD + 8 years of related experience.
A mix of experience in Software Engineering, Site Reliability Engineering, and Observability.
Experience supporting customers in a SaaS environment.
Excellent communication skills and a history of collaborating effectively with cross-functional teams.
Experience in crafting and delivering software as a service and working with cloud infrastructure services such as AWS EC2, S3, Kubernetes, etc.
Experience working with multiple cloud providers (such as AWS, GCP, and Azure).
Experience implementing multiple types of observability (logs, metrics, events, telemetry) for large-scale software deployments.
Experience with some mix of Linux, Docker, Kubernetes, Golang, Python, Terraform, Prometheus, Splunk, DevOps or SRE concepts, time-series databases, and distributed computing paradigms.
Experience supporting large-scale, distributed, or business-critical systems.
Experience as a technical leader at a team or project level, including guiding design decisions and driving execution.
Experience working within developer platforms or infrastructure platform teams.
This is a remote role, but if you happen to live near a Splunk / Cisco office, you're welcome to work from the office as often as desired.
Why Cisco?
We are Cisco, and our power starts with you.
Message to applicants applying to work in the U.S. and/or Canada:
The starting salary range posted for this position is $174,700.00 to $253,400.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and l
Benefits
Dental insurance
Vision insurance
401(k)
Remote work options
Equity / stock options
Parental leave
Additional Information
The application window is expected to close on 06/30/2026. The job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.
This position is fully remote and can be performed from any location within the United States. The role requires a reliable internet connection and the ability to work independently in a remote environment.
Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.
Meet the Team
The team is responsible for the platform that Splunk engineers use to implement metrics and tracing telemetry in both their customer-facing and internal services. The platform is a suite of microservices running in the cloud. The team requires a combination of Software Development and Site Reliability Engineering experience.
We're looking for an accomplished engineer to bring their experience and expertise in Observability, Software Development, and Site Reliability Engineering to evolve how Splunk developers observe their services.