Senior Software Engineer - Site Reliability
About the role
The Delivery Engineering team for Workday Data Platform is a team of engineers for engineers. We support a wide range of groups - backend, frontend, platform, performance and test engineers, and many more. We spend our time writing software and managing how we deliver, deploy, monitor, and everything in between. We are responsible for the infrastructure behind how we build, support, test, and ultimately deploy Prism. We are now building the next-generation Data Platform and Data Lake that powers Workday Data Platform. Our Data Platform is built on open standards - Apache Iceberg for the table format, Apache Polaris for catalog and governance, and Trino for distributed, interactive query. We are a multi-region team with a diverse set of skills, and this is an exciting area of growth at Workday.
You are a Senior Software Engineer who is focused on building reliable, scalable systems, software, and processes. You understand the importance of CI and software development lifecycles and the role they play in delivering software to customers. You dislike doing things twice, so you automate each step along the way. You want to make other engineers efficient and simplify how things are run.
Your Role
You have experience in designing, analyzing, and troubleshooting large-scale distributed systems built on technologies like Spark, YARN, Hadoop, Kubernetes, Polaris, Iceberg, and Trino.
You love to work in Unix/Linux from kernel to shell, file systems, client-server protocols, etc.
You have a strong coding background and can utilize various languages. We focus on building tooling and automation using Python, GoLang and Java.
You prefer building infrastructure and tooling in the cloud and using managed services where possible; we focus on AWS and GCP.
You package and deliver immutable services and functions, utilizing Docker, Kubernetes and Serverless frameworks (AWS Lambda, API Gateway).
You believe that everything should be repeatable, and you use orchestration, deployment, and Infrastructure as Code tools such as Terraform and Ansible.
You rely on CI/CD tools such as Jenkins, TeamCity, Bamboo, and Artifactory to automate build and delivery pipelines.
You utilize and help create meaningful metrics and alerts using technologies like Prometheus and Grafana.
You've worked with JVMs and have debugged and tuned them in the past.
About You
Basic Qualifications
8+ years of experience in software development engineering, architecting, building, and scaling robust and efficient software systems.
5+ years of coding experience, with the ability to utilize various languages (we focus on building tooling and automation using Python, GoLang and Java).
Bachelor's degree in Computer Science, Engineering, or related discipline, or equivalent practical experience.
MS in Computer Science or related field and 3 years of relevant experience, or BS in Computer Science or related field and 5 years of relevant experience.
Other Qualifications:
Experience in designing, analyzing, and troubleshooting large-scale distributed systems built on technologies like Spark, YARN, Hadoop, Kubernetes, Apache Polaris, Apache Iceberg, and Trino.
Experience building infrastructure and tooling in the cloud and using managed services where possible; we focus on AWS.
Working knowledge of building immutable services and functions utilizing Docker, Kubernetes and Serverless frameworks (AWS Lambda, API Gateway).
Working knowledge of building highly available, scalable, reliable multi-tenanted big data applications on Cloud (AWS, GCP) and/or Data Center architectures.
Workday Pay Transparency Statement (For EU Locations Only)
Listed below is the base salary range applicable to this position. Workday pay ranges (and the precise pay offered to the successful candidate) are based on a number of objective criteria such as relevant experie