SRE Developer
Responsibilities
- Own the reliability, availability, and performance of production systems in a containerized, microservices-based environment
- Monitor system health using Grafana dashboards, alerts, and observability tools; proactively identify and resolve issues
- Manage and operate Kubernetes clusters (via Rancher), including deployments, scaling, and troubleshooting
- Lead and participate in incident management using OpsGenie, including on-call rotations, escalations, and post-incident reviews
- Troubleshoot issues across application, infrastructure, messaging, database, and container layers
- Build and maintain automation scripts and tools using Bash, Go, and/or Python to improve operational efficiency
- Support and optimize CI/CD pipelines using GitLab, ensuring smooth deployment and release processes
- Collaborate with development teams to improve application reliability, performance, and observability
- Work with databases and data systems (MySQL, Redis) for performance monitoring and issue resolution
- Support distributed messaging systems such as Kafka and RabbitMQ
- Contribute to and maintain operational documentation, runbooks, and knowledge bases using Jira and Confluence
- Perform root cause analysis (RCA) and implement preventative measures
- Ensure systems operate in alignment with security, compliance, and data privacy standards
- Leverage AI-powered engineering tools to accelerate troubleshooting, documentation, and workflows
Requirements
What you need to be successful:
- 3+ years of experience in Site Reliability Engineering, DevOps, Production Support, or Systems Engineering
- Bachelor's degree in computer science or related field
- Hands-on experience with Grafana, Kubernetes and Docker
- Experience with OpsGenie for incident management and on-call coordination
- Strong experience with GitLab/Git, including CI/CD pipelines and release processes
- Proficiency with Atlassian tools (Jira, Confluence) for tracking and documentation
- Solid knowledge of MySQL
- Experience with Kafka and/or RabbitMQ
- Familiarity with Redis for caching and performance optimization
- Working knowledge of Temporal or similar workflow orchestration tools
- Strong scripting skills in Bash
- Proficiency in Go and/or Python for automation and tooling
- Familiarity with PHP applications (Symfony, Laravel) for production support
- Proven ability to troubleshoot complex systems across multiple layers
- Excellent documentation habits (runbooks, playbooks, system diagrams)
- Knowledge of FTC data protection principles
- Understanding of NIST frameworks and security best practices
- Familiarity with GDPR requirements (data handling, logging, retention, privacy)
As an equal opportunity employer, we celebrate diversity and are committed to creating an inclusive environment for all employees.
In this role you may be exposed to adult content.
Additional Information
Established in 2004, we are a tech pioneer offering world-class adult entertainment and games on some of the internet's safest and most popular platforms. With the support of an international team of dynamic and collaborative innovators, we are on a mission to enable safe user experiences and empower our communities by celebrating diversity, inclusion, and expression - all while maintaining robust trust-and-safety protocols.
We embrace the best of both worlds! Local talent can thrive in our collaborative office space with the flexibility of a hybrid work environment, while remote team members play an integral role in shaping our dynamic culture from afar. We have offices in Montreal (Quebec), Austin (Texas), and Nicosia (Cyprus). *A select number of positions require full-time in-office attendance*
We are seeking a highly skilled Site Reliability Engineer (SRE) to support and enhance the reliability, scalability, and performance of our production systems. In this role, you will be central to incident response, root cause analysis, and continuous improvement of operational processes while leveraging cutting-edge tooling and AI-assisted solutions.