Own the operational health of the data platform: availability, performance, resilience, and recoverability (including backup/restore and DR readiness where applicable).
Establish and maintain operational rhythms: health checks, operational dashboards, SLOs/SLIs, alert tuning, and routine maintenance.
Improve "engineering hygiene" across environments (dev/test/prod), ensuring consistent, repeatable deployments and minimal manual steps.
Support, incident management & problem elimination
Act as a primary point of contact for platform support and business-as-usual (BAU) issues, triaging tickets/incidents and coordinating fixes across Data Engineering and Platform Engineering.
Lead incident response for data-platform issues (including stakeholder comms), run post-incident reviews, and drive preventative actions.
Build and maintain high-quality runbooks and operational documentation.
Contribute hands-on code/config changes where needed to enable platform reliability and developer productivity (for example: deployment automation, environment provisioning, operational tooling).
Maintain technical awareness of new and relevant technologies and recommend improvements where they materially increase reliability, security, or efficiency.
Cost, capacity & vendor/tooling support
Monitor platform usage and costs; identify optimisation opportunities and support capacity planning with the Data Platform Manager.
Support tooling and vendor management activities (evaluations, renewals, usage reporting) with pragmatic operational input.
Knowledge, skills and experience required
Our core tech stack includes Python, TypeScript (Node.js and React), AWS, Kubernetes, Datadog, GraphQL, PostgreSQL, MongoDB and Kafka.
We're looking for someone who brings:
Strong hands-on production experience operating cloud platforms and data systems (mid-level/senior individual contributor), including incident response and operational ownership.
AWS depth (e.g., building/operating data workloads and services on AWS) and comfort navigating common data-adjacent services; experience operating Glue/Lambda/Step Functions/S3/Athena-style components is a plus.
Solid engineering skills in Python (and comfort reading/modifying services or automation code), plus strong SQL.
Experience with CI/CD and infrastructure automation (e.g., Terraform and build/release tooling), with a bias for repeatability and auditability.
Practical experience with observability (monitoring/alerting/logging/tracing), ideally including Datadog, and a track record of improving signal quality (not just adding alerts).
Working knowledge of containers/Kubernetes operations and the patterns required for safe production change.
Familiarity with data platform building blocks (batch + streaming): orchestration, ETL/ELT, data modelling basics, and streaming concepts (Kafka familiarity is a plus).
Strong documentation and stakeholder skills: able to explain incidents, risks and trade-offs clearly to both technical and non-technical partners.
Comfort working across distributed teams, proactively managing dependencies and keeping work moving without constant oversight.
Benefits
Health insurance
Vision insurance
Additional Information
Recruiter for this role:
Andre Braz
As a Data Operations Data Engineer, your purpose is to keep our data platform reliable, secure, observable and cost-effective, and to help it evolve in a controlled, well-governed way. You'll be the hands-on operational counterpart to the Data Platform Manager: owning the day-to-day platform "running" work (support, incident response, release readiness, operational hygiene) and driving pragmatic improvements that reduce routine work and improve developer experience.
This role blends DataOps / Platform Ops / Reliability Engineering for data: you'll sit close to the engineering teams, understand how pipelines and services behave in production, and ensure we have the right tooling, standards, automation and runbooks to operate confidently at scale. This complements the wider engineering assurance and operational excellence expectations across the organisation.