Staff Production Engineer

External
Canva · Melbourne, Australia
Full-time · On-site · 3d ago
Core Data · Java · Less · Linux · Observability

Requirements

  • Production at scale: Owned reliability work within large-scale distributed systems. When things broke, you wrote the fix, not the ticket.
  • Embedded delivery: Previously worked as an engineer embedded in or partnering closely with a product or feature team, not siloed in a platform org that throws tools over the fence.
  • JVM or systems depth: You've built real things in Java, Go, Rust, C++, or a comparable systems language at production scale; commercial depth, not academic familiarity. We're language-flexible for the right engineer, but you need to show up and win the technical duel in the first meeting.
  • Distributed systems in practice: Navigated sharding, replication, failure modes, consistency tradeoffs in real systems.
  • Debugging large codebases: Ability to parachute into an unfamiliar codebase, orient quickly, find where the problem actually lives, and fix it.
  • Influence without authority: Proven to have made things better in systems through wisdom and trust.
  • Technical knowledge
  • Networking depth: You know the network stack and what traffic looks like at scale.
  • Linux internals: Enough kernel-level understanding to reason about what's actually happening when a system misbehaves: process scheduling, memory, I/O, network stack.
  • Distributed systems patterns: Consistent hashing, leader election, consensus, backpressure, circuit breakers.
  • Observability tooling: You've instrumented systems for real, built the

Benefits

  • Remote work options
  • Flexible schedule

Additional Information

Join the team redefining how the world experiences design.

Hey, g'day, kia ora, 你好, hallo, vítejte! Thanks for stopping by. We know job hunting can be a little time consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Collingwood is home to our Melbourne campus - a vibrant, creative hub for connection and impactful work. While Sydney is home to our HQ, Melbourne brings its own unique vibe, with local artwork, lush greenery, and thoughtfully designed spaces to help you collaborate, focus, and feel part of a welcoming community. This role is based in Melbourne, and we're looking for someone who calls it home. Our hybrid way of working gives you the flexibility to work remotely, and to come together on campus for meaningful in-person collaboration and connection when it matters most.

What you'd be doing in this role

The Production Engineering team sits at the intersection of software engineering and the hardest reliability problems in Canva's infrastructure, writing software that changes how production behaves at 240M MAUs and growing. The strategic bet is a different model entirely: Canva's own take on what production reliability looks like, built for how we work. Senior software engineers are embedded long-term in the areas that carry the most technical risk, working shoulder to shoulder with product teams, close enough to the roadmap to shape how it lands in production before the problems compound. Not operationalising systems. Not running alerts. Writing software that changes how production behaves.

The engineers who do this work well have gone deep in systems most people only operate. They can walk into a codebase they didn't write, understand what's actually happening at scale, win the technical respect of the team they're embedded with, and then improve the software to make it more reliable, more efficient, and more resilient.
At the moment, this role is focused on:

  • Owning an engagement area: Taking long-term accountability for one of Canva's highest-risk technical domains (sharding core data stores, resource utilisation, distributed systems challenges at scale), embedded alongside the team that owns it.
  • Writing production software: The work is code, not process. Instrumenting, refactoring, rebuilding the pieces that cause problems at scale. You're a software engineer first; the reliability outcome at scale is what you're optimising for.
  • Collaboration: Opportunity to pair with, mentor, and learn from fellow production engineers.
  • Customer First: Striving for fewer incidents, faster recovery, lower severity, and latency that bends in the right direction. Taking pride in moving the needle on metrics that positively impact the quality of the customer experience.
  • Platform contributions: Where you see a pattern that needs a shared capability, you bring it back, not to own it indefinitely, but to seed the platform work that scales beyond your engagement.
  • Compounding at the system layer: One engineer who truly understands a system can change how every other engineer builds on top of it. That's the leverage in this role and why the archetype matters more than the domain.

What success looks like

As a secondee, developing a trusted relationship with your team and guiding them towards shipping at velocity, with more confidence and less toil.

You're probably a match

We'd love to hear from you if you fit one or more of these. You don't need to meet all of them, but the more the better, and if you join the team, we're invested in helping you grow.


Interested in this role?

Apply on the company's website.