Engineering Manager - Site Reliability Engineering at Xero

Reliability, Permanent, Full-time, Melbourne, AU
Posted a month ago

Xero is a beautiful, easy-to-use platform that helps small businesses and their accounting and bookkeeping advisors grow and thrive. 

At Xero, our purpose is to make life better for people in small business, their advisors, and communities around the world. This purpose sits at the centre of everything we do. We support our people to do the best work of their lives so that they can help small businesses succeed through better tools, information and connections. Because when they succeed they make a difference, and when millions of small businesses are making a difference, the world is a more beautiful place.


About the team

In Site Reliability Engineering (SRE), we drive and influence Xero to provide the most reliable experience for our customers. We are a global team based across New Zealand, Australia and the USA. We combine software and systems engineering to enable engineers across Xero to build and support products that are observable, stable, performant, tolerant to failure, and operate as intended in the face of varying conditions.

We help teams deliver a great customer experience through a better understanding of the behaviour and operation of their systems. We do this by working to maximise the impact of post-incident learning across the organisation, by engaging with teams through specialised reliability embedding and enablement, and by running SRE workshops and training.

We also enable engineers across Xero by developing, supporting and integrating a collection of proprietary and off-the-shelf tooling for incident management and response, incident analysis and learning, monitoring and observability, and resource ownership. We surface data and metrics, and provide detailed insights across operational health, production operations and developer productivity.

About the role

As an Engineering Manager at Xero, you’ll lead and inspire our SRE tooling teams, bringing a passion for production operations and the developer experience. You’ll be instrumental in driving innovation, fostering a collaborative and inclusive team culture, and ensuring the reliability, scalability, and performance of Xero's products and platforms.

We’re looking for a seasoned technical leader and systems thinker with experience in site reliability engineering or related disciplines such as platform engineering, CI/CD or internal developer tooling. Additionally, you’ll be familiar with the following SRE concepts:

  • Logging, monitoring and observability, including service level objectives (SLOs)
  • Incident management and response, including critical and high-severity incidents
  • Post-incident reviews, incident analysis and learning from incidents
  • Core reliability concepts such as capacity management and autoscaling, deployment and release safety, fault tolerance and graceful failure

What you’ll do:

  • Provide technical leadership and mentorship to our SRE teams, fostering alignment and collective success.
  • Lead and drive the delivery of cross-organisational reliability initiatives.
  • Collaborate with product teams, software engineers, and a wide group of stakeholders across Xero to define and achieve reliability goals.
  • Engage in productive challenge through curiosity and thoughtful questioning, manage different viewpoints and move critical company priorities forward.
  • Develop a team culture of ownership through role modelling, empowerment, continuous improvement, experimentation and feedback.
  • Analyse complex challenges, facilitate collaborative problem solving and navigate obstacles efficiently.
  • Drive engineering and reliability excellence both within SRE and across Xero.
  • Lift team and individual performance through coaching and mentoring, goal clarity, feedback and removing barriers.
  • Contribute to the technical and engineering strategy of Xero's reliability initiatives.

What you'll bring with you:

  • Previous experience in an Engineering Management role, with proven leadership skills in a technical environment
  • Successful track record of leading delivery of strategically important projects and initiatives
  • Strong technical experience with solid engineering fluency and background
  • Experience working in any of the following areas: site reliability engineering, platform engineering, CI/CD, internal developer tooling or product engineering
  • Experience leading teams that are operating production systems at scale
  • Strong understanding of the engineer experience and end-to-end developer toolchain
  • Ability to solve cross-organisation engineering challenges, including using influence rather than authority to enact change
  • Strong business acumen and stakeholder management capabilities to support global reliability responsibility across Xero
  • Understanding of human factors, safety science and resilience engineering

Why Xero?

You’ll do the best work of your life at Xero. We offer:

  • Very generous paid leave to use however you’d like (plus statutory holidays!)
  • Dedicated paid leave to care for your physical and mental wellbeing
  • An Employee Assistance Program to access mental health care for you and your family
  • Health insurance, life insurance and income protection
  • Wellbeing and sports programmes
  • Employee resource groups
  • 26 weeks of paid parental leave for primary caregivers
  • An Employee Share Plan
  • Beautiful offices, flexible working and career development
  • Many other benefits that reflect our human value