Engineers (SRE) - Observability at Xero

Reliability, Engineering, Permanent, Full-time, Melbourne, AU
Description

Xero is a beautiful, easy-to-use platform that helps small businesses and their accounting and bookkeeping advisors grow and thrive. 

At Xero, our purpose is to make life better for people in small business, their advisors, and communities around the world. This purpose sits at the centre of everything we do. We support our people to do the best work of their lives so that they can help small businesses succeed through better tools, information and connections. Because when they succeed they make a difference, and when millions of small businesses are making a difference, the world is a more beautiful place.


About the team

In Site Reliability Engineering (SRE), we drive and influence Xero to provide the most reliable experience for our customers. We are a global team based across New Zealand, Australia and the USA. In SRE at Xero, we combine software and systems engineering to enable engineers across Xero to build and support products that are observable, stable, performant, tolerant to failure, and operate as intended in the face of varying conditions.

We strive to maximise the impact of post-incident learning across the organisation to improve the reliability and robustness of the Xero platform, while providing enablement and training across observability, reliability engineering, incident management and service ownership.

We also enable engineers across Xero by developing, supporting and integrating a collection of proprietary and off-the-shelf tooling for incident management and response, incident analysis and learning, monitoring and observability, and resource ownership. We surface data and metrics, and provide detailed insights across operational health, production operations and developer productivity.

About the roles

We are currently seeking Engineers within our Observability team in Site Reliability Engineering (SRE). Our team builds and implements sophisticated monitoring and remediation toolsets to support best-in-class observability, reliability, operational excellence and engineering productivity at Xero. In these roles you will have the opportunity to leverage your technical experience to drive and contribute to team deliverables and also broader SRE and Xero initiatives.

As a member of our Observability team, you will help enable and empower Xero engineering teams to improve their engineering practices by a combination of the following:

  • Contribute to the delivery of projects aligned with team goals, solving ambiguous problems with innovative solutions.
  • Build and maintain tools that reduce toil in managing our monitoring and logging platforms.
  • Provide guidance around tooling, standards and practices in monitoring, tracing, logging and observability.
  • Support and enable our product teams in troubleshooting issues around observability support systems.
  • Advocate for continuous improvement of systems and processes within the team, and across the organisation.
  • Make it easier for engineering teams to achieve a high standard of system awareness and reliability, so they can create more efficient, scalable and reliable applications for Xero's customers.
  • Take part in on-call duties, including incident management and response, troubleshooting efforts, conducting post-incident reviews and learning from incidents.

Must-have requirements:

  • Experience working to improve operational outcomes for software systems in production environments.
  • Good working knowledge of reliability and observability concepts and practices. Knowledge and experience building and/or monitoring distributed systems and microservices is a plus.
  • Proficiency in one or more programming languages such as C#, JavaScript, Golang or Python.
  • Experience with cloud platforms, Linux, Docker, Kubernetes, IaC and CI/CD tools.
  • Experience in instrumenting multi-team, distributed web applications and integrating with monitoring solutions like New Relic, Datadog, Dynatrace, SignalFx, Scalyr, Sumo Logic, Splunk or OpenTelemetry.
  • The ability to engage and build relationships within the team and with internal stakeholders.

Why Xero?

Xero offers very generous paid leave to use however you’d like (plus statutory holidays!), dedicated paid leave to care for your physical and mental wellbeing, and an Employee Assistance Program to access mental health care for you and your family. We also offer health insurance, life insurance and income protection, wellbeing and sports programmes, employee resource groups, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices, flexible working, career development, and many other benefits that reflect our human value. You’ll do the best work of your life at Xero.