At SafetyCulture, we’re solving problems that fundamentally change the way people work and positively impact the world.
We’re looking for talented people to help us build a world-class engineering team. A recent funding round gives us an AU$1.3Bn valuation, and the funds will be used to evolve the product into a communication and collaboration platform. We’re innovating and expanding into sensors, IoT, and telematics for fleets. We’re facing interesting technical challenges as we scale, and have an ambitious goal of 100 million workers using our products every day.
The Role
As a Site Reliability Engineer at SafetyCulture, you’ll help to design, build and run resilient systems. You live and die by Murphy’s Law, knowing that anything that can go wrong will go wrong at the worst possible moment. You will help to foster a culture of designing for, and expecting, failure in production systems - a culture where learning and knowledge-sharing are expected.
You love to solve sticky cross-service and cross-domain problems, and have a passion for identifying root causes in complex scenarios. You understand how important it is for teams to analyze past incidents. Most importantly, you are a team player, are excited about the prospect of working in a fast-paced, demanding environment, and understand that learning happens at the edge of the comfort zone.
How you can have an impact
As one of a core team of experienced SREs, you will shape and mature the culture, define the processes that the development teams will follow, and help the business scale to millions of users. You’ll coach and educate your engineering colleagues on systems reliability and fault-tolerance best practice, identify gaps in existing systems and come up with remediation plans. You’ll improve metrics such as MTTR and MTTF, and promote a culture of sustainable incident response and blameless post-mortems. We encourage involvement in the community, open source work, attending talks and events, and experimenting with new technologies.