Senior Site Reliability Engineer responsible for defining SLO frameworks, shaping observability architecture, and leading post-incident reviews to ensure the reliability of a telehealth platform. This role requires 6+ years of experience in SRE, DevOps, or backend roles with production ownership. The ideal candidate will have strong cloud and infrastructure-as-code experience, hands-on experience with SLOs, SLIs, and error budgets, and solid observability experience.
Job Description
About Dispensed
At Dispensed, we are passionate about empowering individuals to reach their full potential by supporting better health outcomes. We believe that access to innovative and alternative therapies can transform lives. Through our telehealth platform, we facilitate patient access to affordable, efficient, and reliable alternative medicine services across Australia, New Zealand, and the UK.
About The Role
Dispensed delivers prescriptions and clinical consultations to patients across Australia, New Zealand, and the UK, and the reliability of that platform is not an abstract engineering concern: when it degrades, patients lose access to healthcare. As a Senior SRE, you will own the operational health of a platform in active transition, consolidating a legacy Django system into a modern Next.js and Supabase architecture on AWS, and you will have genuine influence over how reliability is designed into that new foundation from the start. This is a role where you will define SLO frameworks, shape observability architecture, and lead the kind of post-incident work that produces lasting systemic change rather than tactical patches. If you want to work at the intersection of serious engineering craft and meaningful patient outcomes, and to build practices that a growing team will rely on for years, this is that role.
What You'll Own
- Define and maintain SLO and error budget frameworks across multiple services, working directly with product engineers to make reliability expectations concrete and actionable rather than aspirational.
- Design and evolve the observability architecture across the platform, ensuring the engineering team has genuine insight into system behaviour during the Django-to-Next.js migration and beyond.
- Identify systemic gaps in monitoring, alerting, and incident response before they surface as patient-facing incidents, and drive the work required to close them.
- Lead post-incident reviews that go beyond immediate fixes, producing changes to architecture, runbooks, on-call processes, or delivery practices that reduce the likelihood and impact of recurrence.
- Write infrastructure-as-code and automation that sets a quality bar for the team, reviewing infrastructure contributions from product engineers and junior SREs with direct, specific feedback.
- Keep product engineering teams unblocked on reliability concerns by being a visible, proactive partner in delivery: attending design conversations, raising reliability risks early, and pushing back constructively when decisions create patient risk without a conscious trade-off.
- Improve how the team operates on reliability over time, including on-call processes, reliability review checkpoints in the delivery cycle, and the quality of documentation product engineers use to understand what is expected of their services.
What You’ll Need
- 6+ years in SRE, DevOps, or backend roles with production ownership.
- Experience operating and improving reliability of distributed, customer-facing systems.
- Strong cloud and infrastructure-as-code experience (AWS, Terraform, or similar).
- Hands-on experience with SLOs, SLIs, and error budgets.
- Solid observability experience (metrics, logging, tracing).
- Experience leading incidents and post-incident reviews that drive systemic change.
- Strong scripting/programming skills (e.g. Python, Go, TypeScript).
- Ability to identify risks early and influence cross-team engineering decisions.
- Clear communication and documentation skills.
Nice to Have
- Experience supporting system migrations or major architectural changes.
- Experience in regulated or high-availability environments.
- Experience improving on-call practices or mentoring engineers.
Benefits & Perks
- Work from anywhere in Australia. 🌍
- A competitive salary and awesome benefits package. 💰
- A supportive and positive work environment. 🌟
- Opportunities to grow and develop your career. 📈
- Opportunity to transform lives through alternative medicine. 💡