Site Reliability Engineering Manager – Reliability Operations & Hygiene
Lead the major incident, IT change, problem management, and operational hygiene functions, applying SRE thinking to improve reliability, reduce toil, strengthen automation, and increase resilience across critical services.
Day‑to‑day Tasks
- Provide day‑to‑day leadership for the Reliability Operations & Hygiene team (engineers/analysts and vendor resources), including workload prioritisation, coaching, quality of execution, and removal of blockers across incident, change, problem, and hygiene activities.
- Lead and coordinate high‑severity incidents when required, providing clear incident command, managing escalation and stakeholder communications, coordinating technical recovery, and ensuring post‑incident reviews produce actionable, owned follow‑ups.
- Own the effectiveness of IT change and problem management disciplines by improving change risk assessment, reducing ch...