Automated deployments and configurations are, in *principle*, the sort of mitigation I had in mind for some of this: they aid rollback, redeployment, or failover if an upgrade gets wedged.
But I'm no SRE, so a lot of this, alas, remains theoretical to me.