After 25 days with no EC2 issues there have now been 3 failures in the last 24 hours, including one 10 minutes ago. At least one of sip1 or sip2 has been up at all times during this latest batch of outages, so people with SIP clients that support SRV records should not have noticed any issues.
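To illustrate why an SRV-aware client can ride through an outage of a single server, here is a minimal sketch of the selection logic. It is an illustration under stated assumptions rather than how any particular SIP client is implemented: the hostnames, priorities and the `is_reachable` check are hypothetical, and the weighted random choice that RFC 2782 describes for records of equal priority is simplified to a plain sort.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class SrvRecord(NamedTuple):
    """One DNS SRV answer: lower priority values are tried first."""
    priority: int
    weight: int
    port: int
    target: str


def pick_server(
    records: Iterable[SrvRecord],
    is_reachable: Callable[[SrvRecord], bool],
) -> Optional[SrvRecord]:
    """Return the first reachable record, trying the lowest priority first.

    Simplification: ties on priority are broken by highest weight instead of
    the weighted random selection described in RFC 2782.
    """
    for record in sorted(records, key=lambda r: (r.priority, -r.weight)):
        if is_reachable(record):
            return record
    return None  # every server is down


# Hypothetical records; these are NOT the real DNS entries for the service.
records = [
    SrvRecord(priority=10, weight=0, port=5060, target="sip1.example.org"),
    SrvRecord(priority=20, weight=0, port=5060, target="sip2.example.org"),
]

# Simulate sip1 being down: the client falls back to sip2.
down = {"sip1.example.org"}
chosen = pick_server(records, lambda r: r.target not in down)
print(chosen.target if chosen else "no server available")
```

A client that only resolves a single hostname has no second record to fall back to, which is why the outages are visible to clients without SRV support.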
When an outage occurs I get an email and SMS alert and, if I’m in the vicinity, I reboot the inaccessible EC2 instance to restore it to service. That process generally takes about 30 minutes. If I’m not in the vicinity it will obviously take longer.
For anyone who wants to track the actions I take to recover from failures, I’ve created a sipsorcery Twitter account where I will post messages relating to the service status.