Outbound calling issues in US-West
Incident Report for SimpleVoIP LLC
Postmortem

We apologize for the service disruptions on March 13 and 14. The issues on the evening of the 13th were not confirmed to be affecting multiple customers in time for us to post them on our Status page, in part because so few of our customers were impacted. We are sorry that the customers who were affected did not receive timely updates on the resolution of that issue.

These issues were caused by a bug in a recently updated version of our Session Border Controller (SBC) software. This version had already seen widespread adoption on very similar platforms without incident. However, under our combination of load and some unique call flow cases, the bug slowed down SIP message handling. This produced the symptoms that many of our customers observed sporadically on the 14th: phones re-registering or failing to register, error messages when attempting to place outbound calls, and calls going straight to voicemail without first ringing at the handset.

We deployed a patch for this bug to all of our SBCs on the night of the 14th. Since then, we have been monitoring the platform closely and have seen no further issues.

Posted Mar 19, 2019 - 15:43 PDT

Resolved
All phone traffic has now been stable for several hours. We are marking this incident as "Resolved," and we will provide more detailed information on the root cause and remediation steps in a follow-up.
Posted Mar 14, 2019 - 23:44 PDT
Monitoring
The issues with the US-West cluster have been resolved. Traffic has been moved back and calls are completing normally. We will continue to monitor this closely to ensure that there are no residual issues. If you continue to experience issues with inbound or outbound call completion or phones losing service, please contact Support.
Posted Mar 14, 2019 - 18:41 PDT
Update
Our engineering teams continue to work toward a permanent fix for this issue. In the meantime, traffic will continue to be handled by the US-East and US-Central clusters.
Posted Mar 14, 2019 - 15:31 PDT
Update
We are still working to resolve the underlying issue with US-West. In the meantime, to avoid overloading the US-Central cluster with all of the added US-West users, we are distributing the US-West phones between the US-East and US-Central clusters. We will keep you updated on our progress.
Posted Mar 14, 2019 - 11:15 PDT
Update
Traffic has been moved to US-Central and all calls should be completing normally now. We will post another update once we've confirmed a fix for the US-West cluster issues.
Posted Mar 14, 2019 - 10:32 PDT
Identified
The issues in US-West have resurfaced. We will move traffic back to US-Central to restore full operation as we work on the root cause.
Posted Mar 14, 2019 - 10:24 PDT
Monitoring
A fix has been implemented, and we have moved regular traffic back to the US-West cluster. We will continue to monitor this issue closely.
Posted Mar 14, 2019 - 09:40 PDT
Identified
Customers hosted on our US-West cluster have reported issues placing outbound calls. We are currently moving those customers to the US-Central cluster to restore service while we investigate and repair the underlying issue.
Posted Mar 14, 2019 - 09:34 PDT
This incident affected: SimpleVoIP Hosted PBX.