Call Completion
Incident Report for SimpleVoIP LLC
Postmortem

On Wednesday, June 14, we received reports of inbound and outbound call failures, as well as errors and slow response times in our web portal. These reports spanned all three of our availability zones. Our engineering team determined that our US-West application servers, which handle web traffic and interface with our SIP devices, were having difficulty reaching their preferred database server and had entered a bad state, which limited the amount of traffic they could process. Because we could not reliably connect to this database server, we had difficulty restarting its services or removing it from our global datasource pool, which delayed recovery of our services. Once we fixed connectivity to the database server, we were able to restart the affected services on our application servers and restore voice and web traffic to 100%.

This is not the first time we have encountered application state issues due to a faulty connection to our US-West database servers. We have already been working to migrate our US-West database servers to new hardware, which will also minimize the number of network hops to our application servers. That migration is scheduled for Monday, June 26, and you will receive maintenance window notifications shortly.

We are also implementing additional monitoring so that we catch similar issues as they occur and can engage our engineers before our Support team starts to receive customer calls.

Thank you for your patience as we continue to improve our systems and processes to best serve you and your customers. As always, we welcome any feedback, which may be submitted to our Support team or to your Account Manager.

Posted Jun 20, 2023 - 15:48 PDT

Resolved
All services remain stable following the fix implemented last night. We will be scheduling a maintenance operation to address the underlying root cause as soon as possible.
Posted Jun 15, 2023 - 06:24 PDT
Monitoring
A fix has been implemented, and our monitoring system is showing normal call traffic at this time. We will continue to keep a close eye on this to ensure everything remains stable.
Posted Jun 14, 2023 - 19:52 PDT
Update
Our engineering team is still working to resolve the intermittent connectivity issues affecting inbound and outbound calls.
Posted Jun 14, 2023 - 19:43 PDT
Identified
Our engineering team has identified a connectivity issue between some of our servers. This is causing intermittent inbound and outbound call failures across all three of our availability zones. We are working to resolve this as quickly as possible.
Posted Jun 14, 2023 - 18:10 PDT
Update
We are continuing to investigate inbound and outbound call failures.
Posted Jun 14, 2023 - 17:35 PDT
Investigating
We are investigating reports of some calls not completing on our system. We will provide updates as soon as we have identified the cause and impact of this event.
Posted Jun 14, 2023 - 17:16 PDT
This incident affected: SimpleVoIP Hosted PBX.