Handling Downtime and Outages

May 30th, 2016
Posted in Monitoring

We have all experienced outages and downtime, and we all know how badly they can hurt a business and its reputation. Unfortunately, they cannot be avoided completely, but they can be detected quickly, which minimizes the negative effects. What you need is a reliable monitoring tool that helps you identify even the shortest server overloads or network glitches.

First of all, you have to be prepared to handle the situation. Knowing the configuration specifics of your website, server, or third-party hosting panel is crucial if the problem is on your side and can be fixed by you. At other times you will have no control over the situation because the issue lies with a major upstream provider, but being notified immediately gives you the chance to contact the provider and report the problem. Using our services gives you another option: our Takeover feature. It allows you to leave detailed instructions that our support team will follow when a failure is detected. For example, you might want us to contact your hosting provider and report the problem, or give us credentials to SSH into your server and manually reboot it (if possible).

Sometimes you need to apply updates or make other mandatory changes to your website or server, which requires you to put it into maintenance mode intentionally. This is not considered downtime, and you will most probably not want it recorded as such. We have thought of that and offer the Scheduled Downtime feature. You can specify periods during which we will not perform any checks, so planned maintenance does not lower your uptime figures.
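To make the idea concrete, here is a minimal Python sketch of how a monitor might skip checks during planned maintenance. It is an illustration only, not our actual implementation: the names `Window` and `in_scheduled_downtime` are hypothetical, and the sketch assumes that each window is a simple start/end pair of timezone-aware timestamps.

```python
from datetime import datetime, timezone
from typing import List, Tuple

# Hypothetical representation: one maintenance window is a (start, end) pair.
Window = Tuple[datetime, datetime]


def in_scheduled_downtime(now: datetime, windows: List[Window]) -> bool:
    """Return True if `now` falls inside any maintenance window.

    The start is inclusive and the end is exclusive, so checks resume
    exactly at the end of the window.
    """
    return any(start <= now < end for start, end in windows)


# Example: a one-hour maintenance window (dates are made up).
windows = [
    (
        datetime(2016, 6, 1, 2, 0, tzinfo=timezone.utc),
        datetime(2016, 6, 1, 3, 0, tzinfo=timezone.utc),
    )
]

now = datetime(2016, 6, 1, 2, 30, tzinfo=timezone.utc)
if in_scheduled_downtime(now, windows):
    print("Inside a scheduled window: skip the check, record nothing")
else:
    print("Outside scheduled windows: run the check as usual")
```

Because no check runs inside a window, nothing that happens there can be recorded as a failure, which is why planned maintenance leaves the uptime figure untouched.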

It is also very important to configure your alerts properly. Make sure that you have selected all alert types (Connection related alerts, Content related alerts, Timeout warnings and Recovery messages) so that you are notified of all potential problems. We also recommend setting the Failures before sending alert value to 1, so that you are alerted on the first detected failure. You do not have to worry about false positives, as we provide a backup location for each of our primary ones. If the primary location detects a problem, the backup one immediately performs a second check, and an alert is sent only if the failure is confirmed.
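The confirmation logic described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not our production code: the class `AlertPolicy`, its fields, and the `backup_check` callback are hypothetical names, and the sketch assumes that a failure the backup location does not confirm resets the failure counter.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AlertPolicy:
    # Mirrors the "Failures before sending alert" setting (hypothetical field name).
    failures_before_alert: int = 1
    consecutive_failures: int = 0

    def record_check(self, primary_ok: bool, backup_check: Callable[[], bool]) -> bool:
        """Process one check cycle; return True if an alert should be sent now."""
        if primary_ok:
            self.consecutive_failures = 0
            return False
        # The primary location saw a failure: ask the backup location to re-check.
        if backup_check():
            # The backup sees the target as healthy, so treat it as a false positive.
            self.consecutive_failures = 0
            return False
        # Failure confirmed by both locations.
        self.consecutive_failures += 1
        # Alert once, at the moment the configured threshold is reached.
        return self.consecutive_failures == self.failures_before_alert


policy = AlertPolicy(failures_before_alert=1)
print(policy.record_check(primary_ok=False, backup_check=lambda: True))   # unconfirmed failure
print(policy.record_check(primary_ok=False, backup_check=lambda: False))  # confirmed failure
```

With the threshold set to 1, the first confirmed failure triggers an alert, while a glitch seen only by the primary location never does.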

We support a wide variety of contact types, so you can choose the option that suits you best:

  • SMS contacts
  • E-mail contacts
  • Voice call contacts 
  • iOS and Android PUSH notification contacts
  • Instant messenger contacts
  • URL contacts
  • SNMP contacts

Keep in mind that outages and downtime are inevitable and that you will have to deal with one eventually, but being prepared can make all the difference. Once the problem is fixed, it is very important to understand what caused it and how to prevent it from happening again. We can help by providing very detailed reports, as well as PING and Traceroute information (if enabled) for the specific checks. Content issues can also be captured with some of our more advanced levels of monitoring: Performance, Full-Page and In-Browser.

About Damien Jordan

Enjoys life to the fullest. Appreciates all that is beautifully made - quality matters. Cars and photography are the passions filling his spare time. Enjoys going out with friends as this is his way of relaxing.
