Search Engine People Blog

Best Practices for Handling Website Server Outages and Downtime

Website Server Outages and Downtime

As per Wikipedia.org: The term downtime is used to refer to periods when a system is unavailable. Downtime or outage duration refers to a period of time that a system fails to provide or perform its primary function.

This is usually a result of the system failing to function because of an unplanned event, or because of routine maintenance. Unplanned downtime may be the result of a software bug, human error, equipment failure, malfunction, high bit error rate, power failure, overload due to exceeding the channel capacity, a cascading failure, etc.

Inevitably, websites will suffer outages and downtime as a result of common events, including:

As per W3.org (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html):

10.5.4 503 Service Unavailable

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.

Note: The existence of the 503 status code does not imply that a server must use it when becoming overloaded. Some servers may wish to simply refuse the connection.
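
To make the 503 behaviour quoted above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a production setup: the names `MaintenanceHandler`, `run_maintenance_server`, `retry_delay_seconds` and the one-hour `RETRY_AFTER_SECONDS` value are made up for this example. The server side answers every request with 503 plus a Retry-After header; the client-side helper reads that header, which RFC 2616 allows to be either a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed length of the maintenance window, in seconds (hypothetical value).
RETRY_AFTER_SECONDS = 3600


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answers every GET/HEAD with 503 while the site is under maintenance."""

    def _send_503(self, include_body):
        body = b"<h1>Down for maintenance</h1><p>Please try again later.</p>"
        self.send_response(503, "Service Unavailable")
        # Tell clients and crawlers roughly when to come back.
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._send_503(include_body=True)

    def do_HEAD(self):
        self._send_503(include_body=False)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real server would log requests


def run_maintenance_server(host="127.0.0.1", port=8080):
    """Blocks forever, serving the maintenance response (hypothetical helper)."""
    HTTPServer((host, port), MaintenanceHandler).serve_forever()


def retry_delay_seconds(header_value, now=None):
    """Client side: convert a Retry-After value to seconds.

    Returns None when the header is absent or unparseable, in which case
    the caller should treat the response like a 500.
    """
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return int(value)  # delta-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))
```

Calling `run_maintenance_server()` starts the maintenance responder on port 8080. On the client side, `retry_delay_seconds(response_headers.get("Retry-After"))` gives the wait in seconds, or `None` when the server gave no usable hint.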

Best Practices for handling server outages: