Today I have been reminded about the importance of having no single point of failure in your systems.
With news that one hosting company, providing both dedicated servers and Virtual Private Servers (VPS), has now been offline for 5 days, you need to consider what would happen to your systems if you were solely reliant on a company that suffered such an outage.
HostV are not the only people to suffer from outages. Several major UK datacentres have gone offline for shorter periods at one time or another over the past 10 years, so it is only prudent to ensure you do not rely on any single company or site for all your hosting.
This is commonly called disaster recovery: you plan and prepare for the worst, ensuring that if it happens it is no longer a disaster but just an inconvenience while you implement your prepared plan.
To take a worst case scenario, if terrorists blew up a datacentre with the complete irrecoverable loss of all hardware and data on site, could you keep going?
Most importantly, take offsite backups and test them. Offsite backups can at least enable you to restore your data to a new server if all else fails. Just make sure that you have regularly tested that your backups are usable and that they contain all the data that you think they do. There is nothing worse than finding out, at the moment you need them after a failure, that your backups were corrupted or did not contain some vital bit of data.
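One way to test a backup is to restore it to a scratch directory and compare checksums against the live data. The following is only a minimal sketch of that idea, not a complete backup tool. The function names and the two directory arguments are my own assumptions for illustration, and any file that has legitimately changed since the backup was taken will also be reported as differing.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_backup(live_root: Path, restored_root: Path) -> Dict[str, List[str]]:
    """Report files missing from the restored copy and files whose contents differ."""
    live = build_manifest(live_root)
    restored = build_manifest(restored_root)
    common = live.keys() & restored.keys()
    return {
        "missing": sorted(set(live) - set(restored)),
        "differing": sorted(name for name in common if live[name] != restored[name]),
    }
```

If both lists come back empty, the restored copy matches the live data file for file. Anything listed under "missing" is exactly the vital bit of data you would otherwise only discover during a real failure.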
Ensure that you have a backup or disaster recovery server in place and online at a different facility, preferably in a different country, so that no single fibre or backbone outage can affect both. This server does not need to be as powerful or as highly redundant as your main servers; it just needs to be able to carry on critical functions during an outage of your main live systems.
Keep the disaster recovery server synchronised with your main live server. You can use tools like rsync and database replication to ensure files and databases are kept in sync and ready to go at a moment's notice.
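As a rough illustration of the file side of this, the sketch below builds an rsync command for mirroring a directory to the disaster recovery server. The host name dr.example.com and the /var/www paths are placeholders I have made up, and the function name is hypothetical. It defaults to a dry run because --delete removes files on the destination that no longer exist on the source.

```python
from typing import List


def build_rsync_command(
    source_dir: str,
    remote_host: str,
    remote_dir: str,
    dry_run: bool = True,
) -> List[str]:
    """Build an rsync command that mirrors source_dir to remote_host:remote_dir."""
    command = [
        "rsync",
        "--archive",   # preserve permissions, timestamps, symlinks and so on
        "--compress",  # compress data in transit
        "--delete",    # remove destination files that are gone from the source
    ]
    if dry_run:
        # Only report what would change, without touching the destination.
        command.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(f"{remote_host}:{remote_dir}")
    return command
```

Passing the returned list to subprocess.run(command, check=True) from a scheduled job would perform the sync, assuming rsync and SSH key access to the remote host are already set up. Databases need their own replication; copying live database files with rsync is not a safe substitute.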
Ensure that you can always change your DNS entries. It is no use being able to fail over to backup systems if you cannot change your DNS and move the services over, so make sure that your primary systems are not also your master DNS server. That way you can update the DNS during an outage to point at your backup system.
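A simple sanity check is to list the addresses of your name servers and see whether any of them sit inside the same network ranges as your primary hosting. The sketch below assumes you already know both sets of addresses; the function name is hypothetical and the addresses in the usage note are reserved documentation ranges, not real servers.

```python
import ipaddress
from typing import Iterable, List


def nameservers_inside_primary(
    nameserver_ips: Iterable[str],
    primary_networks: Iterable[str],
) -> List[str]:
    """Return the name server addresses that fall inside the primary hosting networks."""
    networks = [ipaddress.ip_network(network) for network in primary_networks]
    return [
        ip
        for ip in nameserver_ips
        if any(ipaddress.ip_address(ip) in network for network in networks)
    ]
```

For example, nameservers_inside_primary(["192.0.2.53", "198.51.100.53"], ["192.0.2.0/24"]) returns ["192.0.2.53"], showing that one name server would go down with the primary site. If every name server is returned, you would have no way to repoint DNS during an outage.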
Make sure that you can access all your backup systems without having to go through the live system. I have previously seen a design that “for security” would only allow access to the backup systems through the main live servers. When I asked how they would do this during an outage of the main servers, there was a rapid reassessment of the security considerations.
If you plan for the worst and hope for the best then at least you will be prepared should the worst happen.
