Disaster Recovery

By: Engagent  06-Dec-2011

Bare Metal Disaster Recovery:

Bare Metal Recovery (BMR) is the process of taking a low-level snapshot of a machine's operating system partition and storing it where it can be quickly and easily accessed when required. A BMR solution has two parts.

  1. The first is a program that is set up to periodically snapshot an OS partition using image backup technology. This is installed as a service and comes with a scheduler. The scheduler is then programmed to take backups of the live machine without any requirement to shut down services, close applications, or go offline. Image backups are normally stored to a UNC path, SAN, or NAS device for online storage and quick access when needed.
  2. The second part of a BMR solution is the boot media and process used to start a dead machine. This enables users to connect to the online location where the image backups have been stored and initiate a restore. Once the OS partition has been restored (which can take between 5 and 30 minutes), the only remaining steps necessary to complete the disaster recovery are to remove the boot media and reboot the machine. This final phase takes approximately two minutes, after which the machine is back in the exact state it was in when the image backup was performed.

Bare Metal Recovery (BMR) is most often considered a supplemental layer of protection that can help insulate an organization against unnecessary downtime. While file-by-file backup and restore software is excellent at protecting against data loss, it is inherently poor at quickly returning an unbootable machine to a fully operational state. Its shortcomings are the many steps required to perform a file-by-file recovery and the lack of any guarantee that every operating system (OS) change has been reinstated after a restore. Thus, BMR should be a vital part of any company's disaster recovery plan, not just an afterthought.

Buying and implementing a BMR solution has become a priority for many organizations, and it should be. BMR is a key part of any formal disaster recovery plan. It not only offers a fast means of restoring a failed server, but also expedites recovery from a catastrophic event. With the ability to recover to dissimilar hardware or virtual environments, organizations can provide a clear path to recovering lost servers by taking off-site backups to any of the service companies that can provide temporary equipment. Rather than attempting to locate exact hardware matches or conduct laborious file restores to new equipment, users can restore an image of a Dell server to an HP or IBM server. Using the right BMR solution, companies can also restore multiple physical servers to a VMware ESX host machine and be up and running in minutes.

With the technology available today, it is no longer acceptable to have a file-by-file backup solution as the only means of protecting data. Whether an organization has a single server or over a thousand, a bare metal recovery solution is a necessary preventative measure against expensive and unnecessary downtime. BMR should be an integral part of every disaster recovery plan.

While the definition (and monetary value) of a timely recovery of a failed machine can vary from organization to organization, one unarguable fact is that downtime costs money. The actual cost of system downtime is poorly understood in most organizations; it can even vary by the time of day. Downtime for Company A might cost $5,000 an hour while the cost for Company B could be $100,000 an hour. Even the rate between individual servers within a company can be vastly different depending on the critical nature of the applications being run. Here is a very simple formula to estimate the cost of downtime:

Estimated average cost of one hour of downtime =
    (Employee costs per hour) x (Fraction of employees affected by outage)
  + (Average income per hour) x (Fraction of income affected by outage)

Example: $45 x 50 = $2,250 (labor) + $7,000 (income) = $9,250

*A Simple Way to Estimate the Cost of Downtime, David A. Patterson, Computer Science Division, UC Berkeley

Downtime costs fall into two broad categories: tangible and intangible. Calculating tangible costs such as employee wages, operating costs, and office expenses is straightforward, and they can be estimated with reasonable accuracy using a simple formula like the one provided above. The difficulty lies in factoring in all of the potential intangible costs, such as lowered employee morale, missed opportunities, forgone sales, and loss of customer goodwill. It is hard to assign accurate figures to these.
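
The tangible part of the formula is easy to turn into a small calculation. The following is a minimal sketch in Python, not a definitive tool: the function and parameter names are assumptions made for this illustration, and the sample values reuse the example figures above, read as 50 employees at $45 an hour plus $7,000 an hour of income, all fully affected by the outage.

```python
def downtime_cost_per_hour(employee_costs_per_hour, fraction_employees_affected,
                           income_per_hour, fraction_income_affected):
    """Estimate the tangible cost of one hour of downtime.

    Names are illustrative; fractions are expressed between 0 and 1.
    """
    for fraction in (fraction_employees_affected, fraction_income_affected):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must be between 0 and 1")
    labor = employee_costs_per_hour * fraction_employees_affected
    income = income_per_hour * fraction_income_affected
    return labor + income


# Assumed reading of the example: 50 employees at $45/hour, $7,000/hour income,
# with every employee and all income affected by the outage.
total_employee_costs = 45 * 50
print(downtime_cost_per_hour(total_employee_costs, 1.0, 7000, 1.0))  # 9250.0
```

Intangible costs are deliberately left out of this sketch, since they cannot be reduced to two inputs.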

The bottom line is all companies recognize computer downtime means lost money. Regrettably, most don't realize how much it truly costs.

A Money Saver:

Every minute of machine downtime costs an organization time and money. Therefore, everyone should be able to agree that limiting downtime is highly desirable, particularly if it is reasonably affordable.

To demonstrate the return on investment (ROI), here is a BMR scenario:
If the national average cost of Windows server downtime is $15,000 an hour (and this is a fairly modest sum), then every minute of downtime costs $250. If a standard bare metal disaster recovery solution takes approximately 20 minutes, as opposed to 40 minutes using file-by-file backup and restore, the 20 minutes saved by the BMR solution equate to $5,000 in avoided downtime cost on its first use.
Expanding on this, if the price of a premium BMR solution is $1,000 per server, an organization could subtract the price of the BMR software from the money saved on restore time.

Bottom line, the company would still be left with a $4,000 cost savings. Not many products offer an ROI like this, particularly after just one use. In a real production environment, the time savings is closer to a 6-to-1 ratio than the 2-to-1 ratio used in this example, leading to even greater savings.
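
The arithmetic in this scenario can be checked with a few lines of Python. This is only a sketch of the calculation described above; the function name and its parameters are assumptions introduced for illustration.

```python
def bmr_first_use_savings(downtime_cost_per_hour, file_restore_minutes,
                          bmr_restore_minutes, bmr_price_per_server):
    """Net savings from the first BMR restore, after paying for the software."""
    cost_per_minute = downtime_cost_per_hour / 60
    minutes_saved = file_restore_minutes - bmr_restore_minutes
    gross_savings = cost_per_minute * minutes_saved
    return gross_savings - bmr_price_per_server


# Figures from the scenario: $15,000/hour, 40 vs. 20 minutes, $1,000 per server
print(bmr_first_use_savings(15000, 40, 20, 1000))  # 4000.0
```

Passing a longer file-by-file restore time, for instance a hypothetical 120 minutes against the same 20-minute BMR restore to model a 6-to-1 ratio, shows how quickly the net savings grow.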

To give organizations a better understanding of how the two backup methods differ, here is a step-by-step comparison of a file-based restore and an image-based restore.

  1. File-by-file restore example:
    1. Install EISA Partition (53 minutes)
    2. Install Windows OS (45 minutes)
    3. Install Backup Software (5 minutes)
    4. Create Data Partitions (10 minutes)
    5. Restore System 4GB drive (35 minutes)
    6. Restore System State/Registry (1 hour)
    7. Reboot Server (2 minutes)
    Total Restore Steps = 7
    Restore Time = 3 ½ hours
  2. Bare Metal Recovery example using UltraBac Software's UBDR Gold:
    1. Boot server using UBDR Gold Restore Media (5 minutes)
    2. Connect to a UNC path and initiate a 10GB OS partition restore with a conservative 2GB/minute transfer rate (8 minutes)
    3. Reboot Server (2 minutes)
    Total Restore Steps = 3
    Restore Time = 15 minutes

As the example demonstrates, a BMR solution can easily restore a failed machine's 10GB OS partition in 15 minutes using a conservative 2GB/minute restore speed on a Gigabit network connection. Fast systems can exceed 5GB/minute. Organizations using the BMR process now "complain" that booting the machine takes longer than the restore itself. When it comes to restore time, file-by-file methods simply cannot compete with BMR.
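
The totals in the two procedures can be reproduced by summing the step durations. The sketch below uses Python with the step times copied from the lists above; the helper names are assumptions for illustration only.

```python
# Step durations in minutes, taken from the two restore procedures above
file_by_file_steps = [
    ("Install EISA partition", 53),
    ("Install Windows OS", 45),
    ("Install backup software", 5),
    ("Create data partitions", 10),
    ("Restore system 4GB drive", 35),
    ("Restore system state/registry", 60),
    ("Reboot server", 2),
]
bmr_steps = [
    ("Boot from restore media", 5),
    ("Connect to UNC path and restore 10GB OS partition", 8),
    ("Reboot server", 2),
]


def total_minutes(steps):
    """Sum the duration of every step in a restore procedure."""
    return sum(minutes for _, minutes in steps)


def transfer_minutes(size_gb, rate_gb_per_minute):
    """Raw data transfer time, ignoring connection and setup overhead."""
    return size_gb / rate_gb_per_minute


print(total_minutes(file_by_file_steps))  # 210 minutes, i.e. 3.5 hours
print(total_minutes(bmr_steps))           # 15 minutes
print(transfer_minutes(10, 2))            # 5.0 minutes of raw transfer
```

At 2GB/minute the raw transfer of 10GB accounts for 5 of the 8 minutes listed for the restore step; the remainder is presumably the time needed to connect to the UNC path and start the restore.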

The information in this article was current at 02 Dec 2011

