Backing Up in Today's Virtual Server Environment

Virtualization has changed the IT landscape: it has allowed companies to consolidate servers, save on power and cooling, and free the OS and Application from the underlying hardware. While these have been of great value and provided companies with cost savings, virtualization has also created some challenges. One of those is Backup and Recovery – why is this?

Would not backup and recovery work the same in a virtual environment as it does in a physical environment? Well, the answer is yes and no. To understand this, let’s first understand why Backup and Recovery is still critical today:

  1. Many companies need to keep data off-site and retain it for a defined period of time. This is often due to government regulation.
  2. Hardware failures happen, and data then needs to be restored, although the frequency of such restores can be reduced by using a High Availability architecture.
  3. Data can get corrupted. While the corruption can sometimes be addressed without going to a backup, it may be necessary to restore the data.
  4. Data gets deleted by end-users and needs to be restored.
  5. Production data is needed (after being scrubbed) for QA systems to ensure that a new software version will work correctly with the data.

With the proliferation of virtualization in today’s environment we may have 10+ systems running on one physical server. This has a number of impacts that one needs to take into account on the Backup side of the Backup and Recovery equation.
First off, we need to make sure that we have enough CPU, Memory and Network bandwidth for the amount of data attached to the system, including:

  • The total amount of data on all the virtual systems;
  • The total amount of data that may be on that system at any one time. For example, if High Availability clustering is involved, one may have 10 virtual systems today, but tomorrow there may be 12 or 14 virtual systems. It is important to understand what the maximum amount of data could be (not what it is now);
  • The growth rate on those virtualized systems; and,
  • The daily change rate.

This will, in turn, allow one to determine whether the backup window can still be met when additional systems come on-line and as the amount of data grows over time. Part of this backup window calculation is network bandwidth. Most backups are done over a network, although some are done over a SAN. Today, dedicated one gigabit per second (1 Gbps) networks have become the standard for backups. With virtualization, one needs to make sure that the backup network can move the total amount of data that “could be on the system” (due to additional virtual machines or data growth) within the time allocated to the backup window. Oftentimes, the backup network is overlooked, yet it is one of the most critical components in ensuring that the backups are completed within the backup window.
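As a rough illustration of the backup window calculation described above, the following Python sketch projects the data a host "could" hold and estimates how long a full backup would take over the backup network. The function names, the sample figures, and the 70% usable-throughput factor are all assumptions made for illustration, not measurements or vendor guidance.

```python
def projected_data_gb(current_gb, current_vms, max_vms, annual_growth, years):
    """Scale today's data to the maximum VM count, then apply compound growth."""
    per_vm_gb = current_gb / current_vms
    return per_vm_gb * max_vms * (1 + annual_growth) ** years


def backup_hours(data_gb, link_gbps=1.0, efficiency=0.7):
    """Hours needed to move data_gb gigabytes over a link of link_gbps gigabits/s.

    efficiency is an assumed fraction of line rate that is actually achieved.
    """
    usable_gb_per_sec = link_gbps * efficiency / 8  # gigabits -> gigabytes
    return data_gb / usable_gb_per_sec / 3600


def fits_window(data_gb, window_hours, link_gbps=1.0, efficiency=0.7):
    """True if a full backup of data_gb completes within window_hours."""
    return backup_hours(data_gb, link_gbps, efficiency) <= window_hours


# Hypothetical host: 10 VMs holding 2,000 GB today, up to 14 VMs after a
# failover, 20% annual growth, and an 8-hour backup window.
today_gb = 2000
worst_case_gb = projected_data_gb(today_gb, current_vms=10, max_vms=14,
                                  annual_growth=0.20, years=1)

for label, size_gb in (("today", today_gb), ("worst case", worst_case_gb)):
    hours = backup_hours(size_gb)
    print(f"{label}: {size_gb:.0f} GB, {hours:.1f} h, "
          f"fits 8 h window: {fits_window(size_gb, 8)}")
```

With these assumed figures, today's data fits the window (about 6.3 hours), while the worst case of 3,360 GB does not (about 10.7 hours). Sizing against what the data is now, rather than what it could be, would hide that problem. The same functions can be applied to incremental backups by passing in only the daily changed data.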

While backups are important, if one cannot recover data, or cannot recover it within the stated Recovery Time Objective (RTO), then the backups lose much of their value. It is recovery that is key to keeping the business running when something happens. With virtualization, one must be able to restore either the entire virtual machine or just a file or set of files. Today’s backup solutions make either a single pass or multiple passes during a backup so that administrators can restore the full virtual machine or just a file.

Virtualization companies like VMware and Microsoft have provided APIs for the backup vendors to use. The API provides a conduit to the data store. VMware’s API is vStorage and Microsoft’s is the Volume Shadow Copy Service (VSS). Backup vendors are using these APIs to directly access the data store. Let’s take an example where we are using VMware and Symantec NetBackup. We would load the Symantec Enterprise Agent on the ESX server. The Symantec Enterprise Agent would be accessed via the VMware vStorage API and would take a snapshot of the data stores so that a backup could be performed. Data would be sent to the Backup Proxy Server, which could be running on the Symantec Media Server (as long as it is a Windows server), which in turn would back up to the storage device (Disk, VTL or Tape). The following diagram illustrates the example above:

[Diagram: Virtual Server Backup]

While the APIs provide a conduit for backup vendors to integrate their products into the virtual server, the APIs also eliminate the need to have a backup agent in each one of the guest operating systems. What about a case where one wants to have a backup agent interact directly with an application (such as Oracle, SQL or SAP) using the specialty agent(s) that many backup vendors have developed? In that case, one would have to install the backup agent, along with the specialty agent for that application or database, in the guest OS where the application or database is running. The agent would then communicate with the backup server directly, and the API, such as vStorage, would not be used. Please note that most people are using the APIs today.

Technologies like de-duplication are also invaluable as more and more data resides on more and more virtual servers on a single physical box. Using de-duplication with virtual servers can reduce the amount of data needing to be backed up, as the backup software can determine whether some data has already been backed up. Let’s look at an example where we have 10 Windows guest operating systems. After the initial backup, only the unique part of each guest operating system would have to be backed up, saving backup time, the amount of data to be backed up, and network bandwidth. The combination of virtualization and de-duplication can have an enormous benefit in ensuring not only that backups are completed in the backup window, but also that the backup network will be able to support the amount of data being backed up (less data needs to be backed up, thus less network bandwidth is needed).
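The idea behind the 10-guest example can be sketched in a few lines of Python. This is a minimal illustration that assumes fixed-size 4 KiB blocks identified by SHA-256 hashes; commercial backup products use their own chunking and indexing schemes, and the `dedup_backup` function and the synthetic guest images below are hypothetical.

```python
import hashlib


def dedup_backup(images, block_size=4096):
    """Store each unique block once; return (store, bytes_read, bytes_stored)."""
    store = {}  # block hash -> block contents
    bytes_read = 0
    bytes_stored = 0
    for image in images:
        for offset in range(0, len(image), block_size):
            block = image[offset:offset + block_size]
            bytes_read += len(block)
            digest = hashlib.sha256(block).hexdigest()
            if digest not in store:  # only blocks not seen before are kept
                store[digest] = block
                bytes_stored += len(block)
    return store, bytes_read, bytes_stored


# Ten synthetic guests: an identical 40 KiB "operating system" portion
# (10 distinct blocks) plus one 4 KiB block that is unique to each guest.
shared_os = b"".join(bytes([i]) * 4096 for i in range(10))
guests = [shared_os + bytes([100 + n]) * 4096 for n in range(10)]

store, read, stored = dedup_backup(guests)
print(read, stored)  # prints "450560 81920"
```

In this synthetic case the shared portion is stored once, so only 81,920 of the 450,560 bytes read need to be kept. Actual savings depend on how similar the guests really are and on the de-duplication method the backup software uses.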

In summary, it is important to understand how much data “could be” on one’s physical server at any given point in time. While virtualization allows multiple systems to run on the same server, the underlying hardware is now shared among the multiple operating systems; therefore, one has to keep the hardware capacity (be it CPU, Memory or Network) in mind when determining how many guest operating systems and how much data to put on a server. De-duplication can help a great deal in ensuring that, as data grows, the backups can still be completed in the backup window and that the network will not get saturated. Following a solid capacity model will help ensure not only that backups get completed, but also that recovery can take place when needed.

Patrick B.
En Pointe Professional Services
En Pointe Technologies
