Best Practices for Virtualizing Your Oracle Database – Datastores

First off, my apologies for delaying the last part of this four-part blog series for so long.  I have been building a fully automated application platform-as-a-service product for EMC IT that lets us deploy entire infrastructure stacks in minutes, all fully wired, protected and monitored, but that is a topic for another blog.

In my last post, Best Practices For Virtualizing Your Oracle Database With VMware, the best practices were all about the virtual machine itself.  This post will focus on VMware’s virtual storage layer, called a datastore.  A datastore is storage mapped to the physical ESX servers, onto which a VM’s luns, or disks, are provisioned.  This is a critical component of any virtual database deployment, as it is where the database files reside.  It is also a silent killer of performance, because there are no metrics that will tell you that you have a problem, just unexplained high IO latencies.


A Few Facts

Before we get to the best practices, it is important to cover a few fundamental facts that will influence our decisions.

Most storage arrays perform best with sequential workloads; the exception is the newer all-flash arrays.  Several EMC arrays perform exceedingly well with sequential workloads.  They excel at sequential workloads because of a feature called pre-fetch, which detects a multi-block read event and attempts to get the data from disk before the database even asks for it.  Oracle has recently been adding this capability to the database and, as Oracle’s pre-fetch technology matures, it may help all storage arrays perform better.

Many of the newer arrays support thin provisioning.  Thin provisioning is a great feature that can save large amounts of disk space in the storage array.  Virtual thin provisioning, however, can easily introduce fragmentation, since the underlying storage array is unaware that the virtual layer is providing this service.

With vSphere 5.5 and earlier versions, there is a limitation of 256 luns per ESX server.  This number is important, as it is part of the equation when deciding how to lay out your virtual luns.  With vSphere you have four choices when creating a virtual lun, or VMDK (Virtual Machine Disk):

  1. Thin – Storage is only allocated when it is used.
  2. Zero Thick – Storage is only allocated when it is used, but total space is reserved on the datastore.  Works with storage array thin pools.
  3. Eager Zero Thick – Storage is fully allocated when the virtual lun is created.
  4. Raw Device Mapping (RDM) – Directly maps the virtual lun to a SAN device.

Avoid Fragmentation

Virtual luns, or VMDKs, are created on top of an ESX datastore.  The ESX datastore is simply a lun (disk drive or mount point) on a storage array, and files on it are presented to VMs as virtual luns.  This datastore lun is then shared between all ESX servers.  Fragmentation occurs when multiple virtual luns, or VMDKs, are provisioned on the same datastore.  Since the default VMDK provisioning is zero thick, the underlying storage on the storage array is only provisioned when something is written to the VMDK.  Since there are many VMDKs all writing to the datastore, each one will be granted small chunks interleaved with the small chunks given to the other VMDKs.

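Consider eight VMDKs on a single datastore, each configured as zero thick.  As they grow, their chunks end up interleaved across the datastore so that no VMDK occupies a contiguous region, essentially making all IO random.

[Figure: a datastore holding the chunks of eight VMDKs, each VMDK shown in a different color, interleaved with one another]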
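
To make the interleaving concrete, here is a minimal Python sketch.  It is only a toy model, not how VMFS actually allocates space: it assumes the worst case, where every VMDK grows one chunk at a time in strict rotation, and the function names are made up for illustration.

```python
def allocate_on_demand(num_vmdks, chunks_per_vmdk):
    """Toy model of zero thick growth on a shared datastore.

    Assumption: each VMDK is granted one chunk per round, in rotation.
    Returns a list where position = datastore chunk, value = owning VMDK.
    """
    layout = []
    for _ in range(chunks_per_vmdk):
        for vmdk in range(num_vmdks):
            layout.append(vmdk)
    return layout


def allocate_up_front(num_vmdks, chunks_per_vmdk):
    """Toy model of eager zero thick: each VMDK gets all of its chunks at creation."""
    layout = []
    for vmdk in range(num_vmdks):
        layout.extend([vmdk] * chunks_per_vmdk)
    return layout


def contiguous_runs(layout, vmdk):
    """Count the separate contiguous regions that one VMDK occupies."""
    runs = 0
    previous = None
    for owner in layout:
        if owner == vmdk and previous != vmdk:
            runs += 1
        previous = owner
    return runs


if __name__ == "__main__":
    shared = allocate_on_demand(num_vmdks=8, chunks_per_vmdk=4)
    eager = allocate_up_front(num_vmdks=8, chunks_per_vmdk=4)
    print("on demand, regions for VMDK 0:", contiguous_runs(shared, 0))
    print("up front,  regions for VMDK 0:", contiguous_runs(eager, 0))
```

In this model a VMDK that grows on demand ends up in as many separate regions as it has chunks, while one allocated up front sits in a single region, which is why a sequential read against the first layout looks random to the array.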

Options

There are many ways to avoid this fragmentation.

  • Provision a lun as zero thick on a dedicated datastore.  This provides the ability to retain thin provisioning, as no other VMDKs can insert blocks between each growth of this VMDK.  This counts against the 256 lun limitation.
  • Provision a lun as eager zero thick.  This disables thin provisioning, as the entire size of the lun is provisioned immediately.  However, many VMDKs can be placed on the same datastore, reducing the impact on the 256 lun limitation.
  • Provision a lun as raw device mapping.  This has the same effect as a dedicated datastore.  It preserves thin provisioning but counts against the 256 lun limitation.  It also allows for native storage array replication, but removes the ability of Site Recovery Manager (SRM) to be used for disaster recovery.
  • Thin luns should never be used with thin provisioning on the storage array as it would make capacity planning impossible.
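
The trade-off against the 256 lun limit in the options above comes down to simple arithmetic.  The following Python sketch shows one way to tally it; the function names and the example numbers are hypothetical, and the only figure taken from this post is the limit of 256 luns per ESX server.

```python
# Per-ESX-server limit cited in this post for vSphere 5.5 and earlier.
ESX_LUN_LIMIT = 256


def luns_consumed(dedicated_datastores, rdms, shared_datastores):
    """Every dedicated datastore, RDM and shared datastore is one lun on the ESX server."""
    return dedicated_datastores + rdms + shared_datastores


def luns_remaining(dedicated_datastores, rdms, shared_datastores):
    """Luns still available under the limit for a given layout."""
    return ESX_LUN_LIMIT - luns_consumed(dedicated_datastores, rdms, shared_datastores)


if __name__ == "__main__":
    # Hypothetical cluster: 10 databases, each with 4 zero thick datafile
    # VMDKs on dedicated datastores, plus 2 shared datastores that hold
    # all of the eager zero thick VMDKs.
    dedicated = 10 * 4
    print("luns used:", luns_consumed(dedicated, rdms=0, shared_datastores=2))
    print("luns left:", luns_remaining(dedicated, rdms=0, shared_datastores=2))
```

Swapping dedicated datastores for eager zero thick VMDKs on a shared datastore shrinks the first argument, at the cost of giving up thin provisioning for those luns.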

Recommendations

Keep in mind that development and test environments likely do not require the same level of performance as production, so many of these recommendations may not apply to those environments.

  1. Redo log luns should always be provisioned eager zero thick, with a minimum of 2 luns.
    • Very little extra storage used and redo logs require good performance.
    • Minimum of 2 luns for throughput – This allows more IO buffers to be used on both the operating system and the storage array.
  2. Datafile luns should be provisioned as either eager zero thick, or zero thick on a datastore that is dedicated to that virtual lun, also with a minimum of 2 luns, 4 when using ASM.
    • Dedicated luns are more storage friendly, since they utilize thin provisioning, but use significantly more luns in the ESX cluster (256 lun limitation).
    • Replicated luns for DR will require some level of dedicated datastores so each database can failover by itself.
  3. Archive luns should always be provisioned as zero thick.
    • Archive logs are created at once and therefore generate their own full provisioning.
    • Archive logs are created and backed up, assuming RMAN, in parallel anyway.
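
As a rough illustration, the lun counts and the redo log provisioning rule above can be turned into a small pre-deployment check.  This Python sketch is an assumption about how one might encode them; the function name, its parameters and the provisioning strings are invented for the example and are not part of any VMware or Oracle tool.

```python
def check_layout(redo_luns, redo_provisioning, datafile_luns, uses_asm):
    """Return a list of problems found in a proposed production layout.

    Rules encoded from the recommendations above:
      - redo log luns: eager zero thick, at least 2 luns
      - datafile luns: at least 2 luns, 4 when using ASM
    """
    problems = []

    if redo_provisioning != "eager zero thick":
        problems.append("redo log luns should be eager zero thick")
    if redo_luns < 2:
        problems.append("redo logs need at least 2 luns")

    minimum_datafile_luns = 4 if uses_asm else 2
    if datafile_luns < minimum_datafile_luns:
        problems.append(
            "datafiles need at least %d luns" % minimum_datafile_luns
        )

    return problems


if __name__ == "__main__":
    # Hypothetical layout that breaks all three rules.
    for problem in check_layout(1, "thin", 2, uses_asm=True):
        print("WARNING:", problem)
```

Development and test environments may not need these checks at all, so a real script would probably make them optional.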

Summary

One of the most important factors in good database performance is good storage performance.  Following these guidelines, and the recommendations from my previous posts, will help keep your virtual infrastructure from becoming a performance bottleneck.  Throughout this series I have pointed out recommendations that can be relaxed in non-production environments, to help your virtual infrastructure run more efficiently.  There is definitely a balancing act in play, but I always recommend weighing performance against efficiency to strike a proper balance while still delivering a great user experience for my customers.  If you are looking for more information, or a more holistic approach, I have recently updated our whitepaper called EMC IT’s Virtual Oracle Deployment Framework.

 

About the Author: Darryl Smith