shareVM – Share insights about using VMs

Simplify the use of virtualization in everyday life

Oracle, SQL Server, performance and manageability in a VM


I have been reading about SQL performance issues within VMs and have come to the realization that the performance of SQL queries on any database (Oracle, SQL Server, MySQL, etc.) running in a VM can never match the performance of the same DB running on the native host OS. I have seen this question come up repeatedly in Google searches over the past couple of years, particularly for SQL Server within VMWare, and I would like to share my insight into the root causes.

SQL Query Optimizer Support

The database’s cost-based optimizer creates and evaluates several query execution plans and chooses the one that meets the optimization goal (e.g., fastest response time for returning the first row, or least overall time to compute the entire result) at the lowest cost. The cost is based on several parameters that describe the query plan’s execution environment. For example, the query optimizer can examine the number of processors on the system, the processor type, the amount of physical memory, the amount of swap space available, the disk throughput in I/Os per second, CPU load factors, and so on. The query optimizers of the current-generation DBs (Oracle 10g, Oracle 11g, SQL Server 2005, etc.) are not designed to compute query plans for the VM environment: they see a single system image inside the VM and do not understand that the underlying hardware resources are actually shared by several VMs.
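To make this concrete, here is a minimal illustration for SQL Server 2005 of the hardware facts the engine believes it has (the DMV and its columns are as documented for that release); run inside a VM, it reports the virtual machine’s resources, not the shared physical host’s:

-- SQL Server 2005: what the instance believes about its hardware.
-- Inside a VM these values describe the guest, not the host.
SELECT cpu_count,                 -- logical CPUs visible to the instance
       hyperthread_ratio,         -- ratio of logical to physical cores
       physical_memory_in_bytes,  -- RAM as seen by the guest OS
       virtual_memory_in_bytes
FROM   sys.dm_os_sys_info;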

VMWare seems to have begun affording a view of the underlying host system for the query optimizer’s benefit, although at the time of writing I have not researched what Microsoft and Xen offer. It is important for you to understand the tuning knobs VMWare and Hyper-V offer for deploying Oracle and SQL Server. Xen is interesting to me because it is the underlying virtualization platform for Amazon’s EC2, which I am currently using. However, such a view is necessary but not sufficient, because it cannot support the possibility of running large queries in parallel (parallel query execution) over multiple, distributed databases.
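To see what the optimizer actually calibrates against, consider Oracle’s system statistics: the cost model is fed CPU speed and I/O throughput measured from inside whatever system image the instance sees. A minimal sketch, using Oracle’s documented DBMS_STATS package (gathered inside a VM, these numbers reflect the virtual resources):

-- Gather workload system statistics over a representative interval.
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');
-- ... run a representative workload for a while ...
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');

-- Inspect the values the optimizer will now plug into its cost model.
SELECT sname, pname, pval1 FROM sys.aux_stats$;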

Lower Transaction Throughput

Transaction throughput is lower in a VM because, quite simply, the pathlengths are longer. Let me state at the outset that I am willing to learn from experts who write such code for a living and are interested in providing clarity in depth; my goal here is to paraphrase and simplify to get the point across:

Imagine that the database has a relational storage manager that provides methods to open and close tables, and set-oriented APIs that iterate over a set of rows to read or update them individually, or to insert a new row. The storage manager encapsulates a file system driver for managing reads from and writes to its proprietary data store. Imagine further that the file system driver is a kernel-mode component that performs the low-level I/O on NTFS or a Linux file system, as the case may be. The nominal read/write call sequences and pathlengths are optimized so that the database performs well both for routine use and for competitive TPC benchmarks. I am illustrating a notional call stack below:

SQL application
|
DB query run time (user mode)
|
DB storage manager (user mode)
|
DB file system driver (kernel mode)
|
Host OS (kernel mode)
|
Disk driver

Now let’s see what happens when we run the DB within the VM.

SQL application
|
DB query run time (user mode)
|
DB storage manager (user mode)
|
DB file system driver (kernel mode)
|
Guest OS (kernel mode)  <= additional I/O path length
|
Hypervisor (kernel mode) <= additional path length
|
Host OS (kernel mode)
|
Disk driver

In the VM, the OS kernel-mode I/O path is traversed more than once. I am sure VMWare and Microsoft are furiously working on optimizations to reduce it from 2x to somewhere in the range of 1.6x–1.8x, including the overhead of the hypervisor code.
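You can observe the effect for yourself by comparing I/O stall times for the same workload running natively and inside a VM. A sketch against SQL Server 2005’s documented file-statistics DMF:

-- Average I/O stall per read and per write, by database file.
-- Compare these figures for the same workload, native vs. in a VM.
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.io_stall_read_ms  * 1.0 / NULLIF(vfs.num_of_reads,  0) AS avg_read_stall_ms,
       vfs.io_stall_write_ms * 1.0 / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM   sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs;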

Manageability overhead

The database manages its own proprietary storage structures, which differ substantially from the files supported by the native OS. For example, most databases allocate large extents (think: collections of disk blocks), which contain several pages (think: disk blocks), with rows within pages. The management complexity lies in the fact that extents can contain pages of different sizes (e.g., 64K, 32K, 4K, 512 bytes), a set of pages may be linked in a list, some pages may be data pages, some only index pages, and some others may contain both data and index rows. The rows in each page may be of fixed or variable length. Index pages split or combine based on fill factors, rows get forwarded when an update no longer fits within the current page, and there are shadow rows that get deleted asynchronously after a transaction commits. The point is that the database buries a lot of complexity in dealing with storage efficiently in its proprietary internal representation.
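As an illustration of how opaque this layout is to the OS: the host sees only one big file, while the state of the pages and extents inside it is visible only through engine-level views. A minimal sketch using SQL Server 2005’s documented index physical-stats DMF:

-- Fragmentation of the engine's proprietary page/extent layout.
SELECT OBJECT_NAME(ips.object_id)        AS table_name,
       ips.index_id,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
ORDER  BY ips.avg_fragmentation_in_percent DESC;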

All databases grow. The current best practice that I see on the Web is to pre-allocate the largest storage size you expect your DB to grow to. Think about what will happen if your DB is sharing that storage with another app and outgrows it faster than anticipated.
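On SQL Server, for example, pre-allocation means creating the data and log files at their planned sizes up front rather than relying on autogrow. The database name, paths, and sizes below are hypothetical placeholders; take the real numbers from your capacity plan:

-- Pre-allocate data and log files; names, paths, and sizes are illustrative.
CREATE DATABASE SalesDB
ON PRIMARY
    ( NAME = SalesDB_data, FILENAME = 'D:\data\SalesDB.mdf',
      SIZE = 50GB, FILEGROWTH = 5GB )
LOG ON
    ( NAME = SalesDB_log, FILENAME = 'E:\log\SalesDB.ldf',
      SIZE = 10GB, FILEGROWTH = 1GB );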

OS and third-party tools for de-fragmentation and compression are pretty much useless, or marginally useful at best. You must use the database’s own administration utilities, e.g., SQL Server’s stored procedures, for de-fragmenting the database and effecting space recovery. Remember that any I/O operations performed within the VM suffer the additional kernel I/O pathlengths of the guest and host OS together with that of the hypervisor, as discussed in the earlier section.
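A minimal SQL Server sketch of those engine-level utilities (the table and logical file names are hypothetical):

-- Defragment with the engine's own utilities, not OS defrag tools.
ALTER INDEX ALL ON dbo.Orders REORGANIZE;  -- compacts leaf-level index pages in place
DBCC SHRINKFILE (SalesDB_data, 40960);     -- return freed space to the OS (target size in MB)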

From the manageability perspective, none of the databases except Tandem’s can perform an online reorganization of data. SQL Server, and to the best of my knowledge Oracle too, requires you to take the database offline and unload and reload the data in order to remove fragmentation. This is a real bummer.

Epilogue

Ultimately it is all about planning. Do your capacity planning and then validate it in the lab, at some scale, with the database residing in a VM. Read the manuals and ensure that you truly understand, and can justify, the performance and manageability trade-offs of moving the DB to a VM.

Written by paule1s

December 7, 2008 at 10:26 pm
