An article on database virtualization caught my eye early this morning, and I wanted to find out what it is all about.
The business driver for database virtualization is the globalized economy, where business transactions happen 24x7x365 and business-critical data must be available, within a corporation's network boundary or through the Internet, even during application downtime and IT maintenance windows.
Data virtualization is defined here as
to view data from disparate sources without knowing or caring where the data actually resides.
Data virtualization obviously leads to database virtualization, which is defined here as
the use of multiple instances of a DBMS, or different DBMS platforms, simultaneously and in a transparent fashion regardless of their physical location.
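A minimal sketch of that definition, using two in-memory SQLite databases as stand-ins for physically separate DBMS instances (the `VirtualDatabase` class and its routing scheme are hypothetical, not any vendor's API):

```python
import sqlite3

class VirtualDatabase:
    """Routes each query to whichever backend holds the table,
    so callers need not know where the data physically resides."""

    def __init__(self):
        self.backends = {}  # table name -> connection

    def register(self, table, conn):
        self.backends[table] = conn

    def query(self, sql, table):
        # Transparent dispatch: the caller sees one logical database.
        return self.backends[table].execute(sql).fetchall()

# Two separate DBMS instances, invisible to the consuming application.
east = sqlite3.connect(":memory:")
east.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
east.execute("INSERT INTO orders VALUES (1, 99.5)")

west = sqlite3.connect(":memory:")
west.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
west.execute("INSERT INTO customers VALUES (1, 'Acme')")

vdb = VirtualDatabase()
vdb.register("orders", east)
vdb.register("customers", west)

rows = vdb.query("SELECT amount FROM orders", table="orders")
```

The consuming application issues queries against one logical interface; which physical instance answers is an internal detail.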
James Kobielus, a Senior Analyst with Forrester Research, is predicting that real-time information needs will drive database virtualization:
the database as we know it is disappearing into a virtualization fabric of its own. In this emerging paradigm, data will not physically reside anywhere in particular. Instead, it will be transparently persisted, in a growing range of physical and logical formats, to an abstract, seamless grid of interconnected memory and disk resources; and delivered with subsecond delay to consuming applications.
He is making an interesting case that
Real-time is the most exciting new frontier in business intelligence, and virtualization will facilitate low-latency analytics more powerfully than traditional approaches. Database virtualization will enable real-time business intelligence through a policy-driven, latency-agile, distributed-caching memory grid that permeates an infrastructure at all levels.
As this new approach takes hold, it will provide a convergence architecture for diverse approaches to real-time business intelligence, such as trickle-feed extract transform load (ETL), changed-data capture (CDC), event-stream processing and data federation. Traditionally deployed as stovepipe infrastructures, these approaches will become alternative integration patterns in a virtualized information fabric for real-time business intelligence.
The convergence of real-time business-intelligence approaches onto a unified, in-memory, distributed-caching infrastructure may take more than a decade to come to fruition because of the immaturity of the technology; lack of multivendor standards; and spotty, fragmented implementation of its enabling technologies among today’s business-intelligence and data-warehouse vendors. However, all signs point to its inevitability.
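Of the integration patterns Kobielus names, changed-data capture (CDC) is perhaps the easiest to picture: instead of periodically re-extracting whole tables, a consumer replays a stream of row-level change events into a low-latency in-memory copy. A toy sketch (the event format here is invented for illustration):

```python
# Hypothetical change stream emitted by a source database: each event
# describes one row-level operation rather than a full table extract.
source_changes = [
    {"op": "insert", "key": "sku-1", "value": {"price": 10}},
    {"op": "update", "key": "sku-1", "value": {"price": 12}},
    {"op": "delete", "key": "sku-1", "value": None},
    {"op": "insert", "key": "sku-2", "value": {"price": 7}},
]

def apply_changes(cache, changes):
    """Replay a CDC stream into an in-memory cache, in order."""
    for ev in changes:
        if ev["op"] == "delete":
            cache.pop(ev["key"], None)
        else:  # insert and update are both upserts here
            cache[ev["key"]] = ev["value"]
    return cache

cache = apply_changes({}, source_changes)
```

After replay the cache holds only `sku-2`, mirroring the source without ever scanning it in bulk.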
Oracle acquired Tangosol in May 2007 and now possesses a well-developed in-memory distributed-caching technology called Coherence.
Microsoft announced Project Velocity a year later in June 2008:
a distributed cache that allows any type of data (CLR object, XML document, or binary data) to be cached. “Velocity” fuses large numbers of cache nodes in a cluster into a single unified cache and provides transparent access to cache items from any client connected to the cluster.
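The "single unified cache" idea in that description boils down to partitioning: keys are hashed to nodes, and clients never pick a node themselves. A minimal sketch, assuming a simple hash-modulo placement scheme (real products like Coherence and Velocity use more sophisticated partitioning and replication):

```python
import hashlib

class UnifiedCache:
    """Many cache nodes, one logical cache: each key is hashed to a
    node, and clients access items without knowing which node holds them."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # Deterministic placement: same key always maps to the same node.
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

cache = UnifiedCache(node_count=4)
cache.put("user:42", {"name": "Ada"})
value = cache.get("user:42")  # routed transparently to the right node
```

The client-facing API is just `put`/`get`; the fusing of nodes into one cache happens entirely behind it.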
xkoto was selling GRIDSCALE as a database load balancer in 2006. Capitalizing smartly on virtualization being a hot segment, it has since repositioned GRIDSCALE as a database virtualization product. This repositioning is valid not only in the context of the definitions cited above but, more importantly, because it has been endorsed by noted industry analysts:
- Robin Bloor, a database industry pundit and blogger, who describes its database virtualization capabilities
- Dan Kusnetzky, another influential industry analyst, author, and blogger, whose review appears here
ScaleOut Software also has a distributed-cache offering.