High-Performance Data Services with Smart Caching
Reduce potential performance bottlenecks and ensure timely delivery of information

By: Avtandil Garakanidze
Sep. 2, 2009 03:30 PM

One of the main concerns among IT architects planning an implementation of an enterprise data virtualization layer in their service-oriented architecture (SOA) or overall information system is the performance of the participating data services. Performance becomes particularly important in real- or near-real-time environments as well as in environments with highly distributed data sources where network latency cannot be controlled. This article examines how to reduce potential performance bottlenecks by utilizing high-performance caching with data virtualization middleware. Different scenarios within single-, cluster- and distributed-caching implementations are covered.

Introduction
A data virtualization implementation normally includes a wide variety of data sources, both relational and non-relational, often distributed across several business units and sometimes located in different geographical regions. The data virtualization layer's performance therefore depends heavily on response latency. As the amount of data retrieved through the data virtualization layer increases, network latency can quickly become a bottleneck for the entire deployment. If the implementation serves client applications distributed across multiple business units or geographies, response latency has to be one of the major SLA items for the data virtualization layer.

In its simplest implementation, the data virtualization layer's response latency comprises three factors: the data sources, the middleware layer, and the network. When the data sources have non-uniform performance characteristics and sit on networks with varying throughput capacity, the total request/response latency can be expressed as follows (see Figure 1):

Total response latency = MAX(DSL1, DSL2, ..., DSLN) + MAX(DSS1, DSS2, ..., DSSN) + MAX(DSC1, DSC2, ..., DSCN) + DVL, where

DSL - Data Source Latency
DSS - Data Source Network Subnet Latency
DSC - Client Network Subnet Latency
DVL - Data Virtualization Latency
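
As a rough illustration of the formula, the following Python sketch computes the total response latency before and after the slowest data source is served from a cache. All figures are hypothetical millisecond values chosen only to show the arithmetic; they are not measurements from any deployment.

```python
def total_response_latency(dsl, dss, dsc, dvl):
    """Apply the formula above: the slowest data source, the slowest source
    subnet and the slowest client subnet each gate the response."""
    return max(dsl) + max(dss) + max(dsc) + dvl


# Hypothetical latencies in milliseconds (assumptions for illustration only).
client_subnet_latency = [1, 4]   # DSC
virtualization_latency = 15      # DVL

# The third data source is much slower than the others.
baseline = total_response_latency(
    dsl=[40, 55, 900],
    dss=[2, 3, 25],
    dsc=client_subnet_latency,
    dvl=virtualization_latency,
)

# Same deployment, with the slow source answered by a nearby cache instead.
cached = total_response_latency(
    dsl=[40, 55, 30],
    dss=[2, 3, 2],
    dsc=client_subnet_latency,
    dvl=virtualization_latency,
)

print(baseline, cached)  # 944 77
```

Because each term is a maximum, only the slowest member of each group matters, which is why the discussion that follows concentrates on the slowest data source.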

In deployments where the data virtualization middleware, all data sources and all clients are located on the same subnet, network latency for the data sources and clients will be similar and more or less constant. Therefore, to reduce the response latency of the entire solution, architects should concentrate on reducing the latency of the slowest data source, which minimizes the time the data virtualization middleware idles while waiting for a response. While changing the data source or partitioning it are both valid options, a less invasive approach is to use a high-performance caching system in the data virtualization layer. Implementation choices in the data virtualization layer range from simple table-level caching, through more advanced materialized-view caching, to the most complex option, dynamic result-set caching. Using these options, IT architecture teams can improve the performance of their data virtualization deployments by 10 percent to 50 percent (see Figure 2).

The effectiveness of caching in the data virtualization layer depends on a number of additional solution characteristics, including the frequency of changes in the underlying data and the applications' tolerance for "stale" data (i.e., how often the cache system has to refresh itself to comply with the required SLA). It is easy to imagine a case in which caching, if not implemented properly, actually slows down the overall solution. This can happen if the underlying source data changes frequently and the client application requires access to real-time data. In this case, most application requests will result in a cache miss and will therefore initiate either a pass-through request or a cache refresh. The additional time it takes the request to travel to the cache system and back adds to the overall latency. In this scenario, implementing an incremental cache update based on change data capture eliminates the need for a full data refresh, yet still provides freshly updated data to the requesting application while maintaining SLA requirements (see Figure 3). The architecture team therefore needs to consider a number of critical characteristics before deciding whether caching is suitable for its environment and business requirements.
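
The point at which a cache stops helping can be estimated with a simple model. The Python sketch below assumes, as a simplification not stated in the article, that every request first pays a fixed cache lookup cost and that a miss then pays the full source latency as a pass-through. The function names and the millisecond figures are hypothetical.

```python
def expected_latency(hit_ratio, cache_lookup, source_latency):
    """Average latency per request: every request pays the cache lookup,
    and the share of requests that miss also pays the source latency."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return cache_lookup + (1.0 - hit_ratio) * source_latency


def break_even_hit_ratio(cache_lookup, source_latency):
    """Hit ratio above which the cache beats going straight to the source.

    Derived from cache_lookup + (1 - h) * source_latency < source_latency.
    """
    return cache_lookup / source_latency


# Hypothetical figures: a 20 ms cache lookup in front of a 200 ms source.
for hit_ratio in (0.0, 0.5, 1.0):
    print(hit_ratio, expected_latency(hit_ratio, 20, 200))

print(break_even_hit_ratio(20, 200))
```

Under this model, a cache in front of rapidly changing data with a hit ratio below the break-even value makes the average request slower than it would be with no cache at all, which is the failure case described above.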

While performance improvement is typically the most sought-after benefit of the caching system in a data virtualization implementation, an often overlooked but equally important advantage is the reduced impact, or stress, on the production systems. With the caching system enabled, many, if not most, client requests will be fulfilled from cached data, thus reducing the number of requests that reach the production data sources. With high request volumes, this benefit supplements the performance gain provided by the caching system.

Single Cache Instance Implementations
A single cache instance is the most basic deployment in the data virtualization layer. Single cache instances are typically preferred for small- to medium-sized departmental projects with low-to-moderate client load. As mentioned earlier, caching systems can improve data virtualization layer performance and reduce stress on the production data sources, depending on the implementation characteristics. If performance improvement is the primary objective, the implementation team should consider collocating the cache on the same subnet as the data virtualization middleware to minimize the network latency between the middleware and the cache. This is an important consideration because the caching system typically bears the brunt of the request load, so traffic between the middleware and the caching system is expected to increase significantly. In situations where the cached data is relatively small but is accessed frequently, it may be beneficial to collocate the caching system with the middleware on the same blade server, thus eliminating network latency altogether. To further improve the performance of a cache collocated on the same blade, the cache database may be configured to pin the cache table into memory, further reducing the time needed to fetch data from the cache (see Figure 4).

Depending on the topology and nature of the underlying data sources, the caching system may be configured to cache raw table data, materialized views or procedural data. Caching raw table data is suitable for environments where a single data source performs significantly worse than the rest, causing the data virtualization middleware to idle while waiting for a response. Caching the table data from slow data sources in the higher-performance caching system improves the performance of the overall solution by removing the additional latency associated with the idling middleware (see Figure 5). Materialized-view caching is most suitable when numerous clients send identical requests, thereby clogging production systems with requests that elicit identical responses. In such scenarios, the data virtualization middleware executes the first client request as usual against the production systems and then caches the returned result set instead of discarding it, so that subsequent client requests are fulfilled by the cache system instead of the production systems. Finally, if one or more of the underlying data sources is a web service with long or unpredictable response latency, enabling procedural caching allows the data virtualization middleware to cache the result sets returned by the web service, keyed by the parameters passed, and thus avoid the web-service latency on repeated calls.
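
Procedural caching, as described above, can be pictured as a lookup keyed by the procedure name and the parameters passed. The Python sketch below is a minimal illustration under stated assumptions: the `ProcedureCache` class, its time-to-live policy and the `slow_quote_service` stand-in are all hypothetical and are not taken from any particular data virtualization product.

```python
import time


class ProcedureCache:
    """Caches result sets per (procedure name, parameters) with a time-to-live.

    Parameters must be hashable, because they form part of the cache key.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, result set)

    def call(self, name, fetch, *params):
        key = (name, params)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # hit: the source is not contacted
        result = fetch(*params)      # miss or expired entry: go to the source
        self._entries[key] = (now + self._ttl, result)
        return result


calls = []


def slow_quote_service(symbol):
    """Hypothetical stand-in for a web service source with long latency."""
    calls.append(symbol)
    return {"symbol": symbol, "price": 100}


cache = ProcedureCache(ttl_seconds=60)
cache.call("quote", slow_quote_service, "ACME")
cache.call("quote", slow_quote_service, "ACME")  # served from the cache
cache.call("quote", slow_quote_service, "INIT")  # different parameter: new fetch
print(calls)  # ['ACME', 'INIT']
```

The time-to-live is one simple way to bound staleness; the right value depends on how much stale data the consuming applications tolerate.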

Cluster Cache Implementation
For more complex deployments, such as environments with heavy client request loads, a single instance of middleware and a single cache instance might be insufficient to handle all the requests within the allotted SLA. In such cases, the most common approach is to cluster the data virtualization middleware into multiple nodes. Although middleware clustering adds capacity to handle additional client requests, it also increases the load on the production data sources, because every client request, even one identical to an earlier request, is executed against the production data sources. Enabling a caching system in a clustered environment can therefore have a significant impact both on the solution's performance and on offloading stress from the production systems (see Figure 6).
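
The effect of a shared cache on a cluster can be sketched by counting how many queries reach the production source. In the Python example below, an in-process dictionary stands in for the shared caching system, requests are distributed round-robin, and cache invalidation is ignored; the class and function names are hypothetical.

```python
class SourceStub:
    """Hypothetical production source that counts the queries it receives."""

    def __init__(self):
        self.queries = 0

    def run(self, query):
        self.queries += 1
        return f"rows for {query}"


class MiddlewareNode:
    """One node of the middleware cluster, optionally backed by a shared cache."""

    def __init__(self, source, shared_cache=None):
        self._source = source
        self._cache = shared_cache  # the same dict on every node, or None

    def handle(self, query):
        if self._cache is None:
            return self._source.run(query)
        if query not in self._cache:
            self._cache[query] = self._source.run(query)
        return self._cache[query]


def source_queries(node_count, requests, use_cache):
    """Return how many queries reached the source for the given workload."""
    source = SourceStub()
    cache = {} if use_cache else None
    nodes = [MiddlewareNode(source, cache) for _ in range(node_count)]
    for i, query in enumerate(requests):
        nodes[i % node_count].handle(query)  # round-robin load balancing
    return source.queries


requests = ["SELECT * FROM orders"] * 9 + ["SELECT * FROM customers"] * 3
without_cache = source_queries(3, requests, use_cache=False)
with_cache = source_queries(3, requests, use_cache=True)
print(without_cache, with_cache)  # 12 2
```

With the shared cache, the source sees one query per distinct request regardless of how many nodes are in the cluster.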

Distributed Cache Implementation
Finally, in environments where one or more clients are located remotely, a distributed caching system helps reduce the network latency associated with frequent requests over long-distance networks. Such a distributed caching system typically has a central cache repository and multiple remote edge caches for servicing requests from the remote clients. There is usually no need for the edge caches to replicate the central cache system one to one: because the edge cache system monitors remote client requests, it can simply replicate the portion of the central cache that is relevant to its client requests. After initial replication, edge caches register change data capture requests with the central cache and are notified automatically whenever the central cache data changes, thus eliminating the need for a complete edge cache re-sync (see Figure 7).
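
The edge cache behavior described above can be sketched as a subscription model. The Python example below is an assumption-laden illustration: `CentralCache`, `EdgeCache` and the callback-based notification are hypothetical names and mechanics, both caches live in one process, and deletions are not handled.

```python
class CentralCache:
    """Central repository that notifies subscribers when a key changes."""

    def __init__(self):
        self._rows = {}
        self._subscribers = {}  # key -> list of callbacks

    def get(self, key):
        return self._rows.get(key)

    def subscribe(self, key, callback):
        self._subscribers.setdefault(key, []).append(callback)

    def put(self, key, value):
        self._rows[key] = value
        for notify in self._subscribers.get(key, []):
            notify(key, value)


class EdgeCache:
    """Remote cache that replicates only the keys its clients ask for."""

    def __init__(self, central):
        self._central = central
        self._rows = {}

    def get(self, key):
        if key not in self._rows:
            # First request for this key: copy it and register for changes.
            self._rows[key] = self._central.get(key)
            self._central.subscribe(key, self._on_change)
        return self._rows[key]

    def _on_change(self, key, value):
        self._rows[key] = value  # incremental update, no full re-sync

    def replicated_keys(self):
        return set(self._rows)


central = CentralCache()
central.put("eu_orders", 1)
central.put("us_orders", 2)

edge = EdgeCache(central)
first = edge.get("eu_orders")    # replicated on first request
central.put("eu_orders", 10)     # the change is pushed to the edge
central.put("us_orders", 20)     # not requested at this edge, so not replicated
second = edge.get("eu_orders")
print(first, second, edge.replicated_keys())  # 1 10 {'eu_orders'}
```

In a real deployment the notification would travel over the network rather than through a direct callback, but the division of work is the same: the edge holds a partial replica and receives only the changes that affect it.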

Conclusion
As global enterprises and government agencies implement data virtualization to federate data across disparate systems and geographic locations, IT teams are considering the data virtualization layer's performance in relation to the overall information system. By using advanced data virtualization middleware with high-performance caching, IT architects can reduce potential performance bottlenecks and thus ensure timely delivery of information.

Published Sep. 2, 2009— Reads 1,647
Copyright © 2009 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
About Avtandil Garakanidze
Avtandil Garakanidze currently serves as the Vice President of Product Management and Strategy at Composite Software Inc, the leader in data virtualization solutions. Prior to Composite, he held executive and senior product and engineering management positions with high-tech companies including Symantec/VERITAS, Siebel Systems, Yahoo! and Starfish Software. Garakanidze earned an MBA from MIT’s Sloan School of Management and an MS from the Georgian Technical University.
