By Srinivasan Sundara Rajan
February 14, 2011 06:00 PM EST
Data Warehousing As A Cloud Candidate
Over the past year, major vendors have shown growing support for the Cloud, and the Cloud is here to stay. More importantly, the path for enterprises to adopt it is now clearly drawn. With this in mind, it is time to identify which existing data center applications are good candidates for migration to the Cloud.
Most major IT vendors predict that hybrid delivery will be the future: enterprises will need a delivery model in which some workloads run on Clouds, others remain in data centers, and the two are integrated.
Before going further into a blueprint of how data warehouses fit within a hybrid Cloud environment, we will look at the salient features of data warehouses and how the tenets of the Cloud make them a very viable workload to move to the Cloud.
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
Data Warehousing Usage / Cloud Tenet Value Proposition

Usage: The ETL (Extract, Clean, Transform, Load) process is subject to variable load patterns. Large files typically arrive over the weekend or at night to be processed and loaded.
Value proposition: It is better to use compute resources on demand, as the ETL requires them, than to maintain a fixed capacity.

Usage: OLAP (Online Analytical Processing) and the related processing for MOLAP (Multidimensional OLAP) and/or ROLAP (Relational OLAP) are highly compute intensive and require strong processing capacity.
Value proposition: High performance computing and the ability to scale up on demand are Cloud tenets that align closely with this need.

Usage: Physical architecture needs are complex in a data warehousing environment.
Value proposition: Most IaaS and PaaS offerings, such as the Azure platform and Amazon EC2, have built-in provisions for a highly available architecture, with most of the day-to-day administration abstracted from the enterprise. SQL Azure is one platform that offers such advantages.

Usage: Multiple software and platform needs. The product stack of a data warehousing environment is huge, and most organizations find it difficult to arrive at an ideal list of software, platforms and tools for their BI platform.
Value proposition: SaaS for applications like data cleansing or address validation, and PaaS for reporting like Microsoft SQL Azure Reporting, are ideal for solving the tools and platform maze.
The following are the ideal steps for migrating an on-premise data warehouse system to a cloud platform. For the sake of the case study, the Microsoft Windows Azure platform is chosen as the target platform.
1. Create Initial Database / Allocate Storage / Migrate Data
The star schema design of the existing data warehousing system can be migrated to the Cloud platform as it is, and migrating to a relational database platform like SQL Azure should be straightforward. To migrate the data, the storage allocated to the existing database in the data center needs to be calculated, and the same amount of storage allocated on the Cloud.
You can store any amount of data, from kilobytes to terabytes, in SQL Azure. However, individual databases are limited to 10 GB in size. To create solutions that store more than 10 GB of data, you must partition large data sets across multiple databases and use parallel queries to access the data.
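The partitioning described above can be sketched in Python. This is a minimal illustration, not a SQL Azure API: in-memory SQLite databases stand in for the individual SQL Azure databases, and the `sales` table, the hash-based placement rule and all function names are assumptions made for the example.

```python
import math
import sqlite3
import zlib
from concurrent.futures import ThreadPoolExecutor

MAX_DB_GB = 10  # per-database size limit quoted in the text


def databases_needed(total_gb, max_db_gb=MAX_DB_GB):
    """Smallest number of databases that can hold total_gb of data."""
    return max(1, math.ceil(total_gb / max_db_gb))


def shard_for(key, shard_count):
    """Map a partitioning key to a database index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def make_shards(shard_count):
    """Create in-memory stand-ins for the partitioned databases."""
    shards = []
    for _ in range(shard_count):
        # check_same_thread=False lets a worker thread query the connection.
        conn = sqlite3.connect(":memory:", check_same_thread=False)
        conn.execute("CREATE TABLE sales (customer_id TEXT, amount REAL)")
        shards.append(conn)
    return shards


def insert_sale(shards, customer_id, amount):
    """Route one row to the database chosen by its partitioning key."""
    conn = shards[shard_for(customer_id, len(shards))]
    conn.execute("INSERT INTO sales VALUES (?, ?)", (customer_id, amount))


def total_sales(shards):
    """Run the same aggregate on every database in parallel and combine."""
    def one(conn):
        query = "SELECT COALESCE(SUM(amount), 0) FROM sales"
        return conn.execute(query).fetchone()[0]

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(one, shards))
```

For example, `databases_needed(35)` gives 4 databases for a 35 GB warehouse. Hashing on a single key keeps each customer's rows in one database, at the cost of fan-out queries for anything that spans customers.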
Once a highly scalable database infrastructure is set up on the SQL Azure platform, the following are some of the methods by which data from the existing on-premise data warehouse can be moved to SQL Azure.
Traditional BCP Tool : BCP is a command line utility that ships with Microsoft SQL Server. It bulk copies data between SQL Azure (or SQL Server) and a data file in a user-specified format. The bcp utility that ships with SQL Server 2008 R2 is fully supported by SQL Azure. You can use BCP to back up and restore your data on SQL Azure, to import large numbers of new rows into SQL Azure tables, or to export data out of tables into data files.
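As a rough illustration of the BCP route, the Python helper below only assembles a bcp command line; it does not run it. The flags used (`in`/`out`, `-S`, `-U`, `-P`, `-n`, `-b`) are standard bcp options, while the helper name and the server, login, table and file names in the usage note are placeholders assumed for the example.

```python
def bcp_command(table, direction, data_file, server, user, password,
                batch_size=None):
    """Build the argument list for one bcp bulk copy (not executed here).

    table      -- fully qualified name, e.g. database.schema.table
    direction  -- "in" to load a data file, "out" to export a table
    """
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    cmd = [
        "bcp", table, direction, data_file,
        "-S", server,    # target server
        "-U", user,      # login name
        "-P", password,  # password
        "-n",            # native data format
    ]
    if batch_size is not None and direction == "in":
        cmd += ["-b", str(batch_size)]  # rows committed per batch on import
    return cmd
```

A call such as `bcp_command("dw.dbo.fact_sales", "in", "fact_sales.dat", "myserver.database.windows.net", "loader@myserver", "secret", batch_size=1000)` yields a list that could be passed to `subprocess.run`. Building the list separately keeps credentials out of shell string interpolation.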
The following tools are also useful if your existing data warehouse runs on SQL Server within the data center.
You can transfer data to SQL Azure by using SQL Server 2008 Integration Services (SSIS). SQL Server 2008 R2 or later supports the Import and Export Data Wizard and bulk copy for the transfer of data between an instance of Microsoft SQL Server and SQL Azure.
SQL Server Migration Assistant (SSMA for Access v4.2) supports migrating your schema and data from Microsoft Access to SQL Azure.
2. Set Up ETL & Integration With Existing On Premise Data Sources
After the initial load of the data warehouse on the Cloud, it needs to be continuously refreshed with operational data. This process extracts data from different data sources (such as flat files, legacy databases, RDBMS, ERP, CRM and SCM application packages).
It also carries out the necessary transformations, such as joining tables, sorting, and applying various filters.
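The extract, transform and load stages described above can be sketched in a few lines of Python. This is an illustrative outline only: the CSV layouts, the `fact_orders` table and the function names are assumptions, and an SQLite connection stands in for the cloud-hosted warehouse.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Extract: parse a flat-file export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(orders, customers, min_amount=0.0):
    """Transform: join orders to customers, filter, then sort."""
    region_by_customer = {c["customer_id"]: c["region"] for c in customers}
    rows = []
    for order in orders:
        amount = float(order["amount"])
        region = region_by_customer.get(order["customer_id"])
        # Filter out rows with no matching customer or a small amount.
        if region is None or amount < min_amount:
            continue
        rows.append((order["order_id"], region, amount))
    rows.sort(key=lambda r: (r[1], r[0]))  # sort by region, then order id
    return rows


def load(conn, rows):
    """Load: upsert rows into the warehouse table and return its row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id TEXT PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
```

Using an upsert in the load step makes a repeated refresh of the same file harmless, which matters when the refresh runs on a schedule.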
The following are typical options available on the SQL Azure platform for building an ETL platform between on-premise sources and the data warehouse hosted on the cloud. The tools mentioned above for the initial data load also hold good for ETL; they are not repeated here to avoid duplication.
SQL Azure Data Sync :
- Cloud to cloud synchronization
- Enterprise (on-premise) to cloud
- Cloud to on-premise.
- Bi-directional or sync-to-hub or sync-from-hub synchronization
The following diagram, courtesy of the vendor, gives an overview of how SQL Azure Data Sync can be used for ETL purposes.
Windows Azure AppFabric Integration : This service provides common BizTalk Server integration capabilities (e.g. pipeline, transforms, adapters) on Windows Azure, using out-of-box integration patterns to accelerate and simplify development. It also delivers higher level business user enablement capabilities such as Business Activity Monitoring and Rules, as well as a self-service trading partner community portal and provisioning of business-to-business pipelines. The following diagram, courtesy of the vendor, shows how Windows Azure AppFabric Integration can be used as an ETL platform.
3. Create CUBES & Other Analytics Structures
The multidimensional nature of OLAP requires an analytical engine to process the underlying data and create a multidimensional view. The success of OLAP has resulted in a large number of vendors offering OLAP servers built on different architectures.
MOLAP : A proprietary multidimensional database with a focus on performance.
ROLAP : Relational OLAP is a technology that provides sophisticated multidimensional analysis that is performed on open relational databases. ROLAP can scale to large data sets in the terabyte range.
HOLAP : Hybrid OLAP is an attempt to combine some of the features of MOLAP and ROLAP technology.
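To make the ROLAP idea concrete, the following Python sketch runs a multidimensional roll-up as a plain relational query over a small star schema. SQLite is used as a stand-in relational engine, and the table names, columns and sample rows are all assumptions for illustration.

```python
import sqlite3


def build_star(conn):
    """Create a tiny star schema: one fact table and two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE fact_sales (date_id INTEGER, product_id INTEGER, amount REAL);
    """)
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2010, "Q4"), (2, 2011, "Q1")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Hardware"), (2, "Software")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0), (2, 1, 75.0)])


def sales_by_year_and_category(conn):
    """Aggregate the fact table along two dimensions with joins and GROUP BY."""
    return conn.execute("""
        SELECT d.year, p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY d.year, p.category
        ORDER BY d.year, p.category
    """).fetchall()
```

The same query shape carries over to any relational engine, which is why an existing star schema is a natural starting point for relational analysis once it has been moved to a cloud database.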
SQL Azure Database does not support all of the features and data types found in SQL Server. Analysis Services, Replication, and Service Broker are not currently provided as services on the Windows Azure platform.
At this time there is no direct support for OLAP and cube processing on SQL Azure. However, manual aggregation of the data can be achieved by exploiting the HPC (High Performance Computing) capabilities of the platform through multiple worker roles.
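One way to picture this manual aggregation is a split-and-merge pattern: each worker pre-aggregates its own partition of the fact data, and the partial results are merged into cube-like totals. The Python sketch below uses a thread pool as a stand-in for worker roles; it does not use any Windows Azure API, and the row layout `(region, year, amount)` and function names are assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def aggregate_partition(rows):
    """Pre-aggregate one partition of (region, year, amount) rows.

    Each row contributes to its own cell and to the roll-up cells,
    where "ALL" marks a dimension that has been summed away.
    """
    partial = Counter()
    for region, year, amount in rows:
        partial[(region, year)] += amount
        partial[(region, "ALL")] += amount
        partial[("ALL", year)] += amount
        partial[("ALL", "ALL")] += amount
    return partial


def build_cube(partitions, workers=4):
    """Aggregate partitions in parallel, then merge the partial totals."""
    cube = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(aggregate_partition, partitions):
            cube.update(partial)  # Counter.update adds values per key
    return dict(cube)
```

Because sums are associative, the merge step gives the same totals regardless of how the rows were split across workers, so the number of workers can follow the data volume.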
4. Generate Reports
Reporting consists of analyzing the data stored in the data warehouse across multiple dimensions and generating both standard business intelligence reports and ad-hoc reports. These reports present data in graphical or tabular form and also provide statistical analysis features. They should be rendered as Excel, PDF and other formats.
It is better to utilize the SaaS based or PaaS based reporting infrastructure rather than custom coding all the reports.
SQL Azure Reporting enables developers to enhance their applications by embedding cloud based reports on information stored in a SQL Azure database. Developers can author reports using familiar SQL Server Reporting Services tools and then use these reports in their applications which may be on-premises or in the cloud.
At present, SQL Azure Reporting can connect only to SQL Azure databases.
The above steps provide a path for migrating on-premise data warehousing applications to the Cloud. Because such a migration needs substantial vendor support across IaaS, PaaS and SaaS, the Microsoft Azure platform was chosen for the case study. With several of these features integrated, the Microsoft cloud platform is positioned to be one of the leading platforms for BI on the Cloud.
The following diagram shows a blueprint of a typical Cloud BI organization on the Microsoft Azure platform.