

Oh yeah? Well my database is SMALLER than your database!

By: Jerome Pineau
Aug. 17, 2009 04:27 PM

Contrary to popular edict, smaller is not always better, unless of course you’re talking about analytical database engines. In that respect, it’s hard to find an ADBMS that can fit on removable media like a CD or a USB stick. For example, I don’t think SQL Server, Oracle, DB2, Greenplum, Aster, ParAccel, or the myriad other ADBMS vendors can fit all their bits in a tight spot. Even in the open source realm, I doubt you can wedge Infobright (MySQL) or IceBreaker (Ingres) onto a stick, much less schlep their bits around as an email attachment.

One exception to this is the V-stick from Vertica. When I first read about this, I initially thought it was a hoax but apparently not. It’s pretty cool too because it includes the O/S, web server, GUI and the engine all together on a 16GB thumb drive. How an engine like Vertica, designed around distributed MPP, can possibly operate representatively (using terabyte-size data) on a thumb drive is beyond me, and I’ve never heard of anyone actually using this gizmo but I’d sure love to get my hands on one and review it if it’s still available.

The other exception of course is RDM/x, the XSPRADA database engine. The reason is simple: its total deployment footprint is around 10MB. That includes the 32/64-bit ODBC drivers and a couple of DLLs. The engine itself is currently around 6MB. Last I looked the installer clocked in at 16,760KB. This means you can actually deploy RDM/x onto a memory stick if you want to. I tried it and it works. It’s pretty cool. But after a while I wondered, why would anyone care about this?

The reason is two-fold. First, it’s really easy to try out software that is small and self-contained without expending large amounts of time and resources. Yes, you can download RDM/x from our website, but in many cases (like secured, firewalled enterprises) that’s not an option.

Second, it means we’re a good candidate for embedded applications. If I can fit my database engine on a stick (or in an email), I can probably embed it in instruments and devices as well, either as raw C++ code or as libraries.

But for quick POCs, size and simplicity really do matter. Say you’re suddenly tasked with evaluating options for deploying a BI solution inside your company. Suppose you’re a Microsoft shop. Suppose additional capex is not an option, and suppose further you have a week to show results (namely a set of nicely formatted reports, pivot tables or dashboards). Now what? If you have significant in-house experience with SQL Server and associated SSAS, SSIS, SSRS, and Excel (and assuming you have a clear and deep understanding of the business scope and goals to begin with) you’re probably going to:

(1) Figure out where your source data is coming from (connection strategies)

(2) Model your DW (figure out grain on facts, dimensions etc, need to figure out BIDS and SSAS)

(3) Establish some preliminary ETL process (including incremental loads, need to figure out SSIS)

(4) Load your warehouse (if you screw it up, you need to drop it and do it over)

(5) Set up an SSAS cube structure (figure out SSAS via SSMS or BIDS then publish the thing)

(6) Figure out what queries to generate (talk to DW DBA or learn MDX)

(7) Figure out what BI tool to use (Excel or browser, depends on policies and audience)

(8) Generate the reports (canned or ad-hoc)/dashboards/pivot tables for the POC

Now, if you have no prior experience with the Microsoft BI toolset, and you can whip this little project up in a week, guess what, you need to quit your job and start a consulting company because clearly, as a NYC recruiter once told me, “you’re so money.” But if you’re a normal person with little prior BI experience (and the terms ROLAP, MOLAP, SCD and MDX don’t ring a bell), you’re in a bind.

So another thing you can do is download a tiny analytical database (say, the XSPRADA RDM/x engine, for example) and throw, say, 100GB of data at it (this is just a small POC remember?), then plop Excel on top of it and generate some really cool reports or pivot tables to show the boss (in under a week) it can be done. How hard is that to do? This hard:

Figure out where your source data is coming from.

Yup, that one is pretty universal in the BI world. The difference here is that all your data sources will export as CSV to feed the XSPRADA engine. So at least that’s consistent across all sources (be they structured, semi-structured or not). CSV is the lingua franca of data formats, so your connection "strategy" is this: get everything out as CSV. Plain and simple.
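To make the "get everything out as CSV" strategy concrete, here is a minimal Python sketch of dumping a query result to CSV with a header row. It is an illustration only: the in-memory SQLite database stands in for whatever source system you actually have, and the `sales` table, its columns, and the function names are hypothetical.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Write the rows returned by `query` to the file-like `out` as CSV.

    Returns the number of data rows written (the header is not counted).
    """
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    # The first row of the file carries the column names.
    writer.writerow([col[0] for col in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


def demo():
    # Stand-in source system: a tiny in-memory table (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("East", "Widget", 120.0), ("West", "Gadget", 75.5)],
    )
    buf = io.StringIO()
    n = export_query_to_csv(
        conn, "SELECT region, product, amount FROM sales ORDER BY region", buf
    )
    conn.close()
    return n, buf.getvalue()


if __name__ == "__main__":
    rows, text = demo()
    print(rows)
    print(text, end="")
```

In a real POC you would pass an open file instead of the `StringIO` buffer, one file per source table or extract.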

Model your data warehouse.

That’s always a smart thing to do for obvious reasons although the XSPRADA engine is schema-agnostic and you can feed it normalized or star/snowflake models at will. The secret phrase is: “we don’t care”! So for a quick POC, if you find yourself "forced" to feed RDM/x a 3NF model, no worries.

Establish some preliminary ETL process.

RDM/x runs against initial CSV data islands directly off disk. Point to the CSV files using the XSPRADA SQL extensions for DDL and you’re done. You’ll likely be doing this via script or code (C++, Java or .NET to the ODBC driver directly or via a JDBC-ODBC bridge). For incremental loads, just plop the new CSV files on disk and point RDM/x to them using the INSERT INTO…FROM extension. This process can be done in real time without disruption while other queries are running. No hassle there.

Load your warehouse.

That’s executing a single line of SQL DDL code such as

CREATE TABLE ….FROM “c:\file1.csv;c:\file2.csv…c:\file32.csv”; or INSERT INTO…FROM “c:\file1.csv;c:\file2.csv…c:\file32.csv”;

Made a mistake or want to modify the schema and “reload” real quick? Not a problem. Simply re-issue the same DDL command and the table/schema is instantly updated. From a trial-and-error perspective (which, in a POC situation, is fairly typical), that’s a high-five.
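Since you’ll likely be scripting this, here is a small Python sketch that assembles the semicolon-separated file list used in the statements above. Treat it as a sketch under assumptions: the helper names are made up, the statement head (everything before FROM) is left to the caller because the exact XSPRADA syntax is elided here, and the `sales` table in the usage line is hypothetical. It also assumes plain double quotes are what the engine expects.

```python
def csv_source_list(paths):
    """Join CSV paths into the semicolon-separated list form shown in the post."""
    cleaned = [str(p).strip() for p in paths if str(p).strip()]
    if not cleaned:
        raise ValueError("at least one CSV path is required")
    for p in cleaned:
        # A semicolon or quote inside a path would corrupt the list.
        if ";" in p or '"' in p:
            raise ValueError("path must not contain ';' or a double quote: %r" % p)
    return ";".join(cleaned)


def load_statement(statement_head, paths):
    """Append FROM "<file list>"; to a caller-supplied statement head.

    `statement_head` is whatever precedes FROM in the engine's own syntax;
    this helper does not try to guess or validate it.
    """
    return '%s FROM "%s";' % (statement_head, csv_source_list(paths))


if __name__ == "__main__":
    files = [r"c:\file1.csv", r"c:\file2.csv"]
    # "INSERT INTO sales" is a hypothetical head used only for illustration.
    print(load_statement("INSERT INTO sales", files))
```

Re-running the same script after fixing a file or a column definition matches the "re-issue the same DDL" workflow described above.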

Set up an SSAS cube structure (figure out SSAS via SSMS or BIDS)

There is no concept of cubes inside the XSPRADA engine. RDM/x automatically slices and dices based on incoming queries in real time. So if you want to “cube” just feed the engine slicing OLAP queries. RDM/x automatically restructures and aggregates in real time. No need to pre-define or pre-load cubes, deal with hierarchies or materialized views. I blogged about this earlier. RDM/x is a lot like Luke 11:9 – Ask and you shall receive.
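For readers who have never written one, a "slicing" query is just ordinary SQL: filter on one dimension, group by another, aggregate a measure. The Python sketch below runs two such slices against SQLite, which is only a stand-in here; it says nothing about how RDM/x restructures data internally, and the `sales` table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical fact rows: (region, product, year, amount)
ROWS = [
    ("East", "Widget", 2008, 100.0),
    ("East", "Widget", 2009, 150.0),
    ("East", "Gadget", 2009, 50.0),
    ("West", "Widget", 2009, 200.0),
]


def slice_sales(conn, dimension, year):
    """Total sales for one year, grouped by the chosen dimension."""
    # Whitelist the column name, since identifiers cannot be bound as parameters.
    if dimension not in ("region", "product"):
        raise ValueError("unknown dimension: %r" % dimension)
    sql = (
        f"SELECT {dimension}, SUM(amount) FROM sales "
        "WHERE year = ? "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(sql, (year,)).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)
    by_region = slice_sales(conn, "region", 2009)
    by_product = slice_sales(conn, "product", 2009)
    conn.close()
    return by_region, by_product


if __name__ == "__main__":
    regions, products = demo()
    print(regions)   # [('East', 200.0), ('West', 200.0)]
    print(products)  # [('Gadget', 50.0), ('Widget', 350.0)]
```

The same two queries are the kind of thing a pivot table in Excel would issue behind the scenes when you drag a different field onto the rows.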

Figure out what queries to generate (talk to DW DBA)

That’s where an external tool using MDX (along with an MDX expert!) can come in handy (most people don’t roll their own SQL for OLAP, although it can certainly be done in POC mode). One cool thing about RDM/x is its ability to “withstand” poorly-formulated SQL because the queries are optimized against the internal mathematical model. RDM/x is typically more “SQL-forgiving” than most other engines. And a poorly formulated query is likely transformed internally to still yield optimal performance. So even if you’re no SQL guru, the RDM/x engine is still on your side.

Figure out what BI tool to use (Excel, no brainer)

Connect Excel to the XSPRADA engine directly via ODBC or connect Mondrian to RDM/x (via bridge) then connect Excel to Mondrian via the SimbaO2X ODBO/XMLA connector. Alternatively, make the argument that using OSS like Pentaho or Jaspersoft against RDM/x directly is more flexible and accessible (not to mention cheaper!) than messing with Excel. Depending on your user base and corporate standards, that argument may or may not hold water.

Generate the reports/dashboards/KPI/Pivot Table/ad-hoc queries required by management.

Exactly the same way you would using any other tool and/or SQL.

At the end of the day (or in our case, the week), it’s all about “time to results” and “pain to results”. In those types of situations, smaller and simpler clearly has a significant advantage over the rest. And speaking of smaller, I have run over my allocated space for this posting :)


Published Aug. 17, 2009
Copyright © 2009 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
About Jerome Pineau
Twenty years of extensive hands-on software development, application engineering, customer interaction, management and consulting experience spanning a diverse array of industries and business models.

Now a "full-service" sales engineer, solutions architect, evangelist, technical ambassador (or whatever you want to call it) in the business intelligence space, specializing in high-performance analytical database management systems (ADBMS).
