Cloud Expo: Blog Feed Post

Internet of Things, Fast Data vs. BigData

We all know the 3Vs associated with the term Big Data – volume, velocity and variety

Back when we were doing DB2 at IBM, there was an important older product called IMS that brought in significant revenue. With another database product coming (based on relational technology), IBM did not want any cannibalization of the existing revenue stream. Hence we coined the phrase “dual database strategy” to justify the need for both DBMS products. In a similar vein, several vendors are concocting all kinds of terms and strategies to justify newer products under the banner of Big Data.

One such phrase is Fast Data. We all know the 3Vs associated with the term Big Data: volume, velocity and variety. It is the middle V (velocity) that says data is not static but changes fast, like stock market data, satellite feeds, or even sensor data coming from smart meters or an aircraft engine. The question has always been how to deal with this kind of changing data (as opposed to the static data typical of most enterprise systems of record).

Recently I was listening to a talk by IBM and VoltDB in which VoltDB tried to justify the world of “Fast Data” as coexisting with “Big Data”, which they narrowed to the static data warehouse, or “data lake” as IBM calls it. Again, they have chosen to pigeonhole Big Data into the world of HDFS, Netezza, Impala, and batch MapReduce. This way, they justify the phrase Fast Data as representing operational data that is changing fast. They call VoltDB “the fast, operational database”, implying that every other database solution is slow. Yet incumbents like IBM, Oracle, and SAP have introduced in-memory options for speed, and even NoSQL databases can process very fast reads on distributed clusters.

The VoltDB folks also tried to show how the two worlds (Fast Data and their version of Big Data) will coexist. The Fast Data side will ingest and interact with streams of inbound data, do real-time data analysis, and export to the data warehouse. They bragged about a performance benchmark of 1m tps on a 3-node cluster, scaling to 2.4m tps on a 12-node system running in the SoftLayer cloud (owned by IBM). They also said that this solution is much faster than Amazon’s AWS cloud. The comparison is not apples-to-apples, as the SoftLayer deployment runs on bare metal while the AWS deployment runs on top of the AWS software stack.
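To put the quoted benchmark numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two figures mentioned in the talk (1m tps on 3 nodes, 2.4m tps on 12 nodes); the function names are illustrative, and the "scaling efficiency" measure (observed speed-up divided by the increase in node count) is an assumption chosen here for illustration, not something presented in the talk.

```python
# Figures quoted in the talk: (number of nodes, transactions per second).
small = (3, 1_000_000)
large = (12, 2_400_000)


def per_node(nodes, tps):
    """Average throughput contributed by each node."""
    return tps / nodes


def scaling_efficiency(base, scaled):
    """Observed speed-up divided by the ideal (linear) speed-up.

    1.0 would mean throughput grew in exact proportion to node count.
    """
    node_ratio = scaled[0] / base[0]   # how many times more nodes
    tps_ratio = scaled[1] / base[1]    # how many times more throughput
    return tps_ratio / node_ratio


print(f"tps per node on 3 nodes:  {per_node(*small):,.0f}")
print(f"tps per node on 12 nodes: {per_node(*large):,.0f}")
print(f"scaling efficiency:       {scaling_efficiency(small, large):.0%}")
```

On these figures, four times the nodes delivers 2.4 times the throughput, so per-node throughput drops as the cluster grows. That says nothing about why, only that the quoted scaling is well short of linear.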

I wish they would simply call this real-time data analytics, as it consists mostly of read-type transactions, and not confuse it with update-heavy workloads. We will wait and see how enterprises adopt this VoltDB-SoftLayer solution in addition to their existing OLTP solutions.

More Stories By Jnan Dash

Jnan Dash is Senior Advisor at EZShield Inc., Advisor at ScaleDB and Board Member at Compassites Software Solutions. He has lived in Silicon Valley since 1979. Formerly he was the Chief Strategy Officer (Consulting) at Curl Inc., before which he spent ten years at Oracle Corporation and was the Group Vice President, Systems Architecture and Technology till 2002. He was responsible for setting Oracle's core database and application server product directions and interacted with customers worldwide in translating future needs to product plans. Before that he spent 16 years at IBM. He blogs at http://jnandash.ulitzer.com.