By Maureen O'Gara
July 16, 2012 08:15 AM EDT
The champagne has been flowing over at MapR since Google announced the integration of the MapR Distribution for Hadoop with Google Compute Engine, the start-up's second big win in a row.
Amazon Web Services has extended its Elastic MapReduce (EMR) service to include MapR as the only Hadoop distribution Amazon is offering, selling and supporting as a service.
Coupled with the Google win, a clear slight to Cloudera, the original Hadoop commercializer, MapR figures it's the de facto standard for Hadoop in the cloud.
The combination of the new Google Compute Engine and MapR lets users quickly provision large MapR clusters on demand for Big Data analytics in the cloud.
Since Hadoop was inspired by Google's own MapReduce and file system, it's a puzzle why Google went out-of-house for the widgetry rather than commercialize its own. MapR doesn't have an answer to that question.
MapR did, however, demonstrate a significant price/performance breakthrough at Google I/O, completing a one-terabyte (TB) TeraSort job in 1 minute 20 seconds. The result was achieved on a Google Compute Engine cluster with 1,256 nodes, 1,256 disks and 5,024 cores at the measly cost of $16.
Compare this result with the existing world record of one minute two seconds on a physical cluster with more than four times the disks, twice as many cores and 200 more servers at a cost of more than $5 million.
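To put the two results side by side, the quoted figures can be turned into rough throughput and cost ratios. The sketch below is a back-of-the-envelope calculation only: it assumes a terabyte means 10^12 bytes, treats "more than $5 million" as a lower bound, and uses variable names invented for illustration. The two cost figures are also not like for like, since $16 is what the article gives for the cloud run and $5 million is what it gives for the physical cluster.

```python
# Figures quoted in the article. TB is assumed to mean 10**12 bytes.
DATA_BYTES = 10**12

GCE_SECONDS = 1 * 60 + 20      # 1 minute 20 seconds on Google Compute Engine
GCE_NODES = 1256
GCE_COST_USD = 16

RECORD_SECONDS = 1 * 60 + 2    # existing record: 1 minute 2 seconds
RECORD_COST_USD = 5_000_000    # "more than $5 million", so a lower bound


def throughput_gb_per_s(data_bytes, seconds):
    """Aggregate sort throughput in gigabytes (10**9 bytes) per second."""
    return data_bytes / seconds / 1e9


gce_throughput = throughput_gb_per_s(DATA_BYTES, GCE_SECONDS)
record_throughput = throughput_gb_per_s(DATA_BYTES, RECORD_SECONDS)

# Average share of the throughput handled by each cloud node, in MB/s.
gce_mb_per_s_per_node = gce_throughput * 1000 / GCE_NODES

# How much longer the cloud run took, and how the quoted costs compare.
time_ratio = GCE_SECONDS / RECORD_SECONDS
cost_ratio = RECORD_COST_USD / GCE_COST_USD

print(f"GCE run:      {gce_throughput:.1f} GB/s aggregate, "
      f"{gce_mb_per_s_per_node:.1f} MB/s per node")
print(f"Record run:   {record_throughput:.1f} GB/s aggregate")
print(f"Time ratio:   {time_ratio:.2f}x (cloud run vs. record)")
print(f"Cost ratio:   at least {cost_ratio:,.0f}x (record vs. cloud run)")
```

On these assumptions the cloud run took about 1.3 times as long as the record while the quoted cost was smaller by a factor of several hundred thousand.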
The integration of MapR with Google Compute Engine includes a menu of MapR compute configurations with which customers can store, manage and analyze large volumes of data in the cloud. Customers can pay on demand and spin up clusters of 1,000 or more nodes.
MapR on Google Compute Engine is currently in free private beta. See www.mapr.com/google.
MapR 2.0, also in beta, should be out this quarter.