SYS-CON MEDIA Authors: Elizabeth White, PagerDuty Blog, Michael Jannery, Pat Romanski, Liz McMillan

Related Topics: Cloud Expo, SOA & WOA, Open Source, Apache

Cloud Expo: Article

EMC Puts Together 1,000-Node Hadoop Care Package

Greenplum wants to ensure that Hadoop turns into a really serious enterprise-ready Big Data tool

EMC, well, at least its Greenplum unit has put together a thousand-node multi-tenant analytics platform to accelerate the development and testing of the disruptive but persnickety open source Apache Hadoop software.

That's 1,000-odd hardware nodes or 10,000 nodes when you count virtual machines, along with 24PT of physical storage. In a word, that's huge.

Greenplum wants to ensure that Hadoop turns into a really serious enterprise-ready Big Data tool for sifting through mounds of unstructured data to unlock their secrets and make predictions but contends that nobody really knows how to deploy the stuff in part because everybody's Hadoop is different.

So it wants to come up with a formula others can follow and overturn the long-standing Hadoop dogma that one can't have separate compute and storage on the same nodes. Greenplum says it increases efficiency and requires less hardware, pointing to Yahoo's 40,000 Hadoop servers, which only offer 10%-15% utility. It thinks it can get 80%.

Anyway, Greenplum wants various user cases developed around analytics.

Intel, VMware, Micron, Seagate, Super Micro Computer and Mellanox Technologies are contributing to the so-called Greenplum Analytics Workbench project. Figure on continually updated versions of Hadoop, Ethernet and InfiniBand, a "kickbutt" interconnect, and Sandybridge racks, 60 of ‘em.

Greenplum wouldn't put a price tag on the thing but flat out it'll be the largest test-bed cluster for continuous integration tests on the Apache Hadoop trunk and its future releases.

The initial object of the exercise is scale or should we say contributions will be validated at scale so enterprises, which are often intimidated by Hadoop, can confidently deploy them in a production. It's supposed to identify bugs, stabilize new releases and optimize hardware configurations.

Greenplum says all testing and results will be contributed back to the Apache Software Foundation and the open source community, and its testing will be planned in coordination with Apache's key committers.

Continuous integration testing on the Greenplum Analytics Workbench will begin in January. Its use will be free and hosted at the giant Switch Communications SuperNAP data center in Las Vegas. EMC's Mozy folks will manage the thing.

More Stories By Maureen O'Gara

Maureen O'Gara the most read technology reporter for the past 20 years, is the Cloud Computing and Virtualization News Desk editor of SYS-CON Media. She is the publisher of famous "Billygrams" and the editor-in-chief of "Client/Server News" for more than a decade. One of the most respected technology reporters in the business, Maureen can be reached by email at maureen(at)sys-con.com or paperboy(at)g2news.com, and by phone at 516 759-7025. Twitter: @MaureenOGara

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Latest Stories
"ElasticBox is an enterprise company that makes it very easy for developers and IT ops to collaborate to develop, build and deploy applications on any cloud - private, public or hybrid," stated Monish Sharma, VP of Customer Success at ElasticBox, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
The 3rd International Internet of @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that its Call for Papers is now open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
Want to enable self-service provisioning of application environments in minutes that mirror production? Can you automatically provide rich data with code-level detail back to the developers when issues occur in production? In his session at DevOps Summit, David Tesar, Microsoft Technical Evangelist on Microsoft Azure and DevOps, will discuss how to accomplish this and more utilizing technologies such as Microsoft Azure, Visual Studio online, and Application Insights in this demo-heavy session.
Entuity®, a provider of enterprise-class network management solutions, today announced that it solidifies its position as a market leader through global enterprise customer acquisitions and a refined channel strategy. In 2014, Entuity increased new license revenues in EMEA by over 75 percent, and LATAM by over 125 percent as customers embraced Entuity for its highly automated solution and unified architecture. Entuity’s refined channel strategy focuses on even deeper strategic alignment with ke...
We are all here because we are sold on the transformative promise of The Cloud. But what good is all of this ephemeral, on-demand infrastructure if your usage doesn't actually improve the agility and speed of your business? How must Operations adapt in order to avoid stifling your Cloud initiative? In his session at DevOps Summit, Damon Edwards, co-founder and managing partner of the DTO Solutions, will highlight the successful organizational, process, and tooling patterns of high-performing c...
The 4th International DevOps Summit, co-located with16th International Cloud Expo – being held June 9-11, 2015, at the Javits Center in New York City, NY – announces that its Call for Papers is now open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's large...
Cloud Expo 2014 TV commercials will feature @ThingsExpo, which was launched in June, 2014 at New York City's Javits Center as the largest 'Internet of Things' event in the world.
“We help people build clusters, in the classical sense of the cluster. We help people put a full stack on top of every single one of those machines. We do the full bare metal install," explained Greg Bruno, Vice President of Engineering and co-founder of StackIQ, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
In this demo at 15th Cloud Expo, John Meza, Product Engineer at Esri, showed how Esri products hook into Hadoop cluster to allow you to do spatial analysis on the spatial data within your cluster, and he demonstrated rendering from a data center with ArcGIS Pro, a new product that has a brand new rendering engine.
"Blue Box has been around for 10-11 years, and last year we launched Blue Box Cloud. We like the term 'Private Cloud as a Service' because we think that embodies what we are launching as a product - it's a managed hosted private cloud," explained Giles Frith, Vice President of Customer Operations at Blue Box, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
"People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
The Software Defined Data Center (SDDC), which enables organizations to seamlessly run in a hybrid cloud model (public + private cloud), is here to stay. IDC estimates that the software-defined networking market will be valued at $3.7 billion by 2016. Security is a key component and benefit of the SDDC, and offers an opportunity to build security 'from the ground up' and weave it into the environment from day one. In his session at 16th Cloud Expo, Reuven Harrison, CTO and Co-Founder of Tufin,...
SYS-CON Media announced that Splunk, a provider of the leading software platform for real-time Operational Intelligence, has launched an ad campaign on Big Data Journal. Splunk software and cloud services enable organizations to search, monitor, analyze and visualize machine-generated big data coming from websites, applications, servers, networks, sensors and mobile devices. The ads focus on delivering ROI - how improved uptime delivered $6M in annual ROI, improving customer operations by minin...
The move in recent years to cloud computing services and architectures has added significant pace to the application development and deployment environment. When enterprise IT can spin up large computing instances in just minutes, developers can also design and deploy in small time frames that were unimaginable a few years ago. The consequent move toward lean, agile, and fast development leads to the need for the development and operations sides to work very closely together. Thus, DevOps become...
Puppet Labs on Wednesday released the DevOps Salary Report, based on salary data gathered from Puppet Labs' industry-recognized State of DevOps Report. The data confirms that market demand for DevOps skills is growing, and that DevOps engineers are among the highest paid IT practitioners today. That's because IT organizations today are grappling with how to be more agile and responsive to the business, while maintaining the stability of their infrastructure. DevOps practices, such as continuous ...