|By Tom Leyden||
|February 27, 2014 09:30 AM EST||
The storage industry is going through a big paradigm shift caused by drastic changes in how we generate and consume data. As a result, we also have to drastically change how we store data: the market needs massive, online storage pools that can be accessed from anywhere and anytime. Object Storage has emerged as a solution to meet the changing needs of the market and it is currently a hot topic as it creates opportunities for new revenue streams.
In this three-part blog series, I will explore how storage has changed – creating a need for new methodologies – and why object storage is the prevalent platform for scale-out storage infrastructures.
To understand how storage has changed, let’s take a look at how data has evolved over the past three decades, paying special attention to data generation and consumption.
In the 1980s and 1990s, the most valuable digital data was transactional data – database records, created and accessed through database applications. This led to the success of large database and database application companies. Transactional data continues to be important today, but there are no signs on the horizon that database solutions won’t be able to manage the – relatively slow – growth of structured information. From a storage point of view, the structured data challenge is handled well by block-based (SAN) storage platforms, designed to deliver the high IOPS needed to run large enterprise databases.
With the advent of the office suite, unstructured data became much more important than it had ever been before. Halfway the 1990s, every office worker had a desktop computer with an office suite. E-mail allowed us to send those files around; storage consumption went through the roof. Enterprises would soon be challenged to build shared file storage infrastructures – backup and archiving became another challenge. Tiered storage was born. Storage was both hot and cool. In the next two decades we would see plenty of innovations to manage fast-growing unstructured data sets – the file storage (NAS) industry skyrocketed.
But people can only generate so many office documents. The average Powerpoint file is probably three times as big today as it was back in 1999, but that is not even close to data growth predictions we continue to hear (x2 every year). Just like SANs have evolved sufficiently to cope with the changing database requirements, NAS platforms would have been able to cope with the growth of unstructured data if it weren’t for the sensor-induced Big Data evolution of the past decade.
The first mentions of Big Data refer to what we now understand as Big Data Analytics: scientists (mostly) were challenged to store research data from innovative information-sensing devices, captured for analytics purposes. Traditional databases would not scale sufficiently for this data, so alternative methods were needed. This led to innovations like Hadoop/MapReduce, which we also like to refer to as Big “semi-structured” Data: the data is not structured as in a database, but it is not really unstructured either.
Information-sensing devices are not exclusive to scientific analytics environments, however. Smartphones, tablets, photo cameras and scanners – just to name a few – are all information-sensing devices that create the vast majority of all unstructured information generated today. In the past decade we have not only seen a massive increase in the popularity of these devices, but also continuous quality improvements. This led to more and bigger data. The result of this is a true data explosion of mostly immutable data: contrary to office documents, most of the sensor data is never changed.
This immutable nature of unstructured data holds the key to solving the scalability problem of traditional file storage. Tune into my next posts, where I will dive into how to leverage this aspect of enterprise data to develop an object storage solution for the shifting storage paradigm.