|By Ajay Budhraja||
|January 31, 2014 09:00 AM EST||
The Big Data and Cloud market has been growing at a staggering pace. Data is becoming unmanageable and too big to be handled by relational database systems alone and there is a need to effectively provision, manage elastic scalable systems. Information technology is undergoing a major shift due to new paradigms and a variety of delivery channels. The drivers for these technologies are social networks, proliferation of devices such as tablets and phones. Social business and collaboration are continuing to develop further to enhance productivity and interaction. There has been a big void in the Big data area and a need to come up with solutions that can manage Big Data. Part of the problem has been that there was so much focus on the user interfaces that not many organizations were thinking further about the core - Data. So now with the proliferation of large and unstructured data, it is important to extract and process large data sets from different systems expeditiously. To deliver strategic business value, there should be the capability to process Big data and have the analytics for enhanced decision making. In addition, systems that process Big Data can rely on the Cloud to rapidly provision and deploy elastic and scalable systems.
The key elements of a comprehensive strategy for Big Data, Open Data and Cloud includes conducting a cost benefit analysis, hiring resources with the right skills, evaluating requirements for data and analytics, developing a sound platform that can process and analyze large volumes of data quickly and developing strong analytic capabilities to respond to important business questions. A sound strategy also includes assessing the existing and future data, services, applications as well as the projected growth. In addition there should be a focus on ensuring that the infrastructure can support and store unstructured as well as structured data. As part of the strategy, data protection including security and privacy is very important. With the evolution to complex data sets, data can be compromised at the end points or while it is being transmitted. Hence proper security controls have to be developed to address these issues. Organizations also need to develop policies, practices and procedures that support the effective transition to these technologies.
As part of the strategic transition to Big Data and Cloud it is important to select a platform that can handle such data, parse through records quickly and provide adequate storage for the data. With the high velocity of data coming through systems, in memory analytics and fast processing are key elements that the platform should support. It should have good application development capabilities and the ability to effectively manage, provision systems and related monitoring. The platform should have components and connectors for Big Data to come up with integrated solutions. From a development perspective, Open source software such as Hadoop, Hive, Pig, R are being leveraged for Big Data. Hadoop was developed as a framework for the distributed processing of large data sets and to scale upwards. Hadoop can handle data from diverse systems including structured, unstructured, media. NoSQL is being used by organizations to store data that is not structured. In addition, there are vendors who offer proprietary software Hadoop solutions. The choice to go with a proprietary or open source solution depends on many factors and requires a through assessment.
Systems that process Big Data need the Cloud for rapid provisioning and deployment. The elastic and scalable aspects of the Cloud support the storage and management of massive amounts of data. The data can be obtained and stored in a Cloud based storage solution or database adapters can be used to obtain the data from databases with Hadoop, Pig, Hive. Vendors also offer data transfer services that move Big data from and to the Cloud. Cloud adds the dynamic computing, elasticity, self-service, measured aspects in addition to other aspects for rapid provisioning and on demand access. Cloud solutions may offer lower life cycle costs based on usage and the monitoring aspects can lay out a holistic view of usage, cost assessments and charge back information. All this information can enhance the ability of the organization to plan and react to changes based on performance and capacity metrics.
Open Data initiatives should be based on strong foundations of technologies such as Shared Services, Big Data and Cloud. There are initiatives underway related to Open data that drive the development and deployment of innovative applications. Making data accessible enables the development of new products and services. This data should be made available in a standardized manner so that developers can utilize it quickly and effectively. Open data maximizes value creation built on the existing structured and unstructured data.
Open Data strategy and initiatives should define specific requirements of what data will be made available based on the utility of that information. Just providing massive dumps of data that are hard to use is not the solution. There has to be proper processing that can extract useful information from the data. The data that is obtained should support automated processing to develop custom applications and can be rendered as html, xml etc. This can promote greater number of not just traditional applications, but also mobile applications. There has to be great emphasis on security and privacy since any errors can compromise important information when the data is made accessible. A comprehensive strategy for Big Data, Cloud and Open Data will enable a smooth transition to achieve big wins!
(This has been extracted from and is reference to blog. All views and information expressed here do not represent the positions and views of anyone else or any organization)