By Don MacVittie
June 22, 2011 09:15 AM EDT
There’s a whole lot of talk about cloud revolutionizing IT, a whole lot of argument about public versus private cloud, and even a considerable amount of talk about what belongs in the cloud. But there is not much talk about helping you determine which applications and storage are good candidates to move there, considering all of the angles that matter to IT. This blog will focus on storage and the next one on applications, because I don’t want to bury you in a blog as long as a feature-length article.
It amazes me when I see comments like “no one needs a datacenter” while the definition of what, exactly, cloud is remains under debate. For the purposes of this blog, we will limit the definition of cloud to Infrastructure as a Service (IaaS) – VM containers and the things to support them, or storage containers and the things to support them. My reasoning is simple: the only other big category of “cloud” at the moment is SaaS, and since SaaS has been around for about a decade, you should already have a decision-making process for outsourcing a given application to such a service. Salesforce.com and Google Docs are examples of what is filtered out by saying “SaaS is a different beast”. Hosting services and CDNs are a chunk of the market, but increasingly they are starting to look like IaaS as they add functionality to meet the demands of their IT customers. So we’ll focus on the “new and shiny” aspects of cloud that have picked up a level of mainstream support.
Cloud storage is here, and seems to be pretty ready for prime time if you stick with the major providers. That’s good news, since instability in such a young market could spell its doom. The biggest problem with cloud storage is that almost every large provider exposes it through web service calls (REST or SOAP APIs) rather than a standard storage protocol. An interface decision had to be made, but with CIFS, NFS, iSCSI, and FCoE available, it wasn’t necessary to choose a web service interface, so I’m unsure why nearly all vendors did. Conveniently, almost immediately after it became clear that the enterprise needed cloud storage to look like all the other storage in the datacenter, cloud storage gateways hit the market. In most cases these were brand new products from brand new companies, but some, like our ARX Cloud Extender, are additions to existing storage products that help leverage both internal and external assets. Utilizing a cloud storage gateway allows you to treat cloud storage like NAS or SAN disk (depending upon the gateway and cloud provider), which opens it up to whatever use you need to make of the storage within your enterprise.
Generally speaking, WAN communications are slower than LAN communications. You can concoct situations where the LAN is slower, but most of us definitely do not live in those worlds. While compression and deduplication of data going over the WAN can help, it is best to assume that storage in the cloud will have slower access times than anything but tape sitting in your datacenter.
Also, if you’re going to be storing data on the public Internet, it needs to be protected from prying eyes. Once it leaves your building, there are so many places and ways it can be viewed that the only real option is to encrypt on the way out. Products like ARX Cloud Extender take care of that for you, but check with your cloud storage gateway vendor to be certain they do the same. If they do not, an encryption engine will have to be installed to encrypt data before it leaves the datacenter. If you end up having to do this, be aware that encryption can have a huge impact on compression and deduplication, because it changes the bytes themselves: encrypted data looks random, so anything that tries to compress or deduplicate it afterward finds very little to work with. Purchasing a gateway that does both makes the order of operations work correctly – dedupe, then compress, then encrypt (assuming it supports all three).
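To make the order-of-operations point concrete, here is a minimal Python sketch. Everything in it is illustrative: `toy_encrypt` is a made-up XOR keystream that only stands in for a real cipher (it is not secure and is not how any gateway product works), and the chunk-level `dedupe` is a simplification that ignores the index a real system would keep to rebuild the original stream.

```python
import hashlib
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Hypothetical stand-in for a real cipher: NOT secure, used only
    because its output looks random, as real ciphertext does.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def dedupe(chunks):
    """Keep one copy of each distinct chunk, keyed by its SHA-256 digest."""
    unique = {}
    for chunk in chunks:
        unique.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    return list(unique.values())


def prepare_for_cloud(chunks, key):
    """Right order: dedupe, then compress, then encrypt."""
    payload = b"".join(dedupe(chunks))
    return toy_encrypt(zlib.compress(payload), key)


def wrong_order(chunks, key):
    """Wrong order: encrypt first, then try to compress the ciphertext."""
    return zlib.compress(toy_encrypt(b"".join(chunks), key))


# Ten identical chunks plus one different chunk: highly redundant data.
chunks = [b"A" * 4096] * 10 + [b"B" * 4096]
key = b"demo-key"
good = prepare_for_cloud(chunks, key)
bad = wrong_order(chunks, key)
print("dedupe -> compress -> encrypt:", len(good), "bytes")
print("encrypt -> compress:          ", len(bad), "bytes")
```

Running it shows the first pipeline producing a far smaller payload than the second, which is the reason to prefer a gateway that handles all three steps itself.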
With that said, utilizing compression, TCP optimizations, and deduplication will reduce the performance hit to manageable levels, and encryption mitigates the data-in-flight security risk going to the cloud, and the data-at-rest security risk while stored in the cloud. Between the two, it makes cloud storage appealing – or at least usable.
Our criteria then become pretty straightforward. We want to send data to the cloud that either has no access time constraints (e.g., it can be slower to open) or will be served faster from the cloud than from the technology it replaces. These criteria leave you a couple of solid options for piloting cloud storage usage.
1. Backups or Archival Storage. They’re stored off-site, they’re encrypted, and even though they’re slower than LAN-based backups, they’re still disk to disk (D2D), so they’re faster backing up, and much faster for selective restores than tape libraries. If you already have a D2D solution in-house, this may be less appealing, but getting a copy into the cloud means that one major disaster can’t take out your primary datacenter and its backups.
2. Infrequently Accessed or Low Priority Storage. All that chaff that comes along with the goldmine of unstructured data in your organization is kept, because you will need some of it again some day, and IT is not in a position to predict which files you will need. By setting up a cloud storage Share or LUN that you use tiering or some other mechanism to direct those files to, the files are saved, but they’re not chewing up local disk. That increases available disk in the datacenter, but keeps the files available in much the same manner as archival storage. Implemented in conjunction with Directory Virtualization, the movement of these files can be invisible to the end users, as they will still appear in the same place in the global directory, but will physically be moved to the cloud if they are infrequently accessed.
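The tiering idea in the second option can be sketched as a simple selection policy. This is only an illustration in Python: the `FileInfo` record, the `select_for_cloud_tier` function, and the 180-day idle threshold are all hypothetical, and a real tiering product will have its own policy engine and metadata.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: datetime


def select_for_cloud_tier(files, now, max_idle=timedelta(days=180)):
    """Return the files that have not been accessed within max_idle.

    These are the candidates to move to the cloud share or LUN; the
    180-day default is an arbitrary example value, not a recommendation.
    """
    return [f for f in files if now - f.last_access > max_idle]


now = datetime(2011, 6, 22)
files = [
    FileInfo("/shares/eng/old_specs.doc", 2_000_000, now - timedelta(days=400)),
    FileInfo("/shares/eng/current_plan.xls", 500_000, now - timedelta(days=3)),
]
for f in select_for_cloud_tier(files, now):
    print("move to cloud tier:", f.path)
```

With directory virtualization in front of it, the path the user sees would stay the same while the file’s physical location changes.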
Cloud storage is no more a panacea than any other technical solution. There’s just some stuff that should not be moved to the cloud today, perhaps not ever.
1. Access time sensitive files. Don’t put your database files in the cloud (though you might check out Microsoft Azure or Oracle’s equivalent offering). You won’t like the results. Remember, just because you can, doesn’t mean you should.
2. Data Critical to Business Continuity. Let’s face it, one major blow that takes out your WAN connection takes out your ability to access what’s stored in the cloud. So be careful that data needed for normal operation of the business is not off-site. It’s bad enough if access to the Internet is down and public websites running on datacenter servers are inaccessible, but the files critical to the business – be it phone orders, customer support, whatever – must remain available if the business is to keep its doors open. Redundant WAN connections can mitigate this issue (with a price tag, of course), but even those are not proof against all eventualities that impact only your Internet connectivity.
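Taken together, the two exclusions above amount to a simple placement rule, sketched here in Python. The `DataSet` fields and the `placement` function are hypothetical names for illustration; in practice, deciding whether a given data set is latency sensitive or business critical is the hard part, and that judgment is yours, not the code’s.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    latency_sensitive: bool   # e.g. database files
    business_critical: bool   # needed to keep operating if the WAN is down


def placement(ds: DataSet) -> str:
    """Return "local" if either exclusion applies, otherwise "cloud"."""
    if ds.latency_sensitive or ds.business_critical:
        return "local"
    return "cloud"


for ds in [
    DataSet("order-entry database", latency_sensitive=True, business_critical=True),
    DataSet("nightly backups", latency_sensitive=False, business_critical=False),
    DataSet("support call scripts", latency_sensitive=False, business_critical=True),
]:
    print(f"{ds.name}: {placement(ds)}")
```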
With cloud storage gateways and directory virtualization, there are definite “win” points for the use of cloud storage. Without directory virtualization, there are still some definite scenarios that will improve your storage architecture without breaking your back implementing them. Without a cloud storage gateway, most cloud storage is essentially useless to enterprise IT (other than AppDev) because none of your architecture knows how to make use of web services APIs.
But if you implement a cloud storage gateway, and choose wisely, you can save more in storage upgrades than the cost of the cloud storage. This is certainly true when the alternative is upgrading local storage, while cloud storage just grows a little with a commensurate monthly fee. Fee schedules, and even what is charged for (bytes in/out, bytes at rest, phase of the moon), change over time, so check with your preferred vendor to make certain cloud is the best option for your scenario. Remember, too, that deploying a directory virtualization tool will increase utilization, and tiering can help remove data from expensive tier one disk, possibly reducing one of your most expensive architectural needs – the fast disk upgrade.
Next time we’ll look at applications from a general IT perspective, and I’m seriously considering extending this to a third blog discussing AppDev and the cloud.