VMware or Microsoft?–How robust is your availability?

Disclaimer: facts and figures in this article are based on the state of the technology as it exists at the date of its publication. 

Our article today in our “VMware or Microsoft?” series is about availability. 

When I say “availability”, I mean “high availability”. 

And when I say “robust high availability”, I mean a solution such as Windows Failover Clustering that provides high availability and scalability of server workloads.

I argue that Microsoft’s solution is robust and solid, but VMware has argued differently.  In a currently available document comparing vSphere 5 to what was then the beta of Hyper-V in Windows Server 2012, VMware claims to have “robust high availability” with a “single click, [that] withstands multiple host failures”, whereas Microsoft’s Failover Clustering is “based on legacy quorum model, complex and brittle”.

Really?  They haven’t been watching how far clustering has come in Windows Server lately.  In fact, at best, VMware’s document might be referring to how failover clustering used to work back in 2008.  More specifically, they are referring to the quorum model, in which a cluster needs a majority vote to determine whether or not a node is actually unavailable, so that the resources it was managing can fail over to other nodes.  To always have a clear majority, the number of voting members needs to be odd.  All nodes get a vote, so if you have an even number of nodes, you need something else to break the tie.  That something else is a “cluster witness”, which is either a “witness disk” or a “witness file share”.
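To make the tie-breaking arithmetic concrete, here is a minimal Python sketch of a plain majority vote.  It only illustrates the math described above; it is not how the cluster service is implemented, and the function names votes_needed and has_quorum are made up for this example.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True if the members that can see each other hold a strict majority."""
    return votes_present >= votes_needed(total_votes)


# Four nodes, no witness: a 2/2 split leaves neither side with a majority.
print(has_quorum(2, 4))  # False

# Four nodes plus a witness (5 votes): the side holding the witness wins.
print(has_quorum(3, 5))  # True
print(has_quorum(2, 5))  # False
```

With an odd vote total, one side of any split always has the majority, which is the whole point of the witness.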

From this document on Windows Server 2008 failover clustering:

In a cluster with an even number of nodes and a quorum configuration that includes a witness, when the witness remains online, the cluster can sustain failures of half the nodes. If the witness goes offline, the same cluster can sustain failures of half the nodes minus one.
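The numbers in that passage fall out of simple arithmetic.  Here is a rough Python sketch, assuming a static quorum where the vote total is fixed when the cluster is configured and an offline witness still counts toward that total.  The function name max_node_failures and its parameters are hypothetical, chosen just for this illustration.

```python
def max_node_failures(nodes: int, witness_configured: bool, witness_online: bool) -> int:
    """Most node failures a static-quorum cluster can absorb (illustrative model).

    One vote per node, plus one if a witness is configured.  An offline
    witness still counts toward the total but contributes no vote.
    """
    total = nodes + (1 if witness_configured else 0)
    needed = total // 2 + 1  # strict majority of the fixed total
    witness_vote = 1 if (witness_configured and witness_online) else 0
    # Surviving nodes must supply whatever the witness does not.
    min_nodes_up = max(needed - witness_vote, 1)
    return nodes - min_nodes_up


# Four nodes with a witness: half the nodes can fail while the witness is up...
print(max_node_failures(4, witness_configured=True, witness_online=True))   # 2
# ...but only half minus one once the witness is gone.
print(max_node_failures(4, witness_configured=True, witness_online=False))  # 1
```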

Well then, please allow me to introduce you to…

The Dynamic Quorum

“Batman and Robin?”

Tell me you didn't LOVE this show as a kid.

No... that was the “dynamic duo”.  I’m talking about the ability of all nodes in a Windows Failover Cluster to have a vote, and for the number of voting members to adjust dynamically as nodes fail, so that there is never any confusion (lack of a quorum) caused by having an even number of voting members.

In this diagram…

Node & Disk Majority

…we see a healthy 4 node cluster, each running 2 VMs, or any other clustered roles.  (Windows Failover Clustering is not just for virtualization, you know.)  The quorum is maintained because we have a disk witness to break the tie in case two nodes say “one node is down!” and the other two say “no, he’s not!”.

If one of the nodes in our cluster goes away…

Simple Node Majority

…depending upon whether that removal was planned or a complete surprise, the clustered roles are able to fail over or restart on other nodes.  AND, because the cluster now has only three active nodes, those three nodes by themselves form a quorum of voting members.

“When a node shuts down or crashes, the node loses its quorum vote.  When a node successfully rejoins the cluster, it regains its quorum vote.  By dynamically adjusting the assignment of quorum votes, the cluster can increase or decrease the number of quorum votes that are required to keep running. This enables the cluster to maintain availability during sequential node failures or shutdowns.”

Later, if the node is re-added, it again gets a vote.
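If you want to see why that matters for sequential failures, here is a toy Python model comparing a fixed vote total with a dynamically adjusted one.  It is a simplification under stated assumptions: it ignores the witness, assumes each failure is fully handled before the next one, and does not model how the real cluster service treats its last two nodes.  The function name survives_sequential_failures is invented for this sketch.

```python
def survives_sequential_failures(nodes: int, failures: int, dynamic: bool) -> bool:
    """Toy model: is the cluster still running after one-at-a-time node losses?

    Ignores the witness and assumes each failure is processed before the next.
    """
    total_votes = nodes  # the total that a majority is measured against
    alive = nodes
    for _ in range(failures):
        alive -= 1
        if alive < total_votes // 2 + 1:
            return False  # survivors no longer hold a majority
        if dynamic:
            total_votes = alive  # the failed node loses its vote
    return True


# Five nodes, three failures in a row:
print(survives_sequential_failures(5, 3, dynamic=False))  # False
print(survives_sequential_failures(5, 3, dynamic=True))   # True
```

With a fixed total, two survivors can never be a majority of five.  When the total shrinks along with the membership, two of three is still a majority at the moment the third failure hits.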

Robust.  But wait… there’s more…

The Dynamic Witness

The story gets even better in Windows Server 2012 R2, which adds something called the “Dynamic Witness”:

“If the cluster is configured to use dynamic quorum (the default), the witness vote is also dynamically adjusted based on the number of voting nodes in current cluster membership. If there are an odd number of votes, the quorum witness does not have a vote. If there is an even number of votes, the quorum witness has a vote.

The quorum witness vote is also dynamically adjusted based on the state of the witness resource. If the witness resource is offline or failed, the cluster sets the witness vote to ‘0’.”

The benefit of this is for the rare case of a witness failure.  If that happens, the vote simply goes away and is assumed to not be there.  A huge benefit of all of this is that you never really have to count your nodes to decide whether or not to configure a quorum witness.  Just do it (as recommended), and let the dynamic nature of our failover clustering take care of it.
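The rule in that quote is short enough to write down.  Here is a minimal Python sketch of it, based only on the quoted description; witness_vote and total_votes are hypothetical names, not part of any Microsoft API.

```python
def witness_vote(voting_nodes: int, witness_online: bool) -> int:
    """The witness votes only when it is online and the node votes are even."""
    if not witness_online:
        return 0
    return 1 if voting_nodes % 2 == 0 else 0


def total_votes(voting_nodes: int, witness_online: bool) -> int:
    """Node votes plus the (possibly zero) witness vote."""
    return voting_nodes + witness_vote(voting_nodes, witness_online)


# With a healthy witness, the vote total comes out odd for any node count.
for n in range(1, 7):
    print(n, total_votes(n, witness_online=True))
```

Under this rule a healthy witness always leaves the cluster with an odd number of votes, which is why you no longer have to count nodes before deciding to configure one.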

Guest Clustering Without Limits

Microsoft has a distinct advantage over VMware when it comes to guest clustering.  With Hyper-V and with virtual servers running Windows Server 2012 or 2012 R2, you can create clusters of virtual machines that use iSCSI, Fibre Channel, and even .VHDX files (in R2) for their shared storage, with those files located on either a Cluster Shared Volume (CSV) or just a server file share (SMB share, i.e. file-based storage).

So here are a couple of the new, flexible choices you have for guest clustered VM shared storage in Windows Server 2012 R2…

Flexible choices for placement of Shared VHDX

Try doing that on NFS. 

While we’re on the subject of scale…

Does Size Matter?

VMware requires Essentials Plus or better for HA, and unless something else changed in vSphere 5.5 that they haven't yet said much about, I do believe they still can only support up to 4000 VMs in a 32 node cluster.  (Correct me in the comments and point me to documentation that proves me wrong, please.  I sincerely thought they would up their game here.) 

You can cluster up to 8,000 virtual machines in up to a 64 node cluster with Windows Server 2012 and Windows Failover Clustering.  And you can do it for no additional cost.

---

“Holy robust high availability, Batman!”

I’m glad you like it.  But if not, or if you have any questions, let me know in the comments.

And for more details on what’s newer than what VMware would have you believe in the world of robust high-availability, check out these two TechNet documents:

What's New in Failover Clustering in Windows Server 2012

What's New in Failover Clustering in Windows Server 2012 R2


More Stories By Kevin Remde

Kevin is an engaging and highly sought-after speaker and webcaster who has landed several times on Microsoft's top 10 webcast list, and has delivered many top-scoring TechNet events and webcasts. In his past outside of Microsoft, Kevin has held positions such as software engineer, information systems professional, and information systems manager. He loves sharing helpful new solutions and technologies with his IT professional peers.

A prolific blogger, Kevin shares his thoughts, ideas and tips on his “Full of I.T.” blog (http://aka.ms/FullOfIT). He also contributes to and moderates the TechNet Forum IT Manager discussion (http://aka.ms/ITManager), and presents live TechNet Events throughout the central U.S. (http://www.technetevents.com). When he's not busy learning or blogging about new technologies, Kevin enjoys digital photography and videography, and sings in a band. (Q: Midlife crisis? A: More cowbell!) He continues to challenge his TechNet Event audiences to sing Karaoke with him.
