Protecting SaaS Revenue Through SLA Monitoring

One of the biggest nightmares for any service provider is to find themselves in SLA hell due to poor performance. An issue that negatively impacts the end user experience is inevitably going to affect a company's business metrics, and when that happens, they're going to be looking for someone to blame and, more importantly, someone to compensate them for that lost revenue.

The reasoning behind having comprehensive SLAs in place is not a difficult concept to grasp. Protection of one’s brand image and revenue stream(s) is obviously of paramount importance. Yet as the landscape of digital architecture grows more and more complex, companies are forced to outsource more functionality to third-party vendors, which in turn creates additional places where performance can go bad.

An SLA is designed to mitigate the risk of that outsourcing by holding vendors financially accountable, through objective SLA monitoring, grading, and governance, for any performance degradation that affects end users. According to the 2017 State of SaaS report conducted by TechTarget, over 25 percent of respondents acknowledged that they had incurred financial penalties for failing to meet their SLAs, with the average amount in penalties rising above $350K.
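
To make that concrete, here is a minimal sketch of how an SLA service credit might be computed. The uptime tiers, credit percentages, and dollar figures below are invented for illustration; every real contract defines its own thresholds.

```python
# Hypothetical illustration of SLA service-credit arithmetic.
# The tiers below are made up for this example, not from any real contract.

def service_credit(measured_uptime_pct: float, monthly_fee: float) -> float:
    """Return the credit owed for a month, given measured uptime."""
    credit_tiers = [          # (minimum uptime %, credit as % of fee)
        (99.9, 0.0),          # SLA met: no credit
        (99.0, 10.0),
        (95.0, 25.0),
        (0.0, 100.0),         # severe breach: full refund
    ]
    for min_uptime, credit_pct in credit_tiers:
        if measured_uptime_pct >= min_uptime:
            return monthly_fee * credit_pct / 100.0
    return monthly_fee

# A 30-day month has 43,200 minutes; 90 minutes of downtime is ~99.79%
# uptime, which under these made-up tiers triggers a 10% credit.
uptime = 100.0 * (43_200 - 90) / 43_200
print(f"uptime={uptime:.2f}%  credit=${service_credit(uptime, 20_000):,.2f}")
```

Multiply a shortfall like that across hundreds of customers and the $350K average penalty figure above stops looking surprising.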

With that much money on the line, the simple truth is that vendors cannot afford to be the cause of their customers’ poor performance.

To make matters worse, more than 10 percent of the respondents admitted that service disruptions led to the loss of a customer, illustrating how much poor performance can erode the trust that’s necessary for a customer-vendor relationship. No business can afford to allow their brand to be harmed by poor customer experiences, so having strict SLAs in place along with diligent SLA monitoring practices becomes an absolute necessity.

The latter part of that strategy – diligent SLA monitoring practices – is dependent upon having a powerful synthetic monitoring solution in place that can replicate the end user experience while measuring from both backbone and last mile locations. The backbone tests, which eliminate noise that is out of the vendor’s control (e.g. local ISP or user hardware issues), are the most valuable for SLA monitoring and validation, while last mile and real user measurements provide additional context by showing the actual end-user experience.
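
As a rough illustration of what such a synthetic test does under the hood, here is a minimal availability-and-timing check using only the Python standard library. The URL is a placeholder; a commercial platform would measure DNS, connect, TLS, and first-byte phases separately, from many backbone and last-mile vantage points, rather than the coarse totals captured here.

```python
# A minimal sketch of a synthetic availability/response-time check.
# Real synthetic monitoring captures far finer-grained timing phases.
import time
import urllib.request

def synthetic_check(url: str, timeout: float = 10.0) -> dict:
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read(1)                            # time to first byte
            ttfb = time.perf_counter() - start
            resp.read()                             # drain the rest of the body
            total = time.perf_counter() - start
            return {"url": url, "status": resp.status, "ok": resp.status < 400,
                    "ttfb_s": round(ttfb, 3), "total_s": round(total, 3)}
    except Exception as exc:                        # DNS failure, timeout, etc.
        return {"url": url, "ok": False, "error": str(exc)}

print(synthetic_check("https://www.example.com/"))  # placeholder URL
```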

A Two-Pronged Approach to Monitoring

SaaS vendors themselves must also have end user experience monitoring strategies in place, with a two-pronged approach: one prong ensures the health of their digital supply chain, while the other validates their SLA requirements by proving that they are not the cause of any disruptions in their clients' customer experiences. These two complementary goals ultimately serve the underlying purpose of SLA monitoring: minimizing the penalties that a vendor must pay their customers.

This is the approach taken by Zscaler, the world's largest cloud security platform, which helps some of the biggest companies and government agencies around the world securely transform their networks and applications. Given its service offering, Zscaler's security applications must sit in the path between end users and whatever application they're using (e.g. video conferencing software, banking software, etc.). This means that should Zscaler's own digital supply chain suffer a service disruption, it will likely cause a negative digital experience for the end user as well.

The Need for Synthetic SLA Monitoring

The prevalence of both first- and third-party services within everyone's digital supply chain emphasizes the need for a complete outside-in view of the end user experience; viewing solely from within one's own network is incomplete, and relying only on real user monitoring will still leave gaps in visibility when trying to determine the root cause of an issue (i.e. who ultimately bears responsibility for the disruption).

By being able to synthetically test every step of the digital supply chain, a SaaS vendor such as Zscaler is able to spot potential performance degradations before they have an impact on the end user experience, and then drill down into the analytics to pinpoint the root cause of the issue and work toward a solution. This aspect of SLA monitoring is crucial, as it allows Zscaler to head off any problems before they trigger an SLA breach. After all, the best way to avoid paying penalties on your performance is to always have great performance.
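
One common pattern for heading off problems, sketched below with assumed numbers, is to alert on an internal threshold set well inside the contractual one, so engineers get paged while there is still SLA headroom.

```python
# Sketch of "alert before you breach": fire an internal warning at a
# threshold tighter than the contractual SLA. All numbers are illustrative.
from statistics import median

SLA_RESPONSE_MS = 2000      # hypothetical contractual limit
ALERT_RESPONSE_MS = 1200    # internal early-warning threshold

def evaluate(samples_ms: list) -> str:
    p50 = median(samples_ms)
    if p50 > SLA_RESPONSE_MS:
        return f"BREACH: median {p50:.0f} ms exceeds the SLA"
    if p50 > ALERT_RESPONSE_MS:
        return f"WARN: median {p50:.0f} ms is eating into SLA headroom"
    return f"OK: median {p50:.0f} ms"

print(evaluate([850, 910, 1350, 1400, 1525]))   # -> WARN before any breach
```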

There are a number of different ways that Zscaler obtains the real-time, actionable insights that allow them to detect and fix issues as quickly as possible. One crucial aspect is testing from as close as possible to the physical location of the end user(s).

[Image: map of Zscaler monitoring node locations]

Many performance degradations are localized in specific geographies due to problems with single servers or datacenters, or peering issues with local networks and ISPs. When that’s the case, a performance test run from a different country or on a different ISP isn’t going to give you data that you can act on. Therefore, a testing infrastructure that provides a wide array of locations, ISPs, and cloud networks is vital to ensuring the end user experience.
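
One illustrative way to surface such localized problems, assuming measurements tagged by location, is to compare each location's median response time against the global median; the data and the 3x ratio below are fabricated for the sketch.

```python
# Sketch: flag locations whose response times diverge sharply from the
# global median, which typically points to a localized problem (a bad
# server, datacenter, or peering path) rather than a systemic one.
from collections import defaultdict
from statistics import median

# (location, response_ms) pairs as a synthetic monitor might report them
samples = [("ams", 310), ("ams", 295), ("nyc", 280), ("nyc", 305),
           ("sgp", 1480), ("sgp", 1620), ("lon", 290), ("lon", 315)]

by_location = defaultdict(list)
for loc, ms in samples:
    by_location[loc].append(ms)

global_median = median(ms for _, ms in samples)
for loc, values in by_location.items():
    loc_median = median(values)
    if loc_median > 3 * global_median:      # arbitrary illustrative ratio
        print(f"{loc}: median {loc_median:.0f} ms vs global "
              f"{global_median:.0f} ms -> investigate")
```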

Another important aspect of diagnosing and fixing performance issues is having access to a wide range of test types and metrics. Once a performance alert goes off, an IT Ops engineer or SRE must drill deeper into the data to pinpoint the root cause, often by running different test types depending on the nature of the issue; for example, when an API fails, an API-specific test is in order, while pinpointing a network peering issue requires a traceroute test.
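
Here is a rough sketch of those two drill-down test types. The API endpoint is a placeholder, and the traceroute simply shells out to the system utility (called "tracert" on Windows), so the output format varies by platform.

```python
# Sketch: an API-specific check plus a traceroute for network drill-down.
import json
import platform
import subprocess
import urllib.request

def api_check(url: str) -> bool:
    """Fail if the endpoint is down or returns something that isn't JSON."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            json.loads(resp.read())
            return resp.status == 200
    except Exception:
        return False

def traceroute(host: str) -> str:
    cmd = ["tracert", host] if platform.system() == "Windows" else ["traceroute", host]
    return subprocess.run(cmd, capture_output=True, text=True, timeout=120).stdout

if not api_check("https://api.example.com/health"):    # placeholder URL
    print(traceroute("api.example.com"))                # inspect the network path
```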

[Image: Zscaler traceroute test results]

However, effective SLA monitoring is about more than just ensuring that your own services are performing up to standards – it’s also about proving that you’re not responsible for other people’s failures.

SLA Monitoring Through Validation

Anyone who grew up with at least one sibling knows the value of passing the buck when something breaks. You know your little brother was the one who broke that lamp, but of course he doesn’t want to be punished, so he’s going to go out of his way to push the blame onto you. And unless you can prove it, it’s your word against his.

The same principle applies to business and digital performance, albeit with consequences much more severe than an early bedtime. When a company suffers a performance issue that results in loss of revenue and/or brand prestige, they’re naturally going to look for the culprit that’s responsible and tie it to an SLA breach in order to recoup some of that money. They’re going to be armed with data in these attempts, so vendors must be equally armed as well through their own SLA monitoring efforts. The name of the game, as it was when you were a kid, is to prove that it wasn’t your fault.

Once again, the answer lies with deployment of a thorough synthetic monitoring solution that can clearly and definitively articulate the root cause(s) of any performance problems during the post-mortem analysis.

When a vendor such as Zscaler is tasked with proving that they were not the source of a performance problem, one of the most important aspects is to be able to do so through data and charts that are easy to share and understand. Remember that these analyses and the business decisions that result are often being performed by people who don’t have the technical proficiency of a network engineer or SRE, so clear and obvious visual evidence is crucial.

[Image: Zscaler waterfall chart]

Another helpful tactic for SLA monitoring is the ability to isolate first- and third-party content, and to identify exactly who is responsible for the performance of each of those third parties. For example, if a social sharing tag causes excessive delays in the loading of a web page, your synthetic monitoring solution should be able to pinpoint exactly what the tag is, who hosts it, and how much of a delay it caused.
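
As a simplified illustration, assuming a list of resource URLs and load times like those behind a waterfall chart, that attribution can be as straightforward as grouping by hostname. The domains and timings below are fabricated.

```python
# Sketch: attribute page resources to first- vs third-party hosts and
# surface the slowest third-party tag. Input mimics a simplified waterfall.
from urllib.parse import urlparse

FIRST_PARTY = {"www.example.com", "static.example.com"}   # assumed own hosts

resources = [   # (url, load_time_ms) -- fabricated sample data
    ("https://www.example.com/index.html", 180),
    ("https://static.example.com/app.js", 240),
    ("https://cdn.social-widgets.example.net/share.js", 2100),
    ("https://fonts.example.org/font.woff2", 160),
]

third_party = [(urlparse(url).hostname, url, ms)
               for url, ms in resources
               if urlparse(url).hostname not in FIRST_PARTY]

host, url, ms = max(third_party, key=lambda r: r[2])
print(f"Slowest third-party resource: {url}")
print(f"Hosted by {host}, adding {ms} ms to the page load")
```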

[Image: Zscaler zones breakdown of first- and third-party content]

Finally, the ability to filter out extraneous noise through synthetic tests is vital to ensure accurate SLA monitoring. The simple fact is that some performance degradations are out of our hands; they can be caused by a weak home WiFi network, a damaged ISP cable, or something as simple as inclement weather that disrupts a mobile network. Here again, we see the importance of a synthetic “clean-room environment” that just looks at the customer-critical elements in the digital supply chain.
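
One simple way to approximate that clean-room filtering, sketched here with invented numbers, is to discard statistical outliers before computing SLA figures. The median-absolute-deviation cutoff of 3.5 is a common rule of thumb, not a standard.

```python
# Sketch: drop outliers (e.g., one user's flaky WiFi) before computing
# SLA numbers, using a simple median-absolute-deviation filter.
from statistics import median

def filter_noise(samples_ms: list, cutoff: float = 3.5) -> list:
    med = median(samples_ms)
    mad = median(abs(x - med) for x in samples_ms) or 1.0   # avoid div-by-zero
    return [x for x in samples_ms if abs(x - med) / mad <= cutoff]

raw = [310, 295, 305, 8200, 290, 315, 9400]     # two last-mile blowups
clean = filter_noise(raw)
print(f"kept {len(clean)} of {len(raw)} samples; median {median(clean):.0f} ms")
```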

Don’t Get Blamed for Someone Else’s Mistake

The ultimate goal behind any vendor's SLA monitoring strategy is to minimize the penalties that you have to pay to your clients. With a strong synthetic monitoring platform in place, you should be able to catch issues as soon as they arise and fix them quickly, as well as demonstrate the root cause of issues that lie beyond your control and for which you are therefore not responsible. This two-pronged approach to SLA monitoring will save your company money in both the short and long term, and protect your brand's prestige at the same time.

The post Protecting SaaS Revenue Through SLA Monitoring appeared first on Catchpoint's Blog - Web Performance Monitoring.


More Stories By Mehdi Daoudi

Catchpoint radically transforms the way businesses manage, monitor, and test the performance of online applications. Truly understand and improve user experience with clear visibility into complex, distributed online systems.

Founded in 2008 by four DoubleClick / Google executives with a passion for speed, reliability and overall better online experiences, Catchpoint has now become the most innovative provider of web performance testing and monitoring solutions. We are a team with expertise in designing, building, operating, scaling and monitoring highly transactional Internet services used by thousands of companies and impacting the experience of millions of users. Catchpoint is funded by top-tier venture capital firm, Battery Ventures, which has invested in category leaders such as Akamai, Omniture (Adobe Systems), Optimizely, Tealium, BazaarVoice, Marketo and many more.
