SYS-CON MEDIA Authors: Pat Romanski, Yeshim Deniz, Elizabeth White, Liz McMillan, William Schmarzo

Blog Feed Post

What Ops Needs to Know about APIs and Compression

I’ve been reading up on APIs cause, coolness. And in particular I really enjoyed reading Best Practices for Designing a Pragmatic RESTful API because it had a lot of really good information and advice.

And then I got to the part about compressing your APIs.

Before we go too far let me first say I’m not saying you shouldn’t compress your API or app responses. You probably should. What I am saying is that where you compress data and when are important considerations.

That’s because generally speaking no one has put their web server (which is ultimately what tends to serve up responses, whether they’re APIs or objects, XML or JSON) at the edge of the Internet. You know, where it’s completely vulnerable. It’s usually several devices back in the networking gauntlet that has be run before data gets from the edge of your network to the server.

wrong architectureThis is because there are myriad bad actors out salivating at the prospect of a return to an early aughts data center architecture in which firewalls, DDoS protection, and other app security services were not physically and logically located upstream from the apps they protect today.

Cause if you don’t have to navigate the network, it’s way easier to launch an attack on an app.

Today, we employ an average of 11 different services in the network, upstream from the app, to provide security, scale, and performance-enhancing services. Like compression. better architecture 

Now, you can enable compression on the web server. It’s a standard thing in HTTP and it’s little more than a bit to flip in the configuration. Easy peasy performance-enhancing change, right?

Except that today that’s not always true.

The primary reason compression improves performance is because when it reduces the size of data it reduces the number of packets that must be transmitted. That reduces the potential for congestion that causes a Catch-22 where TCP retransmits increase congestion that increases packet loss that increases… well, you get the picture. This is particularly true when mobile clients are connecting via cellular networks, because latency is a real issue for them and the more round trips it takes, the worse the application experience.

Suffice to say that the primary reason compression improves performance is that it reduces the amount of data needing to be transmitted which means “faster” delivery to the client. Fewer packets = less time = happier users.

That’s a good thing. Except when compression gets in the way or doesn’t provide any real reduction that would improve performance.

What? How can that be, you ask.

Remember that we’re looking for compression to reduce the number of packets transmitted, especially when it has to traverse a higher latency, lower capacity link between the data center and the client.

It turns out that sometimes compression doesn’t really help with that.

Consider the aforementioned article and its section on compressing. The author ran some tests, and concluded that compression of text-based data produces some really awesome results:

Let's look at this with a real world example. I've pulled some data from GitHub's API, which uses pretty print by default. I'll also be doing some gzip comparisons:


$ curl https://api.github.com/users/veesahni > with-whitespace.txt
$ ruby -r json -e 'puts JSON JSON.parse(STDIN.read)' < with-whitespace.txt > without-whitespace.txt
$ gzip -c with-whitespace.txt > with-whitespace.txt.gz
$ gzip -c without-whitespace.txt > without-whitespace.txt.gz

The output files have the following sizes:

  • without-whitespace.txt - 1252 bytes
  • with-whitespace.txt - 1369 bytes
  • without-whitespace.txt.gz - 496 bytes
  • with-whitespace.txt.gz - 509 bytes

In this example, the whitespace increased the output size by 8.5% when gzip is not in play and 2.6% when gzip is in play. On the other hand, the act of gzipping in itself provided over 60% in bandwidth savings. Since the cost of pretty printing is relatively small, it's best to pretty print by default and ensure gzip compression is supported!

To further hammer in this point, Twitter found that there was an 80% savings (in some cases)when enabling gzip compression on their Streaming API. Stack Exchange went as far as to never return a response that's not compressed!

 

Wow! I mean, from a purely mathematical perspective, that’s some awesome results. And the author is correct in saying it will provide bandwidth savings.

What those results won’t necessarily do is improve performance because the original size of the file was already less than the MSS for a single packet. Which means compressed or not, that data takes exactly one packet to transmit. That’s it. I won’t bore you with the mathematics, but the speed of light and networking says one packet takes the same amount of time to transit whether it’s got 496 bytes of payload or 1396 bytes of payload. The typical MSS for Ethernet packets is 1460 bytes, which means compressing something smaller than that effectively nets you nothing in terms of performance. It’s like a plane. It takes as long to fly from point A to point B whether there are 14 passengers or 140. Fuel efficiency (bandwidth) is impacted, but that doesn’t really change performance, just the cost.

Furthermore, compressing the payload at the web server means that web app security services upstream have to decompress if they want to do their job, which is to say scan responses for sensitive or excessive data indicative of a breach of security policies. This is a big deal, kids. 42% of respondents in our annual State of Application Delivery survey always scan responses as part of their overall attack-surfaces-soad-2016security strategy to prevent data leaks. Which means they have to spend extra time to decompress the data to evaluate it and then recompress it, or perhaps they can’t inspect it at all.

Now, that said, bandwidth savings are a good thing. It’s part of any comprehensive scaling strategy to consider the impact of increasing use of an app on bandwidth. And a clogged up network can impact performance negatively so compression is a good idea. But not necessarily at the web server. This is akin to carefully considering where you enforce SSL/TLS security measures, as there are similar impacts on security services upstream from the app / web server.

That’s why the right place for compression and SSL/TLS is generally upstream, in the network, after security has checked out the response and it’s actually ready to be delivered to the client. That’s usually the load balancing service or the ADC, where compression can not only be applied most efficiently and without interfering with security services and offsetting the potential gains by forcing extra processing upstream.

As with rate limiting APIs, it’s not always a matter of whether or not you should, it’s a matter of where you should.

Architecture, not algorithms, are the key to scale and performance of modern applications. 

Read the original blog entry...

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

Latest Stories
Despite being the market leader, we recognized the need to transform and reinvent our business at Dynatrace, before someone else disrupted the market. Over the course of three years, we changed everything - our technology, our culture and our brand image. In this session we'll discuss how we navigated through our own innovator's dilemma, and share takeaways from our experience that you can apply to your own organization.
DXWorldEXPO LLC announced today that Nutanix has been named "Platinum Sponsor" of CloudEXPO | DevOpsSUMMIT | DXWorldEXPO New York, which will take place November 12-13, 2018 in New York City. Nutanix makes infrastructure invisible, elevating IT to focus on the applications and services that power their business. The Nutanix Enterprise Cloud Platform blends web-scale engineering and consumer-grade design to natively converge server, storage, virtualization and networking into a resilient, softwar...
Founded in 2002 and headquartered in Chicago, Nexum® takes a comprehensive approach to security. Nexum approaches business with one simple statement: “Do what’s right for the customer and success will follow.” Nexum helps you mitigate risks, protect your data, increase business continuity and meet your unique business objectives by: Detecting and preventing network threats, intrusions and disruptions Equipping you with the information, tools, training and resources you need to effectively m...
Having been in the web hosting industry since 2002, dhosting has gained a great deal of experience while working on a wide range of projects. This experience has enabled the company to develop our amazing new product, which they are now excited to present! Among dHosting's greatest achievements, they can include the development of their own hosting panel, the building of their fully redundant server system, and the creation of dhHosting's unique product, Dynamic Edge.
The Transparent Cloud-computing Consortium (T-Cloud) is a neutral organization for researching new computing models and business opportunities in IoT era. In his session, Ikuo Nakagawa, Co-Founder and Board Member at Transparent Cloud Computing Consortium, will introduce the big change toward the "connected-economy" in the digital age. He'll introduce and describe some leading-edge business cases from his original points of view, and discuss models & strategies in the connected-economy. Nowad...
"DevOps is set to be one of the most profound disruptions to hit IT in decades," said Andi Mann. "It is a natural extension of cloud computing, and I have seen both firsthand and in independent research the fantastic results DevOps delivers. So I am excited to help the great team at @DevOpsSUMMIT and CloudEXPO tell the world how they can leverage this emerging disruptive trend."
For far too long technology teams have lived in siloes. Not only physical siloes, but cultural siloes pushed by competing objectives. This includes informational siloes where business users require one set of data and tech teams require different data. DevOps intends to bridge these gaps to make tech driven operations more aligned and efficient.
NanoVMs is the only production ready unikernel infrastructure solution on the market today. Unikernels prevent server intrusions by isolating applications to one virtual machine with no users, no shells and no way to run other programs on them. Unikernels run faster and are lighter than even docker containers.
CloudEXPO | DevOpsSUMMIT | DXWorldEXPO Silicon Valley 2019 will cover all of these tools, with the most comprehensive program and with 222 rockstar speakers throughout our industry presenting 22 Keynotes and General Sessions, 250 Breakout Sessions along 10 Tracks, as well as our signature Power Panels. Our Expo Floor will bring together the leading global 200 companies throughout the world of Cloud Computing, DevOps, IoT, Smart Cities, FinTech, Digital Transformation, and all they entail. As ...
Darktrace is the world's leading AI company for cyber security. Created by mathematicians from the University of Cambridge, Darktrace's Enterprise Immune System is the first non-consumer application of machine learning to work at scale, across all network types, from physical, virtualized, and cloud, through to IoT and industrial control systems. Installed as a self-configuring cyber defense platform, Darktrace continuously learns what is ‘normal' for all devices and users, updating its understa...
Digital Transformation (DX) is a major focus with the introduction of DXWorldEXPO within the program. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term. A total of 88% of Fortune 500 companies from a generation ago are now out of business. Only 12% still survive. Similar percentages are found throughout enterprises of all sizes. We are offering early bird savings...
SUSE is a German-based, multinational, open-source software company that develops and sells Linux products to business customers. Founded in 1992, it was the first company to market Linux for the enterprise. Founded in 1992, SUSE is the world’s first provider of an Enterprise Linux distribution. Today, thousands of businesses worldwide rely on SUSE for their mission-critical computing and IT management needs.
The dream is universal: heuristic driven, global business operations without interruption so that nobody has to wake up at 4am to solve a problem. Building upon Nutanix Acropolis software defined storage, virtualization, and networking platform, Mark will demonstrate business lifecycle automation with freedom of choice and consumption models. Hybrid cloud applications and operations are controllable by the Nutanix Prism control plane with Calm automation, which can weave together the following: ...
Crosscode Panoptics Automated Enterprise Architecture Software. Application Discovery and Dependency Mapping. Automatically generate a powerful enterprise-wide map of your organization's IT assets down to the code level. Enterprise Impact Assessment. Automatically analyze the impact, to every asset in the enterprise down to the code level. Automated IT Governance Software. Create rules and alerts based on code level insights, including security issues, to automate governance. Enterpr...
Your job is mostly boring. Many of the IT operations tasks you perform on a day-to-day basis are repetitive and dull. Utilizing automation can improve your work life, automating away the drudgery and embracing the passion for technology that got you started in the first place. In this presentation, I'll talk about what automation is, and how to approach implementing it in the context of IT Operations. Ned will discuss keys to success in the long term and include practical real-world examples. Ge...