

Microservices Monitoring and Critical Incident Management: How Dynatrace and VictorOps Work Together

Wolfgang Beer, Technical Product Manager at Dynatrace, co-wrote this article.

Microservices can be game-changing if, as Martin Fowler says and Adam Drake explains, you have rapid provisioning, basic monitoring, and rapid deployment already in place. And when microservices meet containers, they can boost software engineering power to a whole new level. Together, they form architectures that act like living, breathing entities and are much more adaptable than in the past.

But an ensemble of microservices is far more complex to understand, let alone troubleshoot, when it comes to performance. Often hosted in modern cloud platforms such as AWS, Azure, or OpenStack, microservices are dynamically started and scaled depending on actual demands and traffic. As useful as this process is, managing availability, detecting errors, and identifying performance problems become especially demanding for DevOps teams.

These rapidly changing environments and dynamically scaling services mean that the right responders must be notified especially fast when things go wrong. We also need to separate the critical, actionable alerts from a firehose of noise.

Fortunately, Dynatrace and VictorOps have a few ideas for how to achieve this goal and give your DevOps teams some relief.

Dynatrace: full-stack monitoring with Artificial Intelligence

First, you need the right notifications. Dynatrace automatically detects all of those microservice dynamic infrastructure changes and learns how the entire service environment normally behaves. The system catches each individual transaction, from your application user action to your backend services and databases.

Then Dynatrace puts all that topological and transactional data into context and uses AI algorithms and analytics to detect the root cause of complex incidents. What is interrelated? Which measurements sit within the baseline, and which are anomalies that warrant alarms? Without that deep transactional and code-level visibility, it would be impossible for DevOps teams to pinpoint what’s causing errors, slowdowns, or even outages.
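To make the baseline-versus-anomaly idea concrete, here is a minimal sketch in Python. It is not Dynatrace's algorithm, which the article does not describe. It assumes a simple rule where a measurement counts as anomalous if it is more than `k` standard deviations from the mean of recent history. The function name `is_anomaly`, the threshold `k`, and the sample CPU values are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomaly(history: Sequence[float], value: float, k: float = 3.0) -> bool:
    """Return True if `value` deviates from the baseline of `history`.

    Assumption for illustration: the baseline is the mean of recent
    measurements, and "anomalous" means more than k sample standard
    deviations away from it.
    """
    if len(history) < 2:
        # Not enough data to estimate a baseline, so never alarm.
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return value != baseline
    return abs(value - baseline) > k * spread


# Hypothetical CPU utilisation samples (percent) for one host.
cpu_history = [40, 42, 41, 39, 43, 40, 41]

spike_detected = is_anomaly(cpu_history, 95)   # far above the usual range
normal_detected = is_anomaly(cpu_history, 42)  # within the usual range
```

A real monitoring system would also have to account for seasonality, traffic patterns, and dependencies between services, which a fixed threshold like this ignores.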

The screenshot below shows how Dynatrace automatically identifies a CPU spike as the root cause of web application slowdowns. The problem details card also shows the business impact of the detected problem: how many real users were using your web application at the time of the problem, and how many service calls into the backend were affected.

[Screenshot: Dynatrace problem details card]

The attached ‘Visual resolution path’ shows the topological dependencies that were discovered while following the problem impacts.

Even though Dynatrace delivers such in-depth automated analysis of your environment, it’s mission-critical to receive problem notifications through a reliable channel such as VictorOps.

Integrating Dynatrace with VictorOps adds more intelligence

Next, it’s time to add intelligent categorization, routing, and remediation instructions to the incoming notifications. Enter VictorOps. Whereas Dynatrace detects problems in real-time, VictorOps gives you the tools to create flexible on-call schedules and add intelligence to the incident lifecycle.

By integrating Dynatrace with VictorOps, you can now apply logic to help the right alerts get to the right people. Via the Incident Automation Engine, you can set up VictorOps to do things like:

  • Indicate the level of severity of each incoming notification, so you’re only alerted when something is critically wrong, separating the signal from the noise
  • Route the specific alert to the right responder so the expert closest to the problem can solve it faster
  • Deliver remediation steps alongside alerts, to assist with resolution
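The three capabilities above can be sketched as a small rule table. This is an illustration only, written in plain Python, and it does not use the VictorOps Incident Automation Engine's actual configuration or API. The `Alert` and `Rule` types, the team names, the matching conditions, and the remediation text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Alert:
    source: str                      # monitoring tool that raised the alert
    entity: str                      # affected service or host
    message: str
    severity: str = "INFO"           # default until a rule says otherwise
    route_to: Optional[str] = None   # team that should respond
    runbook: Optional[str] = None    # remediation steps sent with the alert


@dataclass
class Rule:
    matches: Callable[[Alert], bool]
    severity: str
    route_to: str
    runbook: str


def apply_rules(alert: Alert, rules: List[Rule]) -> Alert:
    """Annotate the alert using the first matching rule, if any."""
    for rule in rules:
        if rule.matches(alert):
            alert.severity = rule.severity
            alert.route_to = rule.route_to
            alert.runbook = rule.runbook
            break
    return alert


def should_page(alert: Alert) -> bool:
    """Only critical alerts interrupt the on-call responder."""
    return alert.severity == "CRITICAL"


# Hypothetical rules: set severity, pick a responder, attach remediation steps.
RULES = [
    Rule(
        matches=lambda a: "cpu" in a.message.lower(),
        severity="CRITICAL",
        route_to="infrastructure-team",
        runbook="Check for runaway processes on the affected host, then scale out.",
    ),
    Rule(
        matches=lambda a: "response time" in a.message.lower(),
        severity="WARNING",
        route_to="web-team",
        runbook="Compare against the last deployment and review slow requests.",
    ),
]

alert = apply_rules(Alert("dynatrace", "web-frontend", "CPU saturation detected"), RULES)
```

In this sketch the first matching rule wins, so rule order expresses priority. An alert that matches no rule keeps the default `INFO` severity and pages no one.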

Together, Dynatrace and VictorOps speed time to resolution. The intelligence built into each system alleviates some of the stress, false alarms, and frequent burnout that DevOps and on-call teams experience.

Anonymous Dynatrace customers say this

“We have been using Dynatrace for over 5 years, and find it an indispensable tool during pre-release functional testing, pre-release load testing, and especially post-production troubleshooting of severity one issues. With a breadth of distributed platforms for key application environments, Dynatrace gives us near-real-time (within a matter of seconds) analysis of end-to-end transactions that are spread across multiple servers and multiple layers of the stack…”
(Source: Gartner Peer Insights)

“Dynatrace has been spectacular to work with. Technology-wise, we use it primarily for root-cause analysis and performance management from an infrastructure perspective, as opposed to APM. But we’re beginning to use it for more comprehensive APM now, and it’s proving very helpful. Relationship-wise, the Dynatrace team is one of the best I’ve worked with in my 20 years in IT. They view their customer relationship as a true partnership.” – IT Architect
(Source: Gartner Peer Insights)

Bring more intelligence to microservices monitoring

Does this sound good to you? If you’re curious, take Dynatrace for a free 15-day test drive. See VictorOps in action. And if you already use both systems, follow these steps to install the VictorOps/Dynatrace integration. Then please give us feedback on your experience.

The post Microservices Monitoring and Critical Incident Management: How Dynatrace and VictorOps Work Together appeared first on VictorOps.


More Stories By VictorOps Blog

VictorOps is making on-call suck less with the only collaborative alert management platform on the market.

With easy on-call scheduling management, a real-time incident timeline that gives you contextual relevance around your alerts and powerful reporting features that make post-mortems more effective, VictorOps helps your IT/DevOps team solve problems faster.
