

Microservices Monitoring and Critical Incident Management: How Dynatrace and VictorOps Work Together

Wolfgang Beer, Technical Product Manager at Dynatrace, co-wrote this article.

Microservices can be game-changing if, as Martin Fowler says and Adam Drake explains, you have rapid provisioning, basic monitoring, and rapid deployment already in place. And when microservices meet containers, they can boost software engineering power to a whole new level. Together, they form architectures that act like living, breathing entities and are much more adaptable than in the past.

But an ensemble of microservices is far more complex to understand, let alone troubleshoot, when it comes to performance. Often hosted in modern cloud platforms such as AWS, Azure, or OpenStack, microservices are dynamically started and scaled depending on actual demands and traffic. As useful as this process is, managing availability, detecting errors, and identifying performance problems become especially demanding for DevOps teams.

These rapidly changing environments and dynamically scaling services mean that the right responders must be notified especially fast when things go wrong. And we need to separate the critical, actionable alerts from the noise, rather than pointing a firehose of notifications at the on-call team.

Fortunately, Dynatrace and VictorOps have a few ideas for how to achieve this goal and give your DevOps teams some relief.

Dynatrace: full-stack monitoring with Artificial Intelligence

First, you need the right notifications. Dynatrace automatically detects all of those dynamic changes in your microservice infrastructure and learns how the entire service environment normally behaves. The system captures each individual transaction, from the user action in your application to your backend services and databases.

Then Dynatrace puts all that topological and transactional data into context and uses AI algorithms and analytics to detect the root cause of complex incidents. What is interrelated? Which readings are baseline behavior, and which are anomalies that warrant alarms? Without that deep transactional and code-level visibility, it would be impossible for DevOps teams to pinpoint what’s causing errors, slowdowns, or even outages.
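To make the baseline-versus-anomaly idea concrete, here is a minimal sketch in Python. It is not Dynatrace's algorithm, which the article does not describe. It assumes a simple rolling z-score check, and the function name `find_anomalies`, the window size, the threshold, and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return the indices of samples that deviate from a rolling baseline.

    Illustrative only: the baseline for each sample is the mean of the
    previous `window` samples, and a sample is flagged when it lies more
    than `threshold` standard deviations away from that mean.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds, with one obvious spike.
response_times_ms = [102, 98, 101, 99, 100, 103, 97, 450, 101, 99]
print(find_anomalies(response_times_ms))  # prints [7]
```

Normal jitter around 100 ms stays below the threshold, so only the 450 ms spike is flagged. A production system would also need to account for seasonality and for the way one spike distorts the baseline of the samples that follow it.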

The screenshot below shows how Dynatrace automatically identifies a CPU spike as the root cause of web application slowdowns. The problem details card also shows the business impact of the detected problem: how many real users were using your web application at the time of the problem, and how many service calls into the backend were affected.

[Screenshot: Dynatrace problem details card]

The attached ‘Visual resolution path’ shows the topological dependencies that were discovered while following the problem impacts.

Even with such in-depth automated analysis of your environment, it’s still mission-critical to receive problem notifications through a reliable channel such as VictorOps.

Integrating Dynatrace with VictorOps adds more intelligence

Next, it’s time to add intelligent categorization, routing, and remediation instructions to the incoming notifications. Enter VictorOps. Whereas Dynatrace detects problems in real time, VictorOps gives you the tools to create flexible on-call schedules and add intelligence to the incident lifecycle.

By integrating Dynatrace with VictorOps, you can now apply logic to help the right alerts get to the right people. Via the Incident Automation Engine, you can set up VictorOps to do things like:

  • Indicate the level of severity of each incoming notification, so you’re only alerted when something is critically wrong, separating the signal from the noise
  • Route the specific alert to the right responder so the expert closest to the problem can solve it faster
  • Deliver remediation steps alongside alerts, to assist with resolution
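The three rules above can be sketched as a small triage function. This is an illustration of the idea only, not the VictorOps Incident Automation Engine or its API: the `Alert` and `Page` types, the `triage` function, the team names, and the runbook URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    service: str
    severity: str  # assumed values: "info", "warning", or "critical"
    message: str


@dataclass
class Page:
    team: str
    message: str
    runbook: str


# Hypothetical routing table: service name -> (on-call team, runbook URL).
ROUTES = {
    "checkout": ("payments-oncall", "https://wiki.example.com/runbooks/checkout"),
    "search": ("search-oncall", "https://wiki.example.com/runbooks/search"),
}
DEFAULT_ROUTE = ("platform-oncall", "https://wiki.example.com/runbooks/general")


def triage(alert: Alert) -> Optional[Page]:
    """Decide whether an alert pages someone, and if so whom."""
    # Rule 1: only critical alerts page a human; the rest is treated as noise.
    if alert.severity != "critical":
        return None
    # Rule 2: route to the team closest to the affected service.
    team, runbook = ROUTES.get(alert.service, DEFAULT_ROUTE)
    # Rule 3: attach remediation steps (here, a runbook link) to the page.
    return Page(team=team, message=alert.message, runbook=runbook)


page = triage(Alert("checkout", "critical", "CPU saturation on checkout hosts"))
print(page)
```

Keeping the rules in plain data (the routing table) rather than in code makes it easier to adjust who is paged without touching the triage logic.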

Together, Dynatrace and VictorOps speed time to resolution. The intelligence built into each system alleviates some of the stress, false alarms, and frequent burnout that DevOps and on-call teams experience.

Anonymous Dynatrace customers say this

“We have been using Dynatrace for over 5 years, and find it an indispensable tool during pre-release functional testing, pre-release load testing, and especially post-production troubleshooting of severity one issues. With a breadth of distributed platforms for key application environments, Dynatrace gives us near-real-time (within a matter of seconds) analysis of end-to-end transactions that are spread across multiple servers and multiple layers of the stack…”
(Source: Gartner Peer Insights)

“Dynatrace has been spectacular to work with. Technology-wise, we use it primarily for root-cause analysis and performance management from an infrastructure perspective, as opposed to APM. But we’re beginning to use it for more comprehensive APM now, and it’s proving very helpful. Relationship-wise, the Dynatrace team is one of the best I’ve worked with in my 20 years in IT. They view their customer relationship as a true partnership.” – IT Architect
(Source: Gartner Peer Insights)

Bring more intelligence to microservices monitoring

Does this sound good to you? If you’re curious, take Dynatrace for a free 15-day test drive. See VictorOps in action. And if you already use both systems, follow these steps to install the VictorOps/Dynatrace integration. Then please give us feedback on your experience.

The post “Microservices Monitoring and Critical Incident Management: How Dynatrace and VictorOps Work Together” appeared first on VictorOps.


VictorOps is making on-call suck less with the only collaborative alert management platform on the market.

With easy on-call scheduling management, a real-time incident timeline that gives you contextual relevance around your alerts and powerful reporting features that make post-mortems more effective, VictorOps helps your IT/DevOps team solve problems faster.
