Detecting Anomalies in Operational Metrics with Dynatrace and Amazon Lookout for Metrics

Organizations across various industries rely on the analysis of operational metrics or key performance indicators (KPIs) to enhance their efficiency and effectiveness. These operational metrics serve as benchmarks for evaluating performance, comparing outcomes, and tracking essential data to drive improvements. For instance, metrics can be utilized to assess application performance, such as the average time taken to load a webpage for a user, or application availability, which measures the duration the application remains operational. One significant challenge many organizations face is the timely detection of anomalies in these operational metrics, which is vital for maintaining uninterrupted IT system operations.

Traditional methods for anomaly detection are primarily rule-based and often involve manually looking for data that falls outside of defined numerical ranges. For example, an alert may trigger when transaction rates dip below a specified threshold. However, this can lead to false alarms if the range is overly restrictive or result in missed anomalies if the range is too lenient. Moreover, these ranges are static and do not adapt to changing conditions such as time of day, day of the week, or seasonal variations. When anomalies occur, developers, analysts, and business leaders can spend weeks identifying the root cause before corrective action can be taken.
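
The trade-off described above can be illustrated with a small sketch. The hourly transaction rates below are made-up values chosen for illustration, and `static_threshold_alerts` is a hypothetical helper, not part of any AWS or Dynatrace API. It shows how one fixed lower bound either floods you with alerts during quiet hours or misses a real daytime drop.

```python
from typing import List


def static_threshold_alerts(rates: List[float], minimum: float) -> List[int]:
    """Return the indexes of readings that fall below a fixed minimum."""
    return [i for i, rate in enumerate(rates) if rate < minimum]


# Hypothetical hourly transaction rates: quiet overnight, busy during the day.
# Index 14 (2 p.m.) is the only genuine anomaly: 400 where about 900 is normal.
hourly_rates = [120, 100, 90, 95, 110, 200, 450, 700, 850, 900, 920, 910,
                905, 915, 400, 900, 880, 860, 800, 650, 500, 350, 220, 150]

# A restrictive range catches the real drop but also flags normal quiet hours.
strict = static_threshold_alerts(hourly_rates, minimum=500)

# A lenient range produces no false alarms but misses the real drop entirely.
lenient = static_threshold_alerts(hourly_rates, minimum=50)

print("strict threshold alerts at hours:", strict)
print("lenient threshold alerts at hours:", lenient)
```

Neither threshold works well here because the normal level depends on the hour of the day, which a single static range cannot express.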

Amazon Lookout for Metrics leverages machine learning (ML) to automatically identify and diagnose anomalies without requiring prior ML expertise. With just a few clicks, users can connect Lookout for Metrics to well-known data repositories such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon Relational Database Service (Amazon RDS), as well as third-party SaaS applications including Salesforce, Dynatrace, Marketo, Zendesk, and ServiceNow via Amazon AppFlow. This functionality allows businesses to monitor the metrics that matter most to their operations.

This article shows how to connect an IT infrastructure monitored by Dynatrace to Lookout for Metrics through Amazon AppFlow and set up anomaly detection across multiple metrics and dimensions. The solution detects anomalies continuously and can optionally send alert notifications when anomalies arise.

Lookout for Metrics works seamlessly with Dynatrace to uncover anomalies within your operational metrics. Once linked, Lookout for Metrics employs ML algorithms to monitor data and metrics for any irregularities or deviations from established norms. Dynatrace provides comprehensive monitoring of your entire infrastructure, including hosts, processes, and network traffic. Users can perform log monitoring and access crucial information such as total network traffic, CPU utilization, response times of processes, and more.

Amazon AppFlow is a fully managed service that facilitates data integration, allowing for the transfer of data between SaaS applications like Datadog, Salesforce, Marketo, and Slack, and AWS services such as Amazon S3 and Amazon Redshift. It offers features to transform, filter, and validate data, generating enriched and usable information in just a few straightforward steps.

Solution Overview

In this article, we outline how to connect to an environment monitored by Dynatrace to identify anomalies in operational metrics. We also assess how application availability and performance (resource contention) were affected. The source data comes from a cluster of Amazon Elastic Compute Cloud (Amazon EC2) instances monitored by Dynatrace, with Dynatrace OneAgent installed on each instance to collect telemetry data (CPU utilization, memory, network utilization, and disk I/O). Amazon AppFlow enables secure integration with SaaS applications like Dynatrace and automates data flows, and you can configure and connect to these services directly from the AWS Management Console or via API. Here we focus on connecting Dynatrace as the source and Lookout for Metrics as the target, both of which are supported within Amazon AppFlow.

The solution allows for the creation of an Amazon AppFlow data flow from Dynatrace to Lookout for Metrics. Subsequently, you can utilize Lookout for Metrics to identify any anomalies in the telemetry data. As an option, automated alerts can be sent to AWS Lambda functions, webhooks, or Amazon Simple Notification Service (Amazon SNS) topics.
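
For readers who prefer the API to the console, the optional alerting described above might look roughly like the following with boto3. This is a minimal sketch, not the article's own procedure: the alert name, the default sensitivity threshold of 70, and every ARN are placeholder assumptions, and the detector must already exist before an alert can be attached to it.

```python
def build_alert_request(detector_arn: str, topic_arn: str, role_arn: str,
                        threshold: int = 70) -> dict:
    """Build keyword arguments for the Lookout for Metrics create_alert call.

    The alert name and default threshold are illustrative assumptions.
    """
    return {
        "AlertName": "dynatrace-anomaly-alert",  # hypothetical name
        "AlertSensitivityThreshold": threshold,
        "AnomalyDetectorArn": detector_arn,
        "Action": {
            # Deliver notifications to an SNS topic; a Lambda function can be
            # targeted with a "LambdaConfiguration" entry instead.
            "SNSConfiguration": {"RoleArn": role_arn, "SnsTopicArn": topic_arn},
        },
    }


def create_alert(request: dict) -> str:
    """Send the request to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("lookoutmetrics")
    return client.create_alert(**request)["AlertArn"]


# Placeholder ARNs for illustration only.
alert_request = build_alert_request(
    detector_arn="arn:aws:lookoutmetrics:us-east-1:111122223333:AnomalyDetector:example",
    topic_arn="arn:aws:sns:us-east-1:111122223333:example-topic",
    role_arn="arn:aws:iam::111122223333:role/example-role",
)
```

Keeping the request builder separate from the call makes it easy to inspect the payload before anything is sent to AWS.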

The following are the high-level steps to implement the solution:

  1. Set up Amazon AppFlow integration with Dynatrace.
  2. Create an anomaly detector with Lookout for Metrics.
  3. Add a dataset to the detector and integrate Dynatrace metrics.
  4. Activate the detector.
  5. Create an alert.
  6. Review the detector and data flow status.
  7. Review and analyze any anomalies.

Setting Up Amazon AppFlow Integration with Dynatrace

To establish the data flow, follow these steps:

  1. In the Amazon AppFlow console, select “Create flow.”
  2. Input a name for the flow.
  3. Optionally, provide a description for the flow.
  4. In the Data encryption section, select or create an AWS Key Management Service (AWS KMS) key.
  5. Click “Next.”
  6. For Source name, select Dynatrace.
  7. Choose the Dynatrace Connection you previously created.
  8. For the Dynatrace object, select Problems (the only supported object as of this writing).
  9. For Destination name, select Amazon Lookout for Metrics.
  10. For API token, enter a token generated from the Dynatrace console.
  11. For Subdomain, enter your Dynatrace portal URL.
  12. For Data encryption, choose the AWS KMS key.
  13. For Connection name, enter a name.
  14. Click “Connect.”
  15. For Flow trigger, select “Run flow on schedule.”
  16. Choose “Minutes” for Repeats (alternatively, you may select hourly or daily).
  17. Set the trigger to repeat every 5 minutes, selecting a starting time and date.
  18. Note that Dynatrace requires a date range filter to be specified.
  19. For Field name, select Date range.
  20. Choose “is between” for Condition.
  21. Specify your start date for Criteria 1 and your end date for Criteria 2.
  22. Review your settings and click “Create flow.”
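
The console steps above could also be expressed as an Amazon AppFlow `create_flow` request through boto3. The sketch below rests on several assumptions: the flow and connection names and the KMS key ARN are placeholders, the connection is assumed to exist already, and the `tasks` list (which would carry the date range filter) is left to the caller because its exact shape should be taken from the AppFlow API reference rather than from this example.

```python
from datetime import datetime, timezone


def build_flow_request(flow_name: str, connection_name: str, kms_arn: str,
                       start_time: datetime, tasks: list) -> dict:
    """Build keyword arguments for the AppFlow create_flow call."""
    return {
        "flowName": flow_name,
        "kmsArn": kms_arn,
        "triggerConfig": {
            "triggerType": "Scheduled",
            "triggerProperties": {
                "Scheduled": {
                    # Run every 5 minutes, matching the detector interval.
                    "scheduleExpression": "rate(5minutes)",
                    "scheduleStartTime": start_time,
                }
            },
        },
        "sourceFlowConfig": {
            "connectorType": "Dynatrace",
            "connectorProfileName": connection_name,
            "sourceConnectorProperties": {"Dynatrace": {"object": "Problems"}},
        },
        "destinationFlowConfigList": [
            {
                "connectorType": "LookoutMetrics",
                "destinationConnectorProperties": {"LookoutMetrics": {}},
            }
        ],
        # Field mappings and the date range filter are supplied by the caller.
        "tasks": tasks,
    }


def create_flow(request: dict) -> str:
    """Send the request to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("appflow").create_flow(**request)["flowArn"]


# Placeholder names and ARN for illustration only.
flow_request = build_flow_request(
    flow_name="dynatrace-to-lookout",
    connection_name="my-dynatrace-connection",
    kms_arn="arn:aws:kms:us-east-1:111122223333:key/example",
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    tasks=[],
)
```

An empty `tasks` list is used here only so the payload can be built and inspected; a real flow needs at least one task.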

Creating an Anomaly Detector with Lookout for Metrics

To establish your anomaly detector, follow these steps:

  1. In the Lookout for Metrics console, select “Create detector.”
  2. Enter a name for the Detector.
  3. Optionally, provide a description.
  4. Select the interval for each analysis, ensuring it aligns with the flow’s interval.
  5. Create or select an existing AWS KMS key for Encryption.
  6. Click “Create.”
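
The same detector could be created programmatically. The following is a rough boto3 sketch with a hypothetical detector name. It maps the analysis interval in minutes to the frequency codes used by the Lookout for Metrics API and rejects intervals that have no matching code, which helps keep the detector aligned with the flow schedule.

```python
from typing import Optional

# Frequency codes used by the Lookout for Metrics API, keyed by minutes.
FREQUENCY_BY_MINUTES = {5: "PT5M", 10: "PT10M", 60: "PT1H", 1440: "P1D"}


def build_detector_request(name: str, interval_minutes: int,
                           kms_key_arn: Optional[str] = None,
                           description: Optional[str] = None) -> dict:
    """Build keyword arguments for the create_anomaly_detector call."""
    if interval_minutes not in FREQUENCY_BY_MINUTES:
        raise ValueError(
            f"unsupported interval {interval_minutes}; "
            f"choose one of {sorted(FREQUENCY_BY_MINUTES)}"
        )
    request = {
        "AnomalyDetectorName": name,
        "AnomalyDetectorConfig": {
            "AnomalyDetectorFrequency": FREQUENCY_BY_MINUTES[interval_minutes]
        },
    }
    # Optional fields are only included when a value was provided.
    if description:
        request["AnomalyDetectorDescription"] = description
    if kms_key_arn:
        request["KmsKeyArn"] = kms_key_arn
    return request


def create_detector(request: dict) -> str:
    """Send the request to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("lookoutmetrics")
    return client.create_anomaly_detector(**request)["AnomalyDetectorArn"]


# Hypothetical detector name; 5 minutes matches the flow schedule.
detector_request = build_detector_request("dynatrace-detector", 5)
```

The returned ARN is what later calls, such as adding a dataset or creating an alert, would reference.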

Adding a Dataset to the Detector and Integrating Dynatrace Metrics

Next, add a dataset to the detector and integrate Dynatrace metrics. The detector needs a dataset before it can be activated.

  1. In the detector details, click “Add a dataset.”
  2. Enter a name for the data source.
  3. Optionally, provide a description.
  4. Select the appropriate timezone, which should match the one used in Amazon AppFlow.
  5. Choose Dynatrace as the Datasource.
  6. Select the flow you created from Amazon AppFlow.
  7. Choose a service role for Permissions.
  8. Click “Next.”
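
In API terms, adding the dataset corresponds to a `create_metric_set` call that points at the AppFlow flow. The sketch below is illustrative only: the metric name `exampleMetric`, the flow name, and the ARNs are hypothetical placeholders, and the real metric and dimension names depend on the fields your Dynatrace flow delivers.

```python
from typing import List, Tuple


def build_metric_set_request(detector_arn: str, flow_name: str, role_arn: str,
                             metrics: List[Tuple[str, str]],
                             frequency: str = "PT5M",
                             timezone: str = "UTC") -> dict:
    """Build keyword arguments for the create_metric_set call.

    `metrics` is a list of (field name, aggregation) pairs, where the
    aggregation is "SUM" or "AVG".
    """
    return {
        "AnomalyDetectorArn": detector_arn,
        "MetricSetName": f"{flow_name}-dataset",
        "MetricSetFrequency": frequency,
        # Should match the timezone used by the AppFlow schedule.
        "Timezone": timezone,
        "MetricList": [
            {"MetricName": name, "AggregationFunction": aggregation}
            for name, aggregation in metrics
        ],
        "MetricSource": {
            "AppFlowConfig": {"RoleArn": role_arn, "FlowName": flow_name}
        },
    }


def create_metric_set(request: dict) -> str:
    """Send the request to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("lookoutmetrics")
    return client.create_metric_set(**request)["MetricSetArn"]


# Placeholder values for illustration only.
metric_set_request = build_metric_set_request(
    detector_arn="arn:aws:lookoutmetrics:us-east-1:111122223333:AnomalyDetector:example",
    flow_name="dynatrace-to-lookout",
    role_arn="arn:aws:iam::111122223333:role/example-role",
    metrics=[("exampleMetric", "AVG")],  # hypothetical field name
)
```

The service role passed here plays the same part as the Permissions choice in the console: it lets Lookout for Metrics read from the flow.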

This integration allows you to successfully monitor operational metrics for any anomalies that may occur.

