Review metrics at regular intervals

Reviewing metrics regularly is crucial for maintaining and improving the performance of your workloads in the cloud. By analyzing metrics, teams can identify performance bottlenecks, understand system behavior, and make informed decisions about how to use resources more efficiently.

Best Practices

  • Establish a Metrics Review Schedule: Set a regular cadence for reviewing your workload’s metrics, such as weekly or monthly. This proactive approach allows teams to spot trends and anomalies early, so that performance improvements are based on observed data rather than guesswork.
  • Utilize Automated Monitoring Tools: Implement monitoring tools such as Amazon CloudWatch to automate metric collection and alerting. This ensures critical performance indicators are continuously tracked, enabling quick responses to any deviations that may impact workload performance.
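As an illustration of what a scheduled review might check, the sketch below compares a metric's average over the current review window with the previous window and flags a regression beyond a tolerance. The function name `review_metric`, the 10% tolerance, and the sample values are all hypothetical; in practice the samples would come from your monitoring tool.

```python
from statistics import mean


def review_metric(name, previous, current, higher_is_worse=True, tolerance=0.10):
    """Compare the current review window with the previous one.

    `previous` and `current` are lists of samples for the same metric.
    `tolerance` is the relative change treated as a regression
    (0.10 = 10%, an assumed value; tune it per metric).
    """
    prev_avg = mean(previous)
    curr_avg = mean(current)
    # Relative change between windows; guard against a zero baseline.
    change = (curr_avg - prev_avg) / prev_avg if prev_avg else 0.0
    if higher_is_worse:
        regressed = change > tolerance      # e.g. latency, error rate
    else:
        regressed = change < -tolerance     # e.g. throughput
    return {
        "metric": name,
        "previous_avg": prev_avg,
        "current_avg": curr_avg,
        "change": change,
        "regressed": regressed,
    }


# Hypothetical p95 latency samples in milliseconds for two weekly windows.
last_week = [200, 210, 190, 200]
this_week = [240, 250, 230, 240]
result = review_metric("p95_latency_ms", last_week, this_week)
print(result["metric"], "regressed:", result["regressed"])
```

A check like this does not replace the review itself; it only narrows attention to the metrics that moved the most since the last session.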

Supporting Questions

  • Are the current metrics providing the insights needed to optimize performance effectively?

Roles and Responsibilities

  • Cloud Architect: Responsible for implementing and auditing the performance metrics framework, ensuring that the right metrics are being collected and evaluated.
  • DevOps Engineer: Tasked with setting up automated monitoring solutions and ensuring that alerts are configured for critical performance thresholds.

Artifacts

  • Metrics Dashboard: A visualization tool (like Amazon CloudWatch Dashboard) that displays essential performance metrics for stakeholders to review and analyze system performance at a glance.
  • Performance Review Report: A periodic document summarizing the findings from metrics reviews, outlining areas for improvement and actions taken to enhance workload performance.
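For the dashboard artifact, a dashboard can be described as a JSON document. The sketch below builds a minimal body with a single metric widget for EC2 CPU utilization. The helper name `build_dashboard_body`, the region, and the instance ID are placeholders chosen for illustration; the widget layout follows the CloudWatch dashboard body structure as I understand it, so check it against the current documentation before relying on it.

```python
import json


def build_dashboard_body(region, instance_id):
    """Return a CloudWatch dashboard body (JSON string) with one metric widget."""
    widgets = [
        {
            "type": "metric",
            # Position and size on the dashboard grid.
            "x": 0,
            "y": 0,
            "width": 12,
            "height": 6,
            "properties": {
                "title": "EC2 CPU utilization",
                # [namespace, metric name, dimension name, dimension value]
                "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]],
                "stat": "Average",
                "period": 300,  # seconds per data point
                "region": region,
            },
        }
    ]
    return json.dumps({"widgets": widgets})


# Placeholder values, not real resources.
body = build_dashboard_body("us-east-1", "i-0123456789abcdef0")
print(body)
```

With boto3 installed and credentials configured, a string like this could be passed as `DashboardBody` to the CloudWatch client's `put_dashboard` call; that step is left out here so the example runs without an AWS account.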

Cloud Services

AWS

  • Amazon CloudWatch: A monitoring service that provides data and actionable insights to monitor application performance, optimize resource usage, and troubleshoot issues.
  • AWS Lambda: Allows you to run code in response to events, and can be used to automate the alerting process based on the metrics defined in Amazon CloudWatch.
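One way these two services can work together is sketched below: a Lambda handler that receives CloudWatch alarm notifications and extracts the fields a reviewer would care about. It assumes the alarm publishes to an Amazon SNS topic that invokes the function, which is one common wiring and not something the text above prescribes. The sample event is hand-written for illustration and contains only the fields the handler reads.

```python
import json


def lambda_handler(event, context):
    """Summarize CloudWatch alarm notifications delivered through SNS."""
    summaries = []
    for record in event.get("Records", []):
        # The alarm details arrive as a JSON string in the SNS message body.
        message = json.loads(record["Sns"]["Message"])
        trigger = message.get("Trigger", {})
        summaries.append(
            {
                "alarm": message.get("AlarmName"),
                "state": message.get("NewStateValue"),
                "metric": trigger.get("MetricName"),
                "threshold": trigger.get("Threshold"),
                "reason": message.get("NewStateReason"),
            }
        )
    # Keep only notifications that entered the ALARM state.
    return [s for s in summaries if s["state"] == "ALARM"]


# Hand-written sample event (hypothetical alarm name and values).
sample_event = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps(
                    {
                        "AlarmName": "high-cpu",
                        "NewStateValue": "ALARM",
                        "NewStateReason": "Threshold crossed",
                        "Trigger": {"MetricName": "CPUUtilization", "Threshold": 80.0},
                    }
                )
            }
        }
    ]
}
alerts = lambda_handler(sample_event, None)
print(alerts)
```

The returned summaries could then be forwarded to a chat channel or ticketing system, or appended to the next performance review report.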

Question: What process do you use to support more performance efficiency for your workload?
Pillar: Performance Efficiency (Code: PERF)
