Create dashboards

Creating Dashboards for Enhanced Workload Observability
Dashboards give people a visual, at-a-glance view of telemetry data: your workload’s health, performance, and business outcomes. They complement alerting mechanisms rather than replace them. Well-designed dashboards surface system behavior quickly and help stakeholders understand real-time business impact, which supports proactive management of workloads.

Provide a Visual Interface for Telemetry Data

Dashboards offer a visual interface for interpreting telemetry data collected from your workloads. By visualizing metrics, logs, and traces, dashboards provide an at-a-glance overview of the workload’s health, making it easier to understand trends, identify issues, and assess the overall state of the system.

Complement Alerts with In-Depth Insights

While alerts are designed to notify teams of specific anomalies or threshold breaches, dashboards provide comprehensive insights that go beyond what alerts can offer. Alerts indicate when action is needed, whereas dashboards enable deeper investigation by providing detailed views of key metrics, system performance, and historical trends. Dashboards and alerts together ensure that teams are both notified of issues and equipped to analyze their cause.
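
One way to connect the two is to draw an alarm's threshold on the dashboard graph of the same metric, so the chart shows how close the metric runs to the alerting line. The sketch below adds a horizontal annotation to a metric widget, assuming the widget follows the CloudWatch dashboard body JSON structure. The load balancer name, the 0.5-second threshold, and the helper name `with_threshold` are hypothetical.

```python
import copy
import json


def with_threshold(widget, label, value):
    """Return a copy of a metric widget with a horizontal threshold line added."""
    updated = copy.deepcopy(widget)  # leave the caller's widget untouched
    annotations = updated["properties"].setdefault("annotations", {})
    annotations.setdefault("horizontal", []).append({"label": label, "value": value})
    return updated


# Hypothetical latency widget; the load balancer identifier is a placeholder.
latency_widget = {
    "type": "metric",
    "x": 0, "y": 0, "width": 12, "height": 6,
    "properties": {
        "title": "Request latency (p99)",
        "metrics": [["AWS/ApplicationELB", "TargetResponseTime",
                     "LoadBalancer", "app/example-lb/0123456789abcdef"]],
        "stat": "p99",
        "period": 60,
        "region": "us-east-1",
    },
}

# Assumed alarm threshold of 0.5 seconds, drawn as a line on the graph.
annotated = with_threshold(latency_widget, "Alarm threshold", 0.5)
print(json.dumps(annotated["properties"]["annotations"], indent=2))
```

With the threshold visible on the graph, someone responding to an alert can see at once whether the breach was a brief spike or part of a longer drift.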

Craft Dashboards for System Health and Performance

Design dashboards that visualize metrics critical to system health and performance, such as CPU usage, memory utilization, request latency, error rates, and throughput. These metrics help teams monitor the state of the workload in real time and identify any areas that require immediate attention. Well-crafted dashboards make it easy to spot unusual behavior and take action before problems escalate.
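
As an illustration, a system health dashboard can be described as data and kept in version control. The following is a minimal sketch that builds a CloudWatch dashboard body for CPU usage, latency, errors, and throughput. It assumes a workload running on one EC2 instance behind an Application Load Balancer; the function names, instance ID, and load balancer identifier are hypothetical. Memory utilization is left out because it is not a default EC2 metric and typically requires the CloudWatch agent.

```python
import json


def metric_widget(title, metrics, x, y, stat="Average", region="us-east-1"):
    """Build one metric widget on the 24-column dashboard grid."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": 12, "height": 6,
        "properties": {
            "title": title,
            "metrics": metrics,
            "stat": stat,
            "period": 300,
            "region": region,
            "view": "timeSeries",
        },
    }


def build_health_dashboard(instance_id, load_balancer):
    """Lay out four health panels in a 2 x 2 grid."""
    alb = "AWS/ApplicationELB"
    panels = [
        ("CPU usage", [["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]], "Average"),
        ("Request latency", [[alb, "TargetResponseTime", "LoadBalancer", load_balancer]], "p99"),
        ("Errors (5xx)", [[alb, "HTTPCode_Target_5XX_Count", "LoadBalancer", load_balancer]], "Sum"),
        ("Throughput", [[alb, "RequestCount", "LoadBalancer", load_balancer]], "Sum"),
    ]
    widgets = [
        metric_widget(title, metrics, x=(i % 2) * 12, y=(i // 2) * 6, stat=stat)
        for i, (title, metrics, stat) in enumerate(panels)
    ]
    return {"widgets": widgets}


# Placeholder resource identifiers, for illustration only.
body = build_health_dashboard("i-0123456789abcdef0", "app/example-lb/0123456789abcdef")
dashboard_json = json.dumps(body)
```

The resulting JSON string is the kind of value that the CloudWatch `PutDashboard` API accepts as the dashboard body, for example through the boto3 `put_dashboard` call.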

Present Real-Time Business Outcomes

Dashboards should also include business outcome metrics to help stakeholders understand the impact of workload performance on the organization’s goals. Metrics like revenue per transaction, user engagement, or conversion rates can be presented alongside system performance metrics to provide a complete picture. This helps stakeholders see how technical metrics relate to business outcomes, enabling data-driven decision-making.
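
The sketch below shows how business and technical figures for the same time window might be derived side by side before being charted. The `WindowCounts` fields and the sample numbers are assumptions made for illustration, not data from a real workload.

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Raw counts for one reporting window (hypothetical field names)."""
    sessions: int
    orders: int
    revenue: float
    requests: int
    failed_requests: int


def summarize(window):
    """Derive business and technical ratios from the same window of counts."""
    def ratio(numerator, denominator):
        # Avoid division by zero for empty windows.
        return numerator / denominator if denominator else 0.0

    return {
        "conversion_rate": ratio(window.orders, window.sessions),
        "revenue_per_transaction": ratio(window.revenue, window.orders),
        "error_rate": ratio(window.failed_requests, window.requests),
    }


# Made-up sample values for a single window.
summary = summarize(WindowCounts(sessions=2000, orders=50, revenue=1250.0,
                                 requests=10000, failed_requests=25))
print(summary)
```

Plotting `error_rate` next to `conversion_rate` over time makes it easier to see whether a technical degradation coincides with a drop in business results.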

Enable Rapid Insights and Proactive Management

Dashboards provide rapid insights by aggregating and visualizing key data points in one place. This enables teams to spot issues, identify trends, and take proactive action to prevent small issues from becoming major incidents. Dashboards can also provide historical views, helping teams identify recurring patterns that may need attention.
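
A historical view can be as simple as averaging a metric by hour of day across several days, which makes recurring daily peaks visible. The following is a minimal sketch under that assumption; the function name `hourly_profile` and the latency samples are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(samples):
    """Average metric values by hour of day to expose recurring daily patterns."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp.hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Made-up latency samples (milliseconds) taken on two different days.
samples = [
    (datetime(2024, 1, 1, 9, 0), 120.0),
    (datetime(2024, 1, 2, 9, 0), 140.0),
    (datetime(2024, 1, 1, 14, 0), 60.0),
]
profile = hourly_profile(samples)
print(profile)
```

An hour whose average stays well above the others across many days is a candidate for a recurring pattern worth investigating, such as a scheduled batch job or a daily traffic peak.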

Tailor Dashboards for Different Stakeholders

Create different dashboards tailored to specific stakeholders. For example:

  • Operations Teams: Dashboards focused on metrics like resource usage, system errors, and performance trends for day-to-day workload monitoring.
  • Business Stakeholders: Dashboards that present business metrics such as customer satisfaction, conversion rates, or financial performance, helping executives understand how the system is supporting business objectives.
  • Development Teams: Dashboards that include metrics like deployment success rates, code error metrics, and system response times to help developers assess the impact of their changes.
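
The audience-specific views above can be sketched as a small configuration that maps each stakeholder group to the panels it needs. The dictionary keys, panel names, and the `dashboard_spec` helper are hypothetical and simply mirror the examples in the list.

```python
# Panels per audience, taken from the stakeholder examples above.
PANELS_BY_AUDIENCE = {
    "operations": ["resource usage", "system errors", "performance trends"],
    "business": ["customer satisfaction", "conversion rates", "financial performance"],
    "development": ["deployment success rate", "code error metrics", "system response times"],
}


def dashboard_spec(audience):
    """Return a simple dashboard description for one stakeholder group."""
    try:
        panels = PANELS_BY_AUDIENCE[audience]
    except KeyError:
        raise ValueError(f"unknown audience: {audience!r}") from None
    # Copy the list so callers cannot modify the shared configuration.
    return {"name": f"{audience}-dashboard", "panels": list(panels)}


ops_spec = dashboard_spec("operations")
print(ops_spec)
```

Keeping the mapping in one place makes it straightforward to review which metrics each group sees and to add a new audience without changing the other dashboards.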

Supporting Questions

  • What metrics should be visualized in dashboards to provide insights into workload health and performance?
  • How do dashboards complement alerting mechanisms to ensure comprehensive workload observability?
  • How are dashboards tailored to meet the needs of different stakeholders?

Roles and Responsibilities

Monitoring Specialist
Responsibilities:

  • Design and create dashboards that visualize key workload metrics, enabling quick assessment of workload health and performance.
  • Ensure dashboards provide context to complement alerting mechanisms, helping teams investigate issues in more detail.

Operations Manager
Responsibilities:

  • Use dashboards to monitor workload health in real time and make informed decisions about capacity, scaling, and operational improvements.
  • Ensure dashboards include metrics relevant to system performance, reliability, and resource utilization.

Business Analyst
Responsibilities:

  • Collaborate with monitoring specialists to design dashboards that present business outcome metrics alongside technical metrics.
  • Use dashboards to track the impact of workload performance on key business objectives, providing insights to stakeholders.

Artifacts

  • System Health Dashboard: A dashboard that provides an at-a-glance overview of workload health, including metrics like latency, error rates, resource usage, and throughput.
  • Business Metrics Dashboard: A dashboard that visualizes metrics related to business outcomes, such as user engagement, revenue impact, and conversion rates.
  • Stakeholder-Specific Dashboards: Dashboards tailored to the needs of a particular stakeholder group (e.g., operations teams, business executives, development teams).

Relevant AWS Tools

Dashboard and Visualization Tools

  • Amazon CloudWatch Dashboards: Lets you build custom dashboards that combine metrics, log query results, and alarm status, providing a unified view of workload health and performance.
  • Amazon QuickSight: Provides advanced data visualization capabilities, helping teams create dashboards that present both business and technical metrics in a way that is easy to understand.

Monitoring Tools

  • Amazon OpenSearch Service: Indexes and aggregates log data and visualizes it in OpenSearch Dashboards, helping teams see trends and patterns in workload performance.
  • Amazon CloudWatch ServiceLens: Integrates AWS X-Ray traces with CloudWatch metrics and logs, providing a comprehensive view of workload observability.

Alerting Integration Tools

  • Amazon SNS (Simple Notification Service): Sends alerts that complement dashboard metrics, ensuring teams are notified of issues while using dashboards for in-depth analysis.
  • AWS Systems Manager OpsCenter: Collects operational issues (OpsItems), including those created from CloudWatch alarms, in one place so teams can view, investigate, and resolve them alongside automated notifications.