
Use optimized hardware-based compute accelerators

Selecting optimized hardware for your compute resources can significantly improve performance efficiency. Hardware accelerators execute specific computational tasks, such as machine learning inference, graphics rendering, or video transcoding, faster than general-purpose CPUs. Matching accelerators to application requirements optimizes resource use and boosts overall system performance.

Best Practices

  • Identify Workload Characteristics: To implement optimized hardware accelerators, first understand your workload’s specific computational needs. Profiling application performance can help identify bottlenecks where accelerators can provide gains in efficiency.
  • Evaluate Accelerator Options: Research and choose the right type of hardware accelerator for your workload. Options such as GPUs for parallel processing or custom ASICs for specialized tasks can deliver substantial performance gains over CPU-only execution.
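As a starting point for the profiling step above, the standard-library `cProfile` module can surface CPU-bound hotspots that are candidates for acceleration. This is a minimal, generic sketch; the `matmul_naive` workload is a hypothetical stand-in for your own application code.

```python
import cProfile
import io
import pstats

def matmul_naive(a, b):
    """Naive matrix multiply -- a typical CPU-bound hotspot that a GPU
    or other accelerator could execute far faster."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def run_workload():
    size = 40
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    b = [[float(i * j % 7) for j in range(size)] for i in range(size)]
    return matmul_naive(a, b)

# Profile the workload and print the top functions by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
result = run_workload()
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # show the 5 most expensive call sites
print(stream.getvalue())
```

Functions dominating cumulative time, especially dense numeric loops like the one above, are the places where moving work to a GPU or other accelerator is most likely to pay off.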

Supporting Questions

  • Are you monitoring performance metrics to identify potential areas for hardware acceleration?

Roles and Responsibilities

  • Architect: The architect is responsible for evaluating workload requirements and selecting appropriate compute resources, including hardware accelerators.
  • DevOps Engineer: The DevOps engineer implements and monitors the infrastructure, ensuring that the selected compute resources are effectively utilized for optimal performance.

Artifacts

  • Workload Performance Report: A document that outlines the performance metrics of current workloads, helping stakeholders understand where hardware accelerators can be implemented for improved efficiency.
  • Accelerator Configuration Guide: A guide detailing the configuration settings of selected hardware accelerators, tailored to specific workloads.

Cloud Services

AWS

  • Amazon EC2 Inf1/Inf2 Instances (AWS Inferentia): EC2 instances powered by AWS Inferentia chips accelerate machine learning inference workloads, enabling efficient execution of deep learning models.
  • Amazon Elastic Inference: This service allowed you to attach low-cost GPU-powered inference acceleration to CPU instances; note that AWS has since deprecated Elastic Inference in favor of Inferentia-based instances.
  • Amazon Elastic Kubernetes Service (EKS): EKS supports the use of GPU instances for Kubernetes workloads, allowing for enhanced compute efficiency for data-intensive applications.
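To illustrate the EKS option above, a pod can request a GPU through the standard `nvidia.com/gpu` extended resource. This is a minimal sketch assuming the NVIDIA device plugin DaemonSet is installed on the cluster and a GPU node group exists; the image name and instance type (`g5.xlarge`) are placeholder assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference
spec:
  containers:
    - name: inference
      image: my-registry/inference:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1   # request one GPU via the NVIDIA device plugin
  nodeSelector:
    # assumes a GPU-backed node group, e.g. EC2 G5 instances
    node.kubernetes.io/instance-type: g5.xlarge
```

Kubernetes schedules the pod only onto a node advertising an available GPU, so CPU-only nodes in the same cluster remain usable for non-accelerated workloads.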

Question: How do you select and use compute resources in your workload?
Pillar: Performance Efficiency (Code: PERF)
