Configure and right-size compute resources
Configuring and right-sizing compute resources is essential for matching your workload’s performance requirements. This practice helps prevent under-utilization, where idle provisioned capacity wastes money, as well as over-utilization, where saturated resources degrade performance. Proper selection ensures optimal performance efficiency in the AWS cloud environment.
Best Practices
Implement Auto Scaling
- Utilize Auto Scaling to adjust the number of compute instances based on demand, ensuring your resources match workload fluctuations.
- Configure scaling policies that reflect your application’s usage patterns, such as CPU or memory utilization thresholds.
- Continuously monitor performance and refine your scaling policies for optimal efficiency.
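To make the scaling bullets above concrete, here is a minimal Python sketch of the proportional rule that target-tracking scaling policies apply: capacity is adjusted so the per-instance metric moves toward the configured target, clamped to the group's minimum and maximum size. The function and parameter names are illustrative, not an AWS API.

```python
import math

def desired_capacity(current_capacity: int, metric_value: float,
                     target_value: float, min_size: int, max_size: int) -> int:
    """Proportional scaling rule: grow or shrink capacity so the
    per-instance metric (e.g. average CPU %) moves toward the target."""
    if metric_value <= 0:
        # No measurable load: fall back to the configured floor.
        return min_size
    proposed = math.ceil(current_capacity * metric_value / target_value)
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, proposed))
```

For example, with 4 instances averaging 80% CPU against a 50% target and bounds of 2–10, `desired_capacity(4, 80.0, 50.0, 2, 10)` returns 7, scaling out until per-instance load is back near the target.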
Use Compute Resource Monitoring Tools
- Leverage Amazon CloudWatch or third-party monitoring tools to track resource utilization and performance metrics.
- Analyze metrics regularly to identify underutilized or overutilized resources and make adjustments.
- Establish alerts for performance anomalies to proactively manage compute resources.
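The monitoring bullets above boil down to a simple classification over collected metric samples. The sketch below shows one hedged way to flag under- or over-utilized resources from CPU samples; the 20%/80% thresholds are illustrative assumptions, not fixed AWS guidance, and should be tuned per workload.

```python
def classify_utilization(cpu_samples: list[float],
                         low: float = 20.0, high: float = 80.0) -> str:
    """Classify a resource from its CPU utilization samples (percent).

    Average below `low` suggests wasted spend; average above `high`
    suggests saturation risk. Thresholds are illustrative defaults.
    """
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg < low:
        return "underutilized"
    if avg > high:
        return "overutilized"
    return "ok"
```

In practice the samples would come from a monitoring service, and the "underutilized"/"overutilized" results would feed alerts or a right-sizing review queue.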
Select Appropriate Instance Types
- Evaluate various instance types (e.g., general-purpose, compute-optimized, memory-optimized) based on your workload’s specific requirements.
- Use AWS Compute Optimizer to regularly review and optimize instance configurations so they stay aligned with workload needs.
- Test and benchmark different instance types to determine which provides the best performance for your applications.
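When comparing benchmark results across instance families, a useful tie-breaker is price-performance: throughput per dollar rather than raw throughput. A minimal sketch, with entirely hypothetical instance names and numbers:

```python
def best_price_performance(benchmarks: dict[str, tuple[float, float]]) -> str:
    """Pick the instance type with the most throughput per dollar.

    benchmarks maps instance type -> (requests_per_second, usd_per_hour).
    Names and figures here are illustrative, not real pricing.
    """
    return max(benchmarks, key=lambda t: benchmarks[t][0] / benchmarks[t][1])

# Hypothetical benchmark results for three families:
results = {
    "general-purpose": (1000.0, 0.10),   # 10,000 req/s per $/hr
    "compute-optimized": (1800.0, 0.15), # 12,000 req/s per $/hr
    "memory-optimized": (1100.0, 0.20),  #  5,500 req/s per $/hr
}
```

Here `best_price_performance(results)` selects the compute-optimized family: it is not the cheapest per hour, but it delivers the most work per dollar for this (hypothetical) workload.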
Utilize Spot Instances for Cost Efficiency
- Consider using Amazon EC2 Spot Instances for workloads that are flexible or can tolerate interruptions, significantly reducing costs.
- Implement fallback strategies to ensure workload continuity if Spot Instances become unavailable.
- Regularly assess the cost savings versus performance impacts when using Spot Instances.
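Assessing Spot savings against performance impact can be reduced to a rough model: the price gap times hours run, minus the rework caused by interruptions. The sketch below makes the simplifying assumption that rework hours are re-billed at the Spot rate; all figures are illustrative.

```python
def spot_net_savings(on_demand_per_hr: float, spot_per_hr: float,
                     hours: float, interruptions: int,
                     rework_hours_per_interruption: float) -> float:
    """Rough net savings from running on Spot instead of On-Demand.

    Simplifying assumption: each interruption costs a fixed number of
    rework hours, billed at the Spot rate. Real accounting also has
    checkpointing overhead and delayed-completion costs.
    """
    gross_savings = (on_demand_per_hr - spot_per_hr) * hours
    rework_cost = interruptions * rework_hours_per_interruption * spot_per_hr
    return gross_savings - rework_cost
```

With hypothetical rates of $0.10 On-Demand vs $0.03 Spot over 1,000 hours and ten interruptions costing two rework hours each, the model still shows a large net saving, which is why interruption-tolerant workloads are the sweet spot for Spot.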
Review and Refine Resource Allocations Periodically
- Conduct regular reviews of your compute resource allocations to ensure they are still aligned with current workload requirements.
- Adjust configurations, sizes, and quantities of resources based on the latest performance metrics and application growth.
- Implement a process for continuous improvement that incorporates feedback and new performance data into resource planning.
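A periodic review like the one described above often works from a percentile of observed utilization rather than the average, so that sizing covers peaks. The sketch below recommends a vCPU count from p95 CPU utilization with a headroom buffer; the headroom fraction is an assumption to tune, and the percentile index calculation is a simple nearest-rank approximation.

```python
import math

def rightsize_recommendation(cpu_samples: list[float], current_vcpus: int,
                             headroom: float = 0.5) -> int:
    """Recommend a vCPU count so p95 utilization lands below
    (1 - headroom) * 100%. Nearest-rank p95; headroom is an assumption."""
    ordered = sorted(cpu_samples)
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    vcpus_actually_used = current_vcpus * p95 / 100.0
    needed = vcpus_actually_used / (1.0 - headroom)
    return max(1, math.ceil(needed))
```

An instance holding steady at 50% on 4 vCPUs with 50% headroom is confirmed at 4 vCPUs, while one idling at 10% on 8 vCPUs gets a recommendation of 2, surfacing a downsize candidate for the next review.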
Questions to ask your team
- How do you determine the performance requirements of your workload?
- What metrics do you track to assess compute resource utilization?
- Have you established a process for regularly reviewing and adjusting your compute resource configurations?
- How often do you analyze the performance of different compute resource types for your application components?
- Do you utilize automated tools to help with right-sizing your compute resources?
- What methods do you use to test the impact of different configurations on performance?
- Are you taking advantage of features such as Auto Scaling to optimize resource utilization?
- How do you ensure that your compute resources match the traffic patterns and usage of your application?
Who should be doing this?
Cloud Architect
- Assess workload performance requirements and usage patterns.
- Select appropriate compute resources (e.g., EC2, Lambda) based on application needs.
- Design scalable architectures that utilize various compute choices effectively.
- Implement monitoring tools to track performance and resource utilization.
- Make recommendations for resource adjustments based on monitoring data.
DevOps Engineer
- Automate the deployment and configuration of compute resources.
- Implement CI/CD pipelines that adapt to performance requirements.
- Continuously monitor application performance and resource usage.
- Optimize configurations for cost-effectiveness and performance efficiencies.
- Collaborate with the Cloud Architect to align compute resources with workload needs.
Performance Analyst
- Analyze workload performance metrics and usage patterns.
- Provide insights on potential performance bottlenecks related to compute resources.
- Conduct benchmarking tests to evaluate performance of different compute options.
- Recommend strategies for optimizing compute resource usage.
- Report on performance efficiency improvements and resource utilization trends.
What evidence shows this is happening in your organization?
- Compute Resource Right-Sizing Checklist: A detailed checklist that guides teams through the process of evaluating and right-sizing compute resources based on workload performance requirements, ensuring resources are neither under- nor over-utilized.
- Compute Selection Strategy Document: A comprehensive strategy document that outlines best practices for selecting compute resources tailored to specific application designs and usage patterns, with examples relevant to different workload types.
- Performance Efficiency Dashboard: An interactive dashboard that monitors the performance metrics of compute resources in real-time, aiding in the assessment of whether the current resources are appropriately sized and configured.
- Right-Sizing Playbook: A playbook that provides step-by-step guidance on how to analyze workload performance and configure compute resources accordingly, including case studies and success stories from the organization.
- Resource Configuration Policy: A policy statement that establishes guidelines for configuring compute resources, focusing on maintaining optimal performance efficiency and documenting the rationale behind resource decisions.
Cloud Services
AWS
- AWS Compute Optimizer: Analyzes your utilization metrics and provides recommendations for optimal instance types and sizes based on your workload needs.
- Amazon EC2 Auto Scaling: Automatically adjusts the number of EC2 instances in response to changing demand to ensure optimal resource utilization.
- AWS Lambda: Allows you to run code in response to events and automatically manages the compute resources for you, ensuring efficient compute usage.
Azure
- Azure Advisor: Provides personalized best practices and recommendations to optimize your Azure resources for performance and cost.
- Azure Autoscale: Automatically scales your applications to meet demand, maintaining performance and resource efficiency without manual intervention.
- Azure Functions: Enables serverless computing, automatically scaling the resources based on demand and optimizing performance.
Google Cloud Platform
- Google Cloud Recommender: Analyzes resource utilization patterns and provides recommendations for right-sizing compute instances based on performance requirements.
- Google Cloud Autoscaler: Automatically scales instance groups based on load, ensuring that you have the right amount of resources for your workload.
- Cloud Functions: Runs your code in a serverless environment, dynamically allocating resources based on demand.
Question: How do you select and use compute resources in your workload?
Pillar: Performance Efficiency (Code: PERF)