Use purpose-built data stores that best support your data access and storage requirements

Selecting the right data management solution is crucial for optimal performance in cloud workloads. Different data types and access patterns necessitate tailored storage solutions to achieve the required efficiency, throughput, and scalability.

Best Practices

Select Appropriate Data Storage Solutions

  • Assess data types and choose the right storage mechanisms (e.g., Amazon S3 for object storage, Amazon EFS for file storage, or Amazon EBS for block storage).
  • Understand your performance requirements by analyzing access patterns, such as read-heavy or write-heavy workloads, and choose storage types that match these needs.
  • Use caching services (such as Amazon ElastiCache or Amazon CloudFront) to reduce retrieval times for frequently accessed data.
  • Evaluate the durability and availability needs of your data to determine redundancy options; use features such as AWS Backup or Multi-AZ deployments where necessary.
  • Regularly monitor performance metrics in Amazon CloudWatch to confirm that your data stores meet expected performance thresholds, and adjust them as needed.
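
The caching practice above can be illustrated with a read-through cache. The following is a minimal sketch, assuming an in-process dictionary stands in for a managed cache such as Amazon ElastiCache. The names `ReadThroughCache`, `loader`, `ttl_seconds`, and `clock` are hypothetical and are not part of any AWS SDK.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ReadThroughCache:
    """In-process stand-in for a managed cache (hypothetical, for illustration)."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # fetches from the slower backing data store
        self._ttl = ttl_seconds    # how long an entry stays fresh
        self._clock = clock        # injectable so expiry can be exercised in tests
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve it without touching the backing store.
            self.hits += 1
            return entry[1]
        # Missing or expired: load from the backing store and remember the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Tracking `hits` and `misses` gives a rough hit ratio, which is one way to judge whether a cache is actually absorbing the read-heavy part of a workload.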

Questions to ask your team

  • What types of data are you storing in your workload (block, file, object)?
  • What are the access patterns for your data (random or sequential)?
  • How frequently is the data accessed (online, offline, archival)?
  • What are the durability and availability requirements for your data?
  • Are there specific performance metrics (latency, throughput) that your application needs to meet?
  • Have you analyzed the data characteristics (size, shareability, cache size) to inform your storage decisions?
  • How do you manage data updates (write once, read many (WORM), or dynamic)?
  • What purpose-built data stores are currently in use, and do they align with your workload’s requirements?
  • Have you evaluated any potential bottlenecks in your data access and storage processes?
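
The answers to the questions above can be captured in a small workload profile and mapped to a candidate store. The following is a rough sketch only: `WorkloadProfile` and `suggest_store` are hypothetical names, the mapping reuses the example services from the best practices in this article, and the rules are a deliberate simplification rather than official selection guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    data_kind: str          # "block", "file", or "object"
    access_frequency: str   # "online", "offline", or "archival"
    shared: bool = False    # True if several hosts need concurrent access


# Example services taken from the best practices in this article.
_STORE_BY_KIND = {
    "object": "Amazon S3",
    "file": "Amazon EFS",
    "block": "Amazon EBS",
}


def suggest_store(profile: WorkloadProfile) -> str:
    """Return a starting-point suggestion, with notes worth reviewing."""
    try:
        store = _STORE_BY_KIND[profile.data_kind]
    except KeyError:
        raise ValueError(f"unknown data kind: {profile.data_kind!r}") from None

    notes = []
    if profile.access_frequency == "archival":
        notes.append("review archival or infrequent-access tiers")
    if profile.shared and profile.data_kind == "block":
        # Assumption: shared access is often a better fit for file storage.
        notes.append("shared access may fit file storage better")

    return store if not notes else f"{store} ({'; '.join(notes)})"
```

A call such as `suggest_store(WorkloadProfile("object", "archival"))` returns the object store together with a note to review archival tiers. The result is meant as a prompt for discussion, not a final decision.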

Who should be doing this?

Data Architect

  • Analyze data characteristics to determine appropriate storage solutions.
  • Design data management strategies that align with access patterns and data types.
  • Evaluate and select purpose-built data stores (storage or database) that optimize performance and efficiency.
  • Collaborate with developers and operations teams to implement data storage solutions.

DevOps Engineer

  • Monitor and maintain the performance of data storage systems.
  • Implement caching and performance optimization techniques.
  • Automate data management processes for efficiency.
  • Ensure the infrastructure supports selected purpose-built data stores.

Database Administrator

  • Maintain, back up, and restore data in purpose-built database solutions.
  • Optimize database performance based on access patterns and usage.
  • Manage data integrity, durability, and availability constraints.
  • Provide support for query optimization and indexing strategies.

Application Developer

  • Develop applications that efficiently interact with chosen data stores.
  • Implement appropriate data retrieval and storage methods based on access patterns.
  • Ensure proper handling of data consistency and transactions.
  • Collaborate with the data architect to align application design with data management strategies.

What evidence shows this is happening in your organization?

  • Data Store Selection Guide: A comprehensive guide that outlines criteria for selecting purpose-built data stores based on data characteristics such as access patterns, size, and frequency. It includes flowcharts for decision-making processes.
  • Performance Efficiency Checklist: A checklist that helps teams assess their workload’s data storage requirements and determine the appropriate purpose-built data stores. It includes key questions about data types, access patterns, and durability needs.
  • Performance Metrics Dashboard: A dashboard that visualizes performance metrics of various data stores in use, showcasing throughput, latency, and access patterns. This helps teams monitor efficiency and make data-driven adjustments.
  • Data Management Strategy Document: A formal document that outlines the organization’s overall data management strategy, incorporating the selection of purpose-built stores and access methodologies to maximize performance efficiency.
  • Implementation Playbook for Purpose-Built Data Stores: A playbook that details step-by-step procedures for implementing various purpose-built data stores within workloads, including best practices and common pitfalls to avoid.
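
A performance metrics dashboard like the one described above typically compares observed latency with an agreed threshold. The following is a minimal sketch, assuming latency samples have already been exported per data store. The functions `percentile` and `breaches`, the store names, and the limits are hypothetical, and the nearest-rank method is one of several common percentile definitions.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; pct must be in the range (0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def breaches(
    latencies_ms: Dict[str, Sequence[float]],
    p99_limit_ms: Dict[str, float],
) -> Dict[str, float]:
    """Return the p99 latency of every store that exceeds its limit.

    Stores without a configured limit are skipped.
    """
    result: Dict[str, float] = {}
    for store, samples in latencies_ms.items():
        limit = p99_limit_ms.get(store)
        if limit is None:
            continue
        p99 = percentile(samples, 99)
        if p99 > limit:
            result[store] = p99
    return result
```

Using a high percentile instead of the mean keeps occasional slow requests visible, which is usually what matters when a data store is close to its throughput or latency limits.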

Cloud Services

AWS

  • Amazon S3: A scalable object storage service for online and archival storage, ideal for unstructured data with varying access patterns.
  • Amazon EFS: A fully managed file storage service that is simple to set up and scale for use with AWS Cloud services and on-premises resources.
  • Amazon RDS: A managed relational database service that helps in organizing and retrieving data efficiently for transactional workloads.
  • Amazon DynamoDB: A fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

Azure

  • Azure Blob Storage: A scalable object storage service for massive amounts of unstructured data, supporting a range of access patterns.
  • Azure Files: A managed file share service in the cloud that supports the SMB and NFS protocols for applications that require file storage.
  • Azure SQL Database: A relational database service that supports and scales transactional workloads while providing high availability.
  • Azure Cosmos DB: A globally distributed, multi-model database service that provides low-latency access and is designed for high-throughput applications.

Google Cloud Platform

  • Google Cloud Storage: An object storage service that offers high availability and durability for a wide variety of data types.
  • Google Filestore: A managed file storage service for applications that require shared file systems.
  • Cloud SQL: A fully managed relational database service for MySQL, PostgreSQL, and SQL Server, supporting transactional workloads with high reliability.
  • Google Cloud Firestore: A flexible, scalable NoSQL cloud database to support mobile, web, and server development for different access patterns.

Question: How do you store, manage, and access data in your workload?
Pillar: Performance Efficiency (Code: PERF)
