Back up data only when difficult to recreate

Effective data management is essential to meeting sustainability goals. By backing up only critical data that is difficult to recreate, organizations avoid unnecessary storage use and reduce the carbon footprint and resource consumption associated with data center operations.

Best Practices

Implement Targeted Data Retention Policies

  • Define data retention requirements based on business value and compliance needs. This ensures that only valuable data is retained, minimizing storage usage.
  • Tag and categorize data so that critical data can be distinguished from redundant data, enabling better-informed decisions on what to back up.
  • Regularly review and update data retention policies to adapt to changing business needs or technological advancements.
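
As a rough illustration of how classification tags can drive retention, the sketch below maps a tag value to a retention period and checks whether a data set is past its deadline. The classification labels, the retention periods, and the `DataSet` type are all hypothetical; substitute the categories and periods from your own policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical classification labels and retention periods (illustrative only).
RETENTION_DAYS = {
    "regulatory": 7 * 365,      # kept for compliance reasons
    "business-critical": 365,
    "operational": 90,
    "transient": 0,             # not retained beyond its creation date
}


@dataclass(frozen=True)
class DataSet:
    name: str
    classification: str  # value of a hypothetical "classification" tag
    created: date


def retention_deadline(ds: DataSet) -> date:
    """Return the last date on which the data set must still be retained."""
    # An unknown classification raises KeyError, so unclassified data
    # is surfaced for review instead of being silently kept or deleted.
    return ds.created + timedelta(days=RETENTION_DAYS[ds.classification])


def is_expired(ds: DataSet, today: date) -> bool:
    """True once the retention deadline has passed."""
    return today > retention_deadline(ds)


if __name__ == "__main__":
    logs = DataSet("app-logs", "operational", date(2024, 1, 1))
    print(retention_deadline(logs), is_expired(logs, date(2024, 6, 1)))
```

Keeping the periods in one table makes the regular policy review a matter of editing data rather than code.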

Utilize Storage Lifecycle Management

  • Configure automatic data tiering to move infrequently accessed data to lower-cost storage options, balancing performance and cost-effectiveness.
  • Implement lifecycle management policies that automatically delete data after a defined period or when it meets certain criteria to prevent unnecessary storage costs.
  • Leverage tools and services provided by cloud platforms to automate data lifecycle management tasks, reducing the potential for human error.
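
The tiering and deletion steps above can be sketched as a single lifecycle rule. The example builds a configuration in the shape that Amazon S3 lifecycle rules use; the prefix, the day counts, and the helper name `build_lifecycle_configuration` are assumptions chosen for illustration, and other platforms use different formats.

```python
from typing import Any, Dict


def build_lifecycle_configuration(
    prefix: str, archive_after_days: int, delete_after_days: int
) -> Dict[str, Any]:
    """Build an S3-style lifecycle configuration: tier first, then expire."""
    # S3 lifecycle transitions to STANDARD_IA require objects to be
    # at least 30 days old.
    if not 30 <= archive_after_days < delete_after_days:
        raise ValueError(
            "archive_after_days must be >= 30 and less than delete_after_days"
        )
    rule_suffix = prefix.strip("/").replace("/", "-") or "all"
    return {
        "Rules": [
            {
                "ID": f"tier-then-expire-{rule_suffix}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move infrequently accessed objects to a lower-cost class.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "STANDARD_IA"}
                ],
                # Delete objects once they pass the retention period.
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


if __name__ == "__main__":
    config = build_lifecycle_configuration("tmp/exports/", 30, 365)
    print(config)
    # With boto3 installed and credentials configured, the configuration
    # could be applied like this (the bucket name is hypothetical):
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="example-bucket", LifecycleConfiguration=config)
```

Generating the rule from a few parameters keeps retention periods reviewable in one place and reduces the chance of hand-editing errors.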

Evaluate Data Backup Necessity

  • Conduct assessments to determine which data is critical to business operations, guiding decisions on what data should be backed up.
  • Prioritize backup for data that is challenging to recreate or has significant business importance, and avoid backing up transient or low-value data.
  • Use versioning and snapshots selectively: keep only the versions and snapshots needed to meet service level agreements (SLAs) rather than retaining every copy of all data.
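
One way to make this assessment repeatable is to encode it as a small decision rule. The sketch below is a minimal example under stated assumptions: the `DataAsset` fields, the estimated recreation time, and both hour thresholds are hypothetical and would come from your own assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    business_critical: bool
    recreatable: bool            # can it be regenerated from another source?
    recreate_hours: float = 0.0  # estimated effort to regenerate (assumed metric)
    transient: bool = False      # caches, temp files, intermediate outputs


def should_back_up(
    asset: DataAsset,
    critical_limit_hours: float = 1.0,
    default_limit_hours: float = 24.0,
) -> bool:
    """Back up only data that is impossible or too slow to recreate."""
    if asset.transient:
        return False
    if not asset.recreatable:
        return True
    # Business-critical data tolerates less regeneration time.
    limit = critical_limit_hours if asset.business_critical else default_limit_hours
    return asset.recreate_hours > limit


if __name__ == "__main__":
    assets = [
        DataAsset("customer-uploads", business_critical=True, recreatable=False),
        DataAsset("thumbnails", True, True, recreate_hours=0.5),
        DataAsset("feature-store", True, True, recreate_hours=6.0),
        DataAsset("build-cache", False, True, transient=True),
    ]
    for asset in assets:
        print(asset.name, should_back_up(asset))
```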

Monitor and Optimize Storage Usage

  • Regularly analyze storage utilization metrics to identify underused resources and optimize your environment accordingly.
  • Conduct audits to ensure compliance with data management policies, focusing on reducing the environmental impact of excess storage.
  • Use cloud-native tools to visualize data usage and make ongoing adjustments to align with sustainability goals.
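
A storage review of this kind might look like the following sketch, which works on an inventory export. The `StorageRecord` fields and the 90-day idle threshold are assumptions; real inventories (for example, from a cloud provider's storage reports) will have different columns.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class StorageRecord:
    name: str
    size_gib: float
    last_accessed: date
    backed_up: bool
    recreatable: bool


def cold_records(
    records: Iterable[StorageRecord], today: date, idle_days: int = 90
) -> List[StorageRecord]:
    """Records not accessed for more than idle_days: candidates for tiering or deletion."""
    return [r for r in records if (today - r.last_accessed).days > idle_days]


def avoidable_backup_gib(records: Iterable[StorageRecord]) -> float:
    """Total size of backed-up data that could have been recreated instead."""
    return sum(r.size_gib for r in records if r.backed_up and r.recreatable)


if __name__ == "__main__":
    inventory = [
        StorageRecord("orders-db", 500.0, date(2024, 6, 1), True, False),
        StorageRecord("report-cache", 120.0, date(2024, 1, 15), True, True),
        StorageRecord("old-exports", 30.0, date(2023, 11, 1), True, True),
    ]
    today = date(2024, 6, 10)
    print([r.name for r in cold_records(inventory, today)])
    print(avoidable_backup_gib(inventory))
```

Tracking the avoidable figure over time gives a simple indicator of whether backup scope is shrinking toward only the data that needs it.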

Questions to ask your team

  • What criteria do you use to evaluate the business value of your data before deciding on backup strategies?
  • How do you ensure that only critical data is backed up, and what processes are in place to track this?
  • Are there established policies to regularly review and update your backup strategies based on data utility?
  • What methods do you use to identify data that is difficult to recreate, and how is this communicated to your team?
  • How do you measure the effectiveness of your backup strategies in terms of sustainability and storage efficiency?
  • Do you have a documented data lifecycle policy that guides the decisions on data retention and backup?

Who should be doing this?

Data Management Specialist

  • Assess data for business value and determine the necessity of backups.
  • Implement policies to identify and classify data based on its importance.
  • Establish guidelines for data retention and deletion practices.
  • Monitor data usage patterns to optimize storage resources.
  • Collaborate with stakeholders to ensure data management aligns with sustainability goals.

Cloud Architect

  • Design storage solutions that support efficient data management and align with sustainability objectives.
  • Evaluate and select storage technologies that minimize resource consumption.
  • Implement lifecycle policies to transition data to more cost-effective storage solutions.
  • Ensure compliance with data management policies across the infrastructure.
  • Analyze system performance to make informed decisions about data retention.

Compliance Officer

  • Ensure that data management practices meet regulatory standards and organizational policies.
  • Audit data management processes to identify areas for improvement.
  • Educate the organization on the importance of data minimization and sustainability.
  • Review backup strategies to ensure they only include data with business value.
  • Report on compliance and sustainability metrics related to data management.

What evidence shows this is happening in your organization?

  • Backup Policy for Only Critical Data: A documented policy outlining criteria for identifying crucial data sets that must be backed up, ensuring that short-lived or easily re-generated data is excluded to reduce storage resource usage.
  • Minimal Data Backup Checklist: A short, actionable checklist that helps teams confirm only difficult-to-recreate data is included in backup routines, preventing unnecessary replication of data with no use or value.
  • Critical Data Classification Guide: A guide detailing how to classify data based on its business importance. This helps determine which data is too costly or complex to regenerate, ensuring backup efforts focus only on essential data.

Cloud Services

AWS

  • Amazon S3: Offers object storage with a variety of storage classes to optimize costs and lifecycle policies to transition data to less expensive storage solutions.
  • AWS Backup: Centralizes and automates the backup of AWS resources. Backup plans and resource assignments (for example, by tag) let you limit backups to critical data and manage storage costs efficiently.
  • Amazon Data Lifecycle Manager: Automates the creation, retention, and deletion of EBS snapshots, allowing you to manage storage efficiently based on data lifecycle.
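
To illustrate limiting backups to tagged resources, the sketch below builds a tag-based resource selection in the shape accepted by the AWS Backup `CreateBackupSelection` API. The tag key and value, the selection name, the role ARN, and the helper name are hypothetical.

```python
from typing import Any, Dict


def build_backup_selection(
    selection_name: str,
    iam_role_arn: str,
    tag_key: str = "backup",        # hypothetical tag key
    tag_value: str = "required",    # hypothetical tag value
) -> Dict[str, Any]:
    """Select only resources that carry the given tag for backup."""
    return {
        "SelectionName": selection_name,
        "IamRoleArn": iam_role_arn,
        # Only resources tagged tag_key=tag_value are included; untagged
        # or differently tagged resources are left out of the plan.
        "ListOfTags": [
            {
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": tag_key,
                "ConditionValue": tag_value,
            }
        ],
    }


if __name__ == "__main__":
    selection = build_backup_selection(
        "critical-only",
        "arn:aws:iam::123456789012:role/ExampleBackupRole",  # placeholder ARN
    )
    print(selection)
    # With boto3 installed and an existing backup plan, this could be
    # attached as follows (the plan ID is a placeholder):
    #   boto3.client("backup").create_backup_selection(
    #       BackupPlanId="example-plan-id", BackupSelection=selection)
```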

Azure

  • Azure Blob Storage: Provides scalable object storage with access tiers to optimize for different workloads, enabling you to move data to lower-cost access tiers as needed.
  • Azure Backup: Offers a reliable and cost-effective way to back up and restore your data. Backup policies control which resources are protected and how long recovery points are retained, which helps limit unnecessary backups and reduce storage use.
  • Azure Automation: Allows you to automate the management of resources, including deleting obsolete data based on predefined rules, to support efficient data lifecycle management.

Google Cloud Platform

  • Google Cloud Storage: Provides scalable object storage with multiple storage classes; lifecycle rules can transition less frequently accessed data to lower-cost classes.
  • Google Cloud Backup and DR: Delivers a simple and reliable backup solution that enables you to back up essential data only, optimizing storage resource usage.
  • Google Cloud Functions: Can be used to automate data management tasks, including deleting redundant data and moving data between storage classes based on usage patterns.