Identify the data within your workload

Classifying data by criticality and sensitivity is the basis for choosing appropriate protection and retention controls. That starts with knowing what data your workload processes, where it is stored, and who owns it, so the organization can manage risk and meet its compliance obligations.

Best Practices

Implement Comprehensive Data Discovery and Classification

  • Conduct regular audits to identify and categorize data based on sensitivity and compliance requirements, ensuring that all data types are accounted for.
  • Utilize automated tools for data discovery to efficiently classify data at scale, reducing manual effort and the potential for human error.
  • Engage stakeholders from various departments (such as legal, compliance, and IT) to establish clear data ownership and understand business processes associated with the data.
  • Document the entire classification process, including definitions for classification levels (e.g., public, confidential, sensitive) and the rationale behind the classifications.
  • Regularly review and update data classifications based on changes in business processes, regulatory requirements, or the data itself to maintain accuracy and relevance.
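
As a rough illustration of the automated discovery practice above, the sketch below assigns a classification level to a piece of text using pattern rules. It is a minimal example under stated assumptions, not a production detector: the `Level` names mirror the example levels mentioned above, and the `RULES` list, rule names, and `classify` function are hypothetical. Managed discovery services use far more robust detection than two regular expressions.

```python
import re
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical levels based on the examples in the text; higher is more restrictive.
    PUBLIC = 0
    CONFIDENTIAL = 1
    SENSITIVE = 2


# Illustrative rules only (assumed for this sketch): pattern, level, rule name.
RULES = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), Level.SENSITIVE, "us_ssn_like"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), Level.CONFIDENTIAL, "email_address"),
]


def classify(text: str) -> tuple[Level, list[str]]:
    """Return the most restrictive level matched and the names of the rules that fired."""
    level = Level.PUBLIC
    reasons = []
    for pattern, rule_level, name in RULES:
        if pattern.search(text):
            reasons.append(name)
            level = max(level, rule_level)
    return level, reasons
```

Returning the rule names alongside the level keeps a record of the rationale behind each classification, which supports the documentation practice listed above.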

Questions to ask your team

  • What types of data does your workload process (e.g., personal data, financial data, health data)?
  • How is the data categorized based on its sensitivity and criticality?
  • Who is the designated data owner for each type of data?
  • Where is the data stored, and how is access to it managed?
  • What legal and compliance requirements apply to the data you handle?
  • What specific data protection controls have been implemented for different data classifications?
  • How often do you review and update your data classification policies and practices?
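
Several of these questions can be checked mechanically once a data inventory exists. The sketch below is one possible way to do that, assuming a hypothetical `InventoryEntry` record and a one-year review interval; the field names, the `find_gaps` helper, and the default threshold are illustrative choices, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class InventoryEntry:
    # Hypothetical inventory record; field names are assumptions for this sketch.
    name: str
    data_type: str                  # e.g. "personal", "financial", "health"
    classification: Optional[str]   # None if not yet classified
    owner: Optional[str]            # None if no data owner is assigned
    location: str                   # where the data is stored
    last_reviewed: Optional[date]   # None if the classification was never reviewed


def find_gaps(entry: InventoryEntry, today: date, max_age_days: int = 365) -> List[str]:
    """List the open questions for one inventory entry."""
    gaps = []
    if not entry.classification:
        gaps.append("no classification")
    if not entry.owner:
        gaps.append("no data owner")
    if entry.last_reviewed is None:
        gaps.append("never reviewed")
    elif today - entry.last_reviewed > timedelta(days=max_age_days):
        gaps.append("review overdue")
    return gaps
```

Passing `today` explicitly rather than reading the system clock inside the function keeps the check repeatable, for example when it is run as part of an audit.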

Who should be doing this?

Data Owner

  • Identify and classify data within the workload.
  • Determine the sensitivity and criticality of the data.
  • Specify the appropriate protection and retention controls for the data.
  • Ensure compliance with applicable legal and regulatory requirements.

Data Steward

  • Assist in identifying and categorizing data types.
  • Document data classification criteria and processes.
  • Monitor data handling practices to ensure compliance with classification standards.
  • Provide training and guidance to team members on data classification.

Compliance Officer

  • Identify applicable legal and compliance requirements for data classification.
  • Ensure that data classification aligns with organizational policies and regulations.
  • Conduct audits to verify compliance with data protection regulations.
  • Provide recommendations for data control measures.

IT Security Specialist

  • Evaluate and implement security controls based on data classification.
  • Conduct risk assessments related to data protection.
  • Monitor data access and usage to prevent unauthorized access.
  • Collaborate with data owners to ensure data is properly protected.

Project Manager

  • Oversee the data classification process within the project.
  • Coordinate between data owners, stewards, and security specialists.
  • Ensure timelines and milestones for the classification project are met.
  • Facilitate communication regarding data classification requirements and updates.

What evidence shows this is happening in your organization?

  • Data Classification Policy: A formal document outlining the criteria for classifying data based on sensitivity and criticality, including definitions for data categories and the procedures for handling each category.
  • Data Inventory Checklist: A checklist designed to help teams identify and document all data types within a workload, including sensitive and critical data, along with their storage locations and data owners.
  • Data Classification Framework: A diagrammatic representation of the data classification process within the organization, illustrating how data is categorized, stored, and managed according to defined policies.
  • Legal and Compliance Requirements Matrix: A matrix that maps data classifications to specific legal and compliance requirements, helping teams understand necessary controls and obligations based on data type.
  • Data Protection Strategy Guide: A guide that outlines strategies and best practices for protecting classified data, detailing the security measures and controls that should be applied based on the classification.
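
A requirements matrix like the one described above can also be kept in machine-readable form so that tooling can look up the controls for a given classification. The following is a minimal sketch: the classification names, the listed requirements, and the controls are placeholders chosen for illustration, and `required_controls` is a hypothetical helper. The actual mapping has to come from your own legal and compliance review.

```python
# Illustrative matrix only: every entry below is a placeholder, not legal guidance.
REQUIREMENTS_MATRIX = {
    "public": {
        "requirements": [],
        "controls": ["integrity checks"],
    },
    "confidential": {
        "requirements": ["internal data handling policy"],
        "controls": ["encryption at rest", "role-based access"],
    },
    "sensitive": {
        "requirements": ["GDPR", "HIPAA"],
        "controls": ["encryption at rest", "role-based access", "access logging"],
    },
}


def required_controls(classification: str) -> list:
    """Look up the controls for a classification, rejecting unknown labels."""
    key = classification.strip().lower()
    if key not in REQUIREMENTS_MATRIX:
        # Fail loudly so unclassified or mislabeled data is not silently left unprotected.
        raise ValueError(f"unknown classification: {classification!r}")
    return list(REQUIREMENTS_MATRIX[key]["controls"])
```

Raising an error for an unknown label is a deliberate choice in this sketch: treating unrecognized classifications as "public" by default would hide gaps in the inventory.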

Cloud Services

AWS

  • Amazon Macie: Uses machine learning to automatically discover, classify, and protect sensitive data stored in Amazon S3.
  • AWS Glue Data Catalog: Provides a centralized metadata repository to store and manage data definitions and their classifications for data stored in various sources.
  • AWS Identity and Access Management (IAM): Controls which users and roles can access AWS resources, so that access can be restricted according to the sensitivity of the data.

Azure

  • Azure Information Protection (now Microsoft Purview Information Protection): Helps organizations classify and protect data based on its sensitivity, applying labels automatically or manually.
  • Azure Purview (now Microsoft Purview): Provides unified data governance for classifying, mapping, and managing data assets across Azure and on-premises environments.
  • Azure Active Directory (now Microsoft Entra ID): Manages user access and security policies in line with data sensitivity and compliance requirements.

Google Cloud Platform

  • Google Cloud Data Loss Prevention (DLP), now part of Sensitive Data Protection: Provides tools to automatically discover, classify, and protect sensitive data across various Google Cloud data stores.
  • Google Cloud Asset Inventory: Helps you track your assets and their metadata, including their classifications, across your Google Cloud resources.
  • Google Cloud IAM: Enables you to manage access to your Google Cloud resources based on data classification and user roles.

Question: How do you classify your data?
Pillar: Security (Code: SEC)