Encryption, Pseudonymisation, and Anonymisation

Encryption, pseudonymisation, and anonymisation are three distinct data protection techniques that GDPR treats very differently. Encryption and pseudonymisation are security measures that apply to personal data — they reduce risk but do not change the data’s status as personal data subject to GDPR. True anonymisation removes personal data from GDPR’s scope entirely. The distinction matters enormously: organisations that treat pseudonymised data as ‘effectively anonymous’ and reduce their GDPR compliance obligations accordingly are making a compliance error with potentially serious consequences.

GDPR cites pseudonymisation as a technique that ‘can reduce the risks to the data subjects concerned and help controllers and processors to meet their data protection obligations’ (Recital 28), and names it as one of the specific security measures under Article 32(1)(a). Anonymisation, by contrast, is addressed only by implication: Recital 26 and the definition of personal data make clear that information which cannot reasonably be used to identify an individual is not personal data and falls outside GDPR’s scope.

 

Encryption: Protecting Data in Transit and at Rest

Encryption transforms readable data into an unreadable format that can be reversed only with the correct key. Under GDPR, encryption does not change the status of personal data — encrypted personal data is still personal data. However, encryption is relevant to GDPR in two specific ways: as a security measure under Article 32, and as a factor in breach risk assessment under Articles 33 and 34, where strong encryption that renders the data unintelligible to any unauthorised recipient can take a breach below the notification threshold and, under Article 34(3)(a), remove the obligation to communicate the breach to data subjects.

ENCRYPTION — IMPLEMENTATION REQUIREMENTS BY CONTEXT

| Context | Encryption Requirement | Standard |
|---|---|---|
| Data in transit (network transmission) | All personal data transmitted over public networks must be encrypted; internal network transmission should also be encrypted for sensitive data | TLS 1.2 minimum; TLS 1.3 preferred; certificate management; HSTS for web applications; no unencrypted FTP or HTTP for personal data |
| Data at rest (storage) | Sensitive and special category data must be encrypted at rest; all personal data should be encrypted at rest as best practice | AES-256 for storage encryption; database-level or field-level encryption for special category data; key management separate from data |
| Laptops and mobile devices | All portable devices capable of containing personal data must use full-disk encryption | BitLocker / FileVault / equivalent; enforced via MDM; encryption status monitored and reported |
| Backup and archive media | Backup media containing personal data must be encrypted; applies to tapes, portable drives, and cloud backup | Same encryption standards as primary storage; key management must allow recovery; off-site media encrypted |
| Email containing personal data | Special category or high-risk personal data should not be sent via unencrypted email; use secure messaging or encrypted attachments | S/MIME or PGP for email encryption; secure file transfer portal for bulk personal data; avoid email for sensitive data transfer as a policy |
| Cloud storage | Personal data stored in cloud services must use provider encryption with customer-managed keys for sensitive data | Customer-managed keys (CMEK) for special category data; confirm provider encryption standards; verify data location and access controls |
**KEY IDEA:** Encryption is the most powerful supplementary measure in the transfer impact assessment context (post-Schrems II). Where a controller transfers personal data to a non-adequate country and holds the encryption keys such that the data importer in the destination country can only ever access encrypted ciphertext, the effectiveness of government access powers in that country is substantially reduced. This is the most effective technical supplementary measure identified by the EDPB for high-risk transfers.

 

Pseudonymisation: Risk Reduction, Not Exemption

Pseudonymisation replaces directly identifying information (name, email, ID number) with a pseudonym or code, while retaining the ability to re-identify the individual using a separately held key or mapping table. Article 4(5) defines pseudonymisation as ‘the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.’

Pseudonymised data remains personal data under GDPR, because re-identification is possible by any party that holds the mapping table. The benefit of pseudonymisation is risk reduction: if the pseudonymised dataset is exfiltrated without the mapping table, the harm is substantially lower than the loss of fully identified data. This has practical implications for breach risk assessment, DPIAs, and security measure design.

PSEUDONYMISATION — GDPR BENEFITS AND LIMITATIONS

| Context | Pseudonymisation Benefit | What It Does NOT Change |
|---|---|---|
| Breach risk assessment (Art. 33) | A breach involving only pseudonymised data — where the mapping table is not also compromised — may be assessed as lower risk, potentially below the DPA notification threshold | Still a personal data breach; still requires breach register entry; still requires risk assessment; mapping table breach would be high risk |
| DPIA risk mitigation (Art. 35) | Pseudonymisation is explicitly cited in GDPR as a risk-reduction measure; implementing it can reduce residual risk in a DPIA | Does not eliminate the need for a DPIA; does not change the legal basis requirement; does not change data subject rights obligations |
| Article 32 security measures | Pseudonymisation of sensitive fields reduces impact of a data exfiltration event; reduces the ‘harm if breached’ dimension of risk | Does not replace other security measures; pseudonymisation alone is not sufficient security for high-risk processing |
| Research and analytics (Art. 89) | Pseudonymisation is the preferred technique for scientific research and statistical processing under Art. 89 safeguards | Research must still have a lawful basis; data subjects retain rights unless specific research exemptions apply |
| Cross-border transfers | Pseudonymised data transferred to non-EEA countries with key held in EEA reduces risk; can be cited as supplementary TIA measure | Transfer mechanism still required; does not substitute for SCCs or adequacy |

 

Anonymisation: Outside GDPR’s Scope, But Hard to Achieve

Truly anonymous data — data from which all identifying information has been irreversibly removed such that no individual can be identified, even indirectly or in combination with other datasets — falls outside GDPR’s definition of personal data and is not subject to GDPR’s requirements. This is a powerful compliance benefit: anonymised datasets can be retained indefinitely, shared without GDPR restriction, and used for any purpose without a lawful basis.

The problem is that true anonymisation is far harder to achieve than organisations typically assume. The Article 29 Working Party’s Opinion 05/2014 on anonymisation techniques (pre-GDPR guidance from the EDPB’s predecessor, still the regulators’ reference point) established the three-part test for genuine anonymisation: the data must be resistant to singling out (identifying a specific individual), linkability (linking records relating to the same individual), and inference (deducing information about an individual from the dataset). Most ‘anonymised’ datasets fail at least one of these tests, particularly in the age of large-scale data combination and re-identification research.

ANONYMISATION TECHNIQUE COMPARISON

| Technique | How It Works | Re-identification Risk | Genuine Anonymisation? |
|---|---|---|---|
| Name and direct identifier removal | Removing name, email, ID, phone from dataset | Very high — individuals can often be re-identified from demographic combinations | Rarely sufficient alone; residual quasi-identifiers (postcode, age, gender, employer) enable re-identification |
| Data generalisation / aggregation | Replacing precise values with ranges (age 35 → 30–40; postcode SW1A 2AA → SW1) | Moderate to low depending on implementation; k-anonymity and l-diversity frameworks manage this systematically | Can be sufficient if implemented rigorously with k-anonymity ≥ 5 and sensitivity testing; requires expert review |
| Data masking / redaction | Replacing values with asterisks, blanks, or placeholder values for output purposes | High if original data retained; masking is presentation-layer only; not genuine anonymisation | Not anonymisation — original data still exists; useful for access control, not for scope removal |
| Noise addition / differential privacy | Adding statistical noise to dataset so individual values cannot be determined with certainty | Low if implemented correctly with sufficient noise; formal differential privacy provides mathematical guarantee | Can achieve genuine anonymisation for aggregate outputs; requires statistical expertise; not suitable for record-level sharing |
| Synthetic data generation | Replacing real data with statistically similar generated data that preserves distributions but contains no real individuals’ records | Very low if properly generated; no real individual represented in the dataset | Can achieve genuine anonymisation; increasingly used for ML training and testing; requires validation that no real data is reproduced |
**IMPORTANT:** The most common anonymisation error in practice is treating pseudonymisation as anonymisation. An organisation that replaces customer IDs with random tokens, retains the token mapping table internally, and then treats the resulting dataset as ‘anonymised data outside GDPR’ has not anonymised the data — it has pseudonymised it. The data is still personal data, GDPR still applies, and any compliance decisions made on the assumption of anonymisation are incorrect. Before treating any dataset as anonymous, apply the Article 29 Working Party’s three-part test (singling out, linkability, inference) and document the analysis.

 

Practical Decision Framework

ENCRYPTION / PSEUDONYMISATION / ANONYMISATION — SELECTION GUIDE

| Objective | Recommended Technique | Key Consideration |
|---|---|---|
| Protect data if intercepted in transit or stolen at rest | Encryption (at rest and in transit) | GDPR still applies; breach risk reduced; key management is critical |
| Reduce risk of a data breach event | Pseudonymisation of sensitive fields | GDPR still applies; re-identification possible via mapping table; mapping table must be secured separately |
| Enable data to be shared externally without GDPR obligations | Genuine anonymisation (aggregation, differential privacy, synthetic data) | Apply the Article 29 Working Party’s three-part test; document the analysis; if any doubt, treat as personal data |
| Use production data safely in development / testing environments | Pseudonymisation of all personal data before copying to test environment | Test environment should never contain identifiable production personal data; pseudonymisation is minimum standard |
| Produce analytics and statistics without retaining identifiable data | Aggregation / differential privacy for outputs; pseudonymisation of underlying data | Outputs may be non-personal data; underlying processing still subject to GDPR; document the anonymisation analysis |
| Transfer data to non-EEA country and reduce TIA risk | Encryption with EEA-held keys (data exporter retains keys; importer processes ciphertext only) | Most effective supplementary measure per EDPB; importer must not hold decryption keys; verify technically enforceable |