Data Breach Is Not One Thing — And Neither Is Data Security

Core thesis

Privacy law layered three semantically loaded terms on top of each other — data security, data breach, and personal data breach — and then attached a control list to the foundation term that was already overloaded. The result is a regulation whose definitions are recursive, whose controls sit at four different abstraction layers, and whose practical effect is to condition professionals into checklist behaviour rather than threat-driven reasoning. TLCTC dissolves the confusion by separating cause (threat cluster) from outcome (Data Risk Event) from consequence (Business Risk Event).

For many years, the phrase data breach carried a very specific picture in the minds of security practitioners: an attacker got access to information they were not supposed to see. A database was dumped. Customer records were exfiltrated. Credentials surfaced on a dark web forum.

In that everyday cybersecurity sense, “data breach” mostly meant one thing:

+ [DRE: C]

Loss of Confidentiality.

Then privacy law arrived, and it did not merely rename the concept. It widened it, redefined it in terms of another overloaded word, and bolted a control list onto the back of that definition. The combined effect produced a regulatory edifice that looks rigorous but, on closer inspection, sits on shifting sand.

This post does three things:

it separates the three overloaded words that collide in GDPR-style privacy regulation;
it shows how that collision conditions professionals into checklist behaviour;
it offers the TLCTC alternative — cause first, outcome second, consequence third.

The first word: data breach (the cybersecurity term)

In operational cybersecurity language, “data breach” became strongly associated with confidentiality failures.

Typical examples:

an attacker extracts a customer database;
a cloud bucket is exposed to the internet;
credentials are leaked;
an employee sends sensitive records to the wrong recipient;
a malicious insider copies files;
a web application vulnerability allows unauthorised data access.

In TLCTC notation:

+ [DRE: C]

Data Risk Event: Loss of Confidentiality.

A worked example:

#2 + [DRE: C]

A server-side implementation flaw, such as SQL injection, is exploited and customer data is read. The threat cluster is #2 Exploiting Server. The outcome is Loss of Confidentiality.

Another:

#9 + [DRE: C] → #4 → #1 + [DRE: C]

A phishing message manipulates a human into revealing credentials. The credential disclosure is a confidentiality event. The later use of those credentials is #4 Identity Theft, and the export of customer records through legitimate application functions is #1 Abuse of Functions, again producing Loss of Confidentiality.

In both examples, “data breach” may be the business or legal label. But TLCTC does not classify the label. It classifies the cause.

This is the heart of the framework: TLCTC is cause-side. It classifies how compromise happens, not what happens afterwards. Outcomes such as “data breach,” “service outage,” or “ransomware impact” are recorded separately as Data Risk Events or Business Risk Events.
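The path notation used above is regular enough to read mechanically. What follows is a minimal Python sketch, not part of the TLCTC specification: the names `Step`, `parse_path`, and `CLUSTER_NAMES` are hypothetical, and only the clusters named in this post are listed. It shows that the cause (cluster numbers) and the outcome (DRE tags) can be pulled out of one path as two separate lists.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Only the clusters named in this post; the remaining clusters are omitted.
CLUSTER_NAMES = {
    1: "Abuse of Functions",
    2: "Exploiting Server",
    4: "Identity Theft",
    7: "Malware",
    9: "Social Engineering",
    10: "Supply Chain Attack",
}

DRE_TYPES = {"C", "I", "Av", "Ac"}

@dataclass(frozen=True)
class Step:
    cluster: int           # cause side: threat cluster number
    dre: Optional[str]     # outcome recorded at this step, or None

# One step looks like "#9" or "#9 + [DRE: C]".
_STEP = re.compile(r"^#(\d+)(?:\s*\+\s*\[DRE:\s*(\w+)\])?$")

def parse_path(notation: str) -> List[Step]:
    """Split a path on the arrow and parse each step."""
    steps = []
    for part in notation.split("→"):
        match = _STEP.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognised step: {part.strip()!r}")
        dre = match.group(2)
        if dre is not None and dre not in DRE_TYPES:
            raise ValueError(f"unknown DRE type: {dre!r}")
        steps.append(Step(int(match.group(1)), dre))
    return steps

path = parse_path("#9 + [DRE: C] → #4 → #1 + [DRE: C]")
causes = [s.cluster for s in path]                  # what TLCTC classifies
outcomes = [s.dre for s in path if s.dre]           # recorded separately
cause_names = [CLUSTER_NAMES.get(c, f"#{c}") for c in causes]
```

The design point is that `causes` and `outcomes` never share a field: the label "data breach" would live in neither.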

So far, so clean.

Then the law walked in.

The second word: personal data breach (the legal term)

Under GDPR Article 4(12), a personal data breach means:

“a breach of security leading to the accidental or unlawful destruction, loss, alteration, unauthorised disclosure of, or access to, personal data transmitted, stored or otherwise processed.”

The UK ICO and EDPB guidance treat this the same way: a personal data breach is a security incident affecting the confidentiality, integrity, or availability of personal data.

This is a substantively broader concept than the cybersecurity sense of “data breach.” It is no longer about disclosure alone. It covers destruction, loss, alteration, and unavailability — and it explicitly includes accidental events, not only hostile attackers.

Privacy/legal teams are therefore not only concerned with:

Who saw the data?

They are also concerned with:

Was the data changed?

Was the data destroyed?

Was access to the data lost?

Was the data rendered unusable?

Did the controller lose control over the personal data?

A personal data breach can be a confidentiality event, but it can also be an integrity event, an availability event, or an accessibility event.

So the legal term widens the scope. Good — that reflects real harm to data subjects.

But notice how the law defines it: “a breach of security leading to…”

The definition is recursive. The breach is defined in terms of another word — “security” — which the law itself has already loaded with a separate meaning. And that is where the third overloaded word enters.

The third word: data security (the Article 32 term)

GDPR Article 32 is titled “Security of processing.” It obliges controllers and processors to implement “appropriate technical and organisational measures” (TOMs) to protect personal data. It lists, among others:

pseudonymisation and encryption of personal data;
the ability to ensure the ongoing confidentiality, integrity, availability, and resilience of processing systems and services;
the ability to restore the availability of and access to personal data in a timely manner in the event of a physical or technical incident;
a process for regularly testing, assessing, and evaluating the effectiveness of measures.

In that list, “security” names a legal obligation: an Article 32 compliance scope. It does not mean the same thing as “security” in the technical CIA-triad sense, even though the word is identical.

So now we have four meanings circulating under three similar names:

Term	What it really means
Data security (technical)	The property of preventing unauthorised disclosure, alteration, destruction, or denial-of-use of data
Data security (Art. 32)	The legal obligation to deploy appropriate TOMs to protect personal data
Data breach (cybersecurity)	Usually `[DRE: C]` — unauthorised disclosure
Personal data breach (Art. 4(12))	Any breach of security affecting C/I/Av of personal data — including accidental

The legal definition of personal data breach therefore rests on a term — “security” — which the same regulation simultaneously uses as a compliance scope. That is what makes the foundation recursive: you cannot say what a personal data breach is without first deciding which “security” you mean — the technical property, or the legal obligation.

For lawyers, that ambiguity is tolerable. For security architects, it is corrosive: once “security” means both “a property of the system” and “a checklist of obligations,” practitioners stop asking “which threat cluster does this control defend against?” and start asking “can I produce evidence that we did the listed item?”

TLCTC translation: legal wording into Data Risk Events

TLCTC restores precision on the outcome side by separating four Data Risk Event subtypes:

Legal/privacy wording	TLCTC Data Risk Event
Unauthorised disclosure of personal data	`+ [DRE: C]` Loss of Confidentiality
Unauthorised access to personal data	`+ [DRE: C]` Loss of Confidentiality
Alteration of personal data	`+ [DRE: I]` Loss of Integrity
Destruction of personal data	`+ [DRE: Av]` Loss of Availability
Loss of personal data	usually `+ [DRE: Av]`, depending on context
Personal data still exists but cannot be used	`+ [DRE: Ac]` Loss of Accessibility

Availability and accessibility are not the same thing.

If personal data is deleted and no usable copy remains, the issue is availability:

+ [DRE: Av]

The data is gone or unreachable.

If personal data still exists but authorised users cannot use it — for example because ransomware encrypted it — the issue is accessibility:

+ [DRE: Ac]

The data is present, but unusable.

The distinction matters. Ransomware encryption is not automatically the same as destruction. If the encrypted files still exist and the infrastructure can technically reach them, the more precise TLCTC event is Loss of Accessibility, not Loss of Availability.
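The availability/accessibility split can be written as a small decision rule. This is a hedged sketch: the three boolean flags (`exists`, `reachable`, `usable`) are a simplification of the wording above, and `DataState` and `classify_av_ac` are hypothetical names, not TLCTC definitions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DataState:
    exists: bool      # a copy of the data still exists
    reachable: bool   # the infrastructure can technically reach it
    usable: bool      # authorised users or processes can actually use it

def classify_av_ac(state: DataState) -> Optional[str]:
    """Return "Av", "Ac", or None if neither dimension is affected."""
    if not state.exists or not state.reachable:
        return "Av"   # the data is gone or unreachable
    if not state.usable:
        return "Ac"   # the data is present, but unusable
    return None

ransomware_encrypted = DataState(exists=True, reachable=True, usable=False)
records_deleted = DataState(exists=False, reachable=False, usable=False)
healthy = DataState(exists=True, reachable=True, usable=True)
```

Under these assumptions the ransomware case resolves to `"Ac"` and the deletion case to `"Av"`, which is the distinction the paragraph above draws.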

A note in passing: integrity is not accuracy

The integrity row in the table above hides a second semantic collision that deserves its own essay.

Under Article 32 and CIA-triad thinking, integrity means non-tampering — the data is in the state in which it was last legitimately written. That is what TLCTC [DRE: I] captures, and that is what a personal data breach via “alteration” refers to in Article 4(12).

But GDPR also imposes, in Article 5(1)(d) and Article 16, a quite different obligation: that personal data shall be accurate and, where necessary, kept up to date, with the data subject entitled to rectification. This is not about tampering. It is about whether the record correctly reflects real-world facts about the data subject — their address, their employment status, their consent state, their stated preferences.

These two properties can diverge. A personnel record can be perfectly tamper-free and yet wrong — an outdated address, a stale employment status, an incorrect consent flag. Conversely, a record can have a broken integrity history — an attacker briefly altered it and a recovery process restored it — while ending up in a correct final state.

Only the technical-integrity sense maps to a TLCTC [DRE: I] event with a causal threat path attached. An inaccurate-but-untampered record is a data-quality and compliance problem under the accuracy principle — important, sometimes very expensive, but not a threat cluster. There is no generic vulnerability being exploited; nothing to classify on the cause side.

The conflation of these two distinct properties under the single English word integrity adds yet another layer to the semantic confusion described in this post.

Why “data loss” is a dangerous phrase

“Data loss” is one of the most ambiguous phrases in the whole discussion.

It can mean:

loss of confidentiality — the data was leaked;
loss of availability — the data was deleted or no longer reachable;
loss of control — the organisation no longer knows where the data is or who has it;
loss of possession — a device or storage medium containing personal data was lost;
loss of accessibility — the data exists but cannot be used.

These are not the same risk.

A stolen laptop containing unencrypted HR records is mainly:

+ [DRE: C]

A failed backup process that leaves the organisation unable to restore employee records may be:

+ [DRE: Av]

A ransomware case where the files remain on disk but are encrypted is:

+ [DRE: Ac]

A malicious modification of patient records is:

+ [DRE: I]

Calling all of these “data loss” hides the difference between disclosure, destruction, corruption, and unusability — exactly the semantic diffusion TLCTC is designed to prevent.

Same legal label, many causal paths

A privacy lawyer may say:

“We had a personal data breach.”

That may be legally correct.

But from a TLCTC perspective, the next question is:

“Which causal path produced the Data Risk Event?”

The same legal breach category can arise from many different threat paths.

1. SQL injection exposes customer data

#2 + [DRE: C]

The attacker exploits a server-side implementation flaw.
Legal label: personal data breach.
TLCTC cause: #2 Exploiting Server.
Data Risk Event: Loss of Confidentiality.

2. Phishing leads to mailbox access

#9 + [DRE: C] → #4 + [DRE: C]

The attacker manipulates a human, obtains credentials, then uses them to access a mailbox.
Legal label: personal data breach.
TLCTC cause path: #9 Social Engineering → #4 Identity Theft.
Data Risk Event: Loss of Confidentiality.

3. Attacker modifies payroll records

#4 → #1 + [DRE: I]

The attacker uses stolen credentials and abuses legitimate payroll functions to alter bank account data.
Legal label: personal data breach.
TLCTC cause path: #4 Identity Theft → #1 Abuse of Functions.
Data Risk Event: Loss of Integrity.

4. Ransomware encrypts patient records

#9 → #7 + [DRE: Ac]

The attacker manipulates a human, malware executes, and personal data becomes unusable.
Legal label: personal data breach.
TLCTC cause path: #9 Social Engineering → #7 Malware.
Data Risk Event: Loss of Accessibility.

This is not primarily a Loss of Availability if the files still exist and the infrastructure can still reach them. The problem is that authorised processes can no longer use the data.

5. Malware deletes records

#7 + [DRE: Av]

Foreign executable content runs and deletes personal data.
Legal label: personal data breach.
TLCTC cause: #7 Malware.
Data Risk Event: Loss of Availability.

The data is gone or no longer reachable in a usable technical sense.
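The five scenarios can be laid out as data to show how little the legal label distinguishes. This is an illustrative sketch only: `SCENARIOS`, `by_label`, and `by_cause` are hypothetical names, and each scenario records just the final Data Risk Event of its path, which flattens the intermediate `[DRE: C]` in the phishing example.

```python
from collections import defaultdict

LEGAL_LABEL = "personal data breach"

# (scenario, cause path as cluster numbers, final Data Risk Event)
SCENARIOS = [
    ("SQL injection exposes customer data", (2,), "C"),
    ("Phishing leads to mailbox access", (9, 4), "C"),
    ("Attacker modifies payroll records", (4, 1), "I"),
    ("Ransomware encrypts patient records", (9, 7), "Ac"),
    ("Malware deletes records", (7,), "Av"),
]

by_label = defaultdict(list)   # grouping a lawyer would produce
by_cause = defaultdict(list)   # grouping TLCTC asks for

for name, path, dre in SCENARIOS:
    by_label[LEGAL_LABEL].append(name)
    by_cause[(path, dre)].append(name)

distinct_dres = sorted({dre for _, _, dre in SCENARIOS})
```

Grouping by legal label yields one bucket holding all five incidents; grouping by cause path and DRE yields five buckets of one.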

Accidental personal data breaches are different again

Privacy law also includes accidental breaches. That creates another important TLCTC distinction: not every personal data breach is a cyber threat path.

If an employee accidentally emails a spreadsheet to the wrong recipient, there may be a personal data breach. But if there is no attacker exploiting a generic vulnerability, there may be no TLCTC threat cluster to classify.

That does not make the event irrelevant. It simply means it belongs elsewhere in the Bow-Tie model: an operational error, a process failure, a control failure, a compliance event, a Data Risk Event, a Business Risk Event — but not a threat cluster.

TLCTC must not invent a threat cluster where there is no threat step.

This is where the framework is strict: control failure is not a threat, and outcome is not cause.

The consequence chain: DRE is not the end

In privacy discussions, people often jump directly from “breach” to “notification,” “fine,” or “reputation damage.”

TLCTC slows this down. The consequence side should be read as a chain:

SRE → DRE → BRE₁ → BRE₂ → BRE₃ ...

A System Risk Event (SRE), the loss of control over a system, may produce a Data Risk Event (DRE). That DRE may then trigger Business Risk Events (BREs).

Example:

#4 → #1 + [DRE: C]

The consequence chain then unfolds on the right side of the Bow-Tie, read left to right:

SRE → DRE [C] → BRE₁ regulatory notification
              → BRE₂ customer notification
              → BRE₃ media coverage
              → BRE₄ customer churn
              → BRE₅ regulatory fine

The legal notification is not the threat. The fine is not the threat. The reputational damage is not the threat. They are consequence-side events.

This matters for control design. A preventive control against SQL injection acts on the left side of the Bow-Tie. A breach notification process acts on the right side. Both matter, but they do not control the same thing.
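One way to keep the two sides apart in an incident record is to store them in separate fields. The sketch below is an assumption-laden illustration, not a TLCTC artefact: `Incident`, `chain`, and `CONTROL_SIDE` are hypothetical names, and the BREs are rendered as a simple linear chain.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Incident:
    cause_path: Tuple[int, ...]                    # left side: threat clusters
    dre: str                                       # Data Risk Event type
    bres: List[str] = field(default_factory=list)  # right side: consequences

    def chain(self) -> str:
        """Render the consequence side as SRE → DRE → BRE1 → BRE2 ..."""
        parts = ["SRE", f"DRE [{self.dre}]"]
        parts += [f"BRE{i}: {name}" for i, name in enumerate(self.bres, start=1)]
        return " → ".join(parts)

incident = Incident(
    cause_path=(4, 1),
    dre="C",
    bres=[
        "regulatory notification",
        "customer notification",
        "media coverage",
        "customer churn",
        "regulatory fine",
    ],
)

# The two controls named above, tagged by the side of the Bow-Tie they act on.
CONTROL_SIDE = {
    "preventive control against SQL injection": "left",
    "breach notification process": "right",
}
```

Because `cause_path` and `bres` are different fields, a notification or a fine can never be recorded as if it were a threat.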

Why the distinction matters for controls

If we say only “data breach,” we may select the wrong controls.

A confidentiality breach suggests controls such as:

access control;
encryption;
data loss prevention;
least privilege;
monitoring of exports;
secure authentication;
query restrictions.

An integrity breach suggests additional controls:

change approval;
tamper detection;
transaction logging;
reconciliation;
versioning;
integrity checks;
maker-checker workflows.

An availability breach suggests:

backup;
replication;
disaster recovery;
storage resilience;
deletion protection;
immutable recovery points.

An accessibility breach suggests:

ransomware resilience;
key management;
recovery testing;
endpoint containment;
privilege restriction;
segmentation;
rapid isolation.

The same legal phrase — personal data breach — may therefore require very different technical and organisational responses.

That is the practical value of TLCTC: it does not stop at the legal label. It asks what happened causally, which data-risk dimension was affected, and which business consequences followed.
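The four control lists above amount to a lookup keyed by Data Risk Event. A minimal sketch, assuming the lists are taken verbatim from this post; `CONTROLS_BY_DRE` and `candidate_controls` are hypothetical names, and the result is a starting shortlist rather than a complete control design.

```python
from typing import Iterable, List

CONTROLS_BY_DRE = {
    "C": ["access control", "encryption", "data loss prevention",
          "least privilege", "monitoring of exports",
          "secure authentication", "query restrictions"],
    "I": ["change approval", "tamper detection", "transaction logging",
          "reconciliation", "versioning", "integrity checks",
          "maker-checker workflows"],
    "Av": ["backup", "replication", "disaster recovery",
           "storage resilience", "deletion protection",
           "immutable recovery points"],
    "Ac": ["ransomware resilience", "key management", "recovery testing",
           "endpoint containment", "privilege restriction",
           "segmentation", "rapid isolation"],
}

def candidate_controls(dres: Iterable[str]) -> List[str]:
    """Ordered, de-duplicated union of the control lists for the given DREs."""
    seen = set()
    result = []
    for dre in dres:
        for control in CONTROLS_BY_DRE[dre]:
            if control not in seen:
                seen.add(control)
                result.append(control)
    return result

for_disclosure = candidate_controls(["C"])
for_ransomware = candidate_controls(["Ac", "Av"])
```

The input is a DRE type, never the phrase "personal data breach", so the shortlist changes with the dimension that was actually affected.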

But here is the problem. Privacy regulation does not ask any of these questions on the practitioner's behalf. Article 32 lists controls without ever stating which threat cluster they prevent, which Data Risk Event they limit, or which side of the Bow-Tie they act on. That is where the checklist trap forms.

Article 32 and the checklist trap

Re-read Article 32 with an architect's eye, and a structural problem appears immediately. The article lists controls drawn from four different abstraction layers in a single sentence:

Article 32 item	Abstraction layer	TLCTC view
(a) Pseudonymisation, encryption	Specific technical control	Indexable to clusters (mitigates `[DRE: C]` after #2, #4, #7, #10)
(b) “Ongoing confidentiality, integrity, availability, and resilience”	Outcome property	These are DREs and system properties, not controls
(c) “Ability to restore availability and access”	Recovery capability	Right-side of the Bow-Tie — limits BREs, does not prevent compromise
(d) Process for testing and evaluating measures	Meta-process	Not a direct control; a control about controls

This is the messy part. There is no organising principle. A practitioner reading Article 32 cannot derive a threat model from it; only a checklist.

It is also accidental. Encryption and pseudonymisation are named explicitly, but none of the controls that prevent the dominant causal clusters appear in the text: firewalling, network segmentation, identity and access management, patch management, secure SDLC, supply-chain due diligence, anti-phishing training, MFA, EDR, backup immutability. Why are encryption and pseudonymisation singled out? The most plausible explanation is that they were politically salient in the 2014–2016 negotiating period. The list is a snapshot of regulator anxiety, not a model of how attackers actually compromise personal data.

The conditioning effect then follows mechanically:

The regulator publishes an unstructured list of controls.
Supervisory authorities and auditors verify presence of the listed items.
DPOs build internal checklists from the auditor template.
Security teams optimise for “we can show evidence of pseudonymisation, encryption, and tested restore.”
Practitioners learn that doing security equals producing evidence against a list.
The cause-side threat-modelling instinct atrophies. Item-presence thinking replaces threat-cluster thinking.
Net result: encryption is everywhere, but the dominant attack paths (#9 → #4, #10 trust acceptance, #2 server flaws, #7 ransomware) continue to drive breach statistics.

The regulator wanted fewer harms to data subjects. The regulator got more paperwork.

That is not a failure of intent. It is a failure of grammar. If your definition of breach rests on the word “security,” and your prescribed controls are not indexed to the threats that cause breaches, then no amount of auditing will close the loop between obligation and harm reduction.

The deeper problem: control-first regulation has no foundation

Article 32 is not unique. The pattern of mandating controls without first identifying which threats they address runs through NIS2, DORA, HIPAA, PCI DSS (strictly an industry standard rather than a law), and most modern cybersecurity regulation. The standards and frameworks these regulations cite (ISO/IEC 27001, NIST CSF) call for risk and threat identification first, then control selection. The regulations skip the threat-identification step and jump straight to the controls.

I have argued this case at length in “The Logical Impossibility of Control-First Regulation.” The short version: every examined regulation lacks both a horizontal dimension (which threat does each control address?) and a vertical dimension (does the control prevent compromise or limit damage after compromise?). Without those two dimensions, controls float free of any causal anchor. Compliance and security drift apart.

The breach-versus-personal-data-breach confusion described in this post is one symptom. The Article 32 control salad is another. Both come from the same root: the law defines obligations in terms of outcomes and lists, while attackers operate in terms of causes and paths. The two languages do not connect.

TLCTC connects them. Every TLCTC control statement has a defined causal target:

“MFA prevents #4 Identity Theft on the cause side.”
“Immutable backups limit [DRE: Ac] and [DRE: Av] on the consequence side after #7 Malware.”
“Egress monitoring detects exfiltration along the #1 Abuse of Functions path after #4.”
“SBOM and vendor attestation reduce exposure to #10 Supply Chain Attack at the Trust Acceptance Event.”

Each control names the cluster, the side of the Bow-Tie, and the risk event it acts on. That is what an Article 32 written in TLCTC terms would look like: not a flat list, but a two-dimensional control catalogue indexed to threat clusters and Bow-Tie position.
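A two-dimensional catalogue of this kind can be sketched as a list of entries indexed by cluster and Bow-Tie side. This is an illustration under stated assumptions: `ControlEntry`, `build_matrix`, and `CLUSTERS_IN_POST` are hypothetical names, and where a control statement above does not spell out the side or the DRE (egress monitoring, SBOM), the value in the sketch is my reading, marked in the comments.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class ControlEntry:
    name: str
    cluster: int                 # threat cluster the control is indexed to
    side: str                    # "cause" or "consequence" side of the Bow-Tie
    dres: Tuple[str, ...] = ()   # Data Risk Events the control limits, if any

CATALOGUE = [
    ControlEntry("MFA", 4, "cause"),
    ControlEntry("Immutable backups", 7, "consequence", ("Ac", "Av")),
    # Assumption: side and DRE for egress monitoring are this sketch's reading.
    ControlEntry("Egress monitoring", 1, "consequence", ("C",)),
    # Assumption: reducing exposure is read here as a cause-side effect.
    ControlEntry("SBOM and vendor attestation", 10, "cause"),
]

def build_matrix(catalogue: List[ControlEntry]) -> Dict[Tuple[int, str], List[str]]:
    """Index control names by (cluster, side): one cell per pair."""
    matrix = defaultdict(list)
    for entry in catalogue:
        matrix[(entry.cluster, entry.side)].append(entry.name)
    return dict(matrix)

matrix = build_matrix(CATALOGUE)

# Clusters mentioned in this post that have no control in the catalogue yet.
CLUSTERS_IN_POST = {1, 2, 4, 7, 9, 10}
uncovered = sorted(CLUSTERS_IN_POST - {e.cluster for e in CATALOGUE})
```

Unlike a flat list, the matrix makes gaps visible: with only these four entries, clusters #2 and #9 have no control at all.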

The central thesis

The term data breach has become semantically overloaded.

In everyday cybersecurity speech, it usually means:

Loss of Confidentiality

In privacy/legal speech, under GDPR-style wording, a personal data breach may involve:

Loss of Confidentiality

Loss of Integrity

Loss of Availability

Loss of Accessibility

Both terms rest on a third overloaded word — data security — which the law uses simultaneously as a system property and as a compliance scope.

In TLCTC, none of these are threat clusters. They are Data Risk Events attached to causal attack paths, sitting on the consequence side of the Bow-Tie.

So the correct TLCTC grammar is not:

Data Breach

It is:

#X → #Y + [DRE: C/I/Av/Ac] → BRE...

Or, more concretely:

#2 + [DRE: C]

#4 → #1 + [DRE: I]

#7 + [DRE: Ac]

#7 + [DRE: Av]

The cause remains on the left side.
The Data Risk Event sits on the consequence side.
The legal and business consequences follow after that.

And the controls — if they are to do anything more than satisfy auditors — must be indexed to the cluster they prevent and the DRE they limit, not merely listed in a regulation as items worth doing.

Conclusion: privacy law accidentally proves the TLCTC point

Privacy law did not merely rename data breaches. It widened the concept, redefined it in terms of an already-overloaded word, and attached a control list that sits at four different abstraction layers without any causal organising principle.

A personal data breach is not only about someone seeing data they should not see. It can also be about personal data being changed, destroyed, lost, or made unavailable. That broader legal meaning is useful for protecting data subjects — but only if the controls deployed against it are tied to the threats that actually cause those outcomes.

Article 32, as written, does not make that connection. So practitioners build checklists. Auditors verify checklists. Generations of professionals are conditioned to mistake item-presence for risk reduction. And the original goal — fewer harms to data subjects — recedes further into the distance every audit cycle.

TLCTC restores precision.

It says:

Do not classify the breach. Classify the cause.

Then record the Data Risk Event: + [DRE: C] for confidentiality, + [DRE: I] for integrity, + [DRE: Av] for availability, + [DRE: Ac] for accessibility. Then record the business or legal consequences as BREs. Only then design controls that name the cluster they prevent, the DRE they limit, and the side of the Bow-Tie they act on.

That is the clean separation:

Threat path → Data Risk Event → Business/Legal Consequence

Or in TLCTC language:

Cause first. Outcome second. Consequence third.

If privacy regulation were rewritten on that foundation, Article 32 would not be a list. It would be a matrix — and every cell would answer the only question that ever mattered: which threat cluster does this control prevent, and which Data Risk Event does it limit?

That is the regulation we still owe to data subjects.

References

Regulation (EU) 2016/679, Article 4(12) and Article 32, EUR-Lex: eur-lex.europa.eu
UK Information Commissioner's Office, “Personal data breaches: a guide”: ico.org.uk
European Data Protection Board, “Guidelines 9/2022 on personal data breach notification under GDPR”: edpb.europa.eu
Kreinz, B., “The Logical Impossibility of Control-First Regulation”: tlctc-control-first-regulation.html
TLCTC White Paper v2.1, “Threats Are Causes, Not Outcomes”; TLCTC Glossary v2.0/v2.1, Data Risk Event and Business Risk Event definitions.

About the methodology

This article applies the TLCTC (Top Level Cyber Threat Clusters) v2.1 framework's Bow-Tie model and cause-side taxonomy to privacy law's data-breach vocabulary. TLCTC is a cause-oriented, actor-agnostic cyber threat classification framework that provides 10 non-overlapping threat clusters, each defined by exactly one generic vulnerability. The SRE → DRE → BRE notation represents the three-stage consequence chain: System Risk Event (loss of control), Data Risk Event (impact on data confidentiality, integrity, availability, or accessibility), and Business Risk Event(s) (organisational and legal consequences). For more information: tlctc.net.