Engineering the Bridge: TLCTC × SonarQube

Series Finale — and the integration is shipped

We began with the roles (Part 1) and explored the strategic value (Part 2). Part 3 used to be an architectural sketch in Java. It's now a real deliverable: integrations/sonarqube/ — a Python sidecar plus a zero-code declarative starter. Read on for the design choices, then clone the repo.

The Engineering Challenge: "What" vs "How"

Static Application Security Testing (SAST) tools like SonarQube speak the language of CWE (Common Weakness Enumeration). This is a dictionary of "what" is wrong (e.g., "Improper Neutralization of Input").

Risk Management speaks the language of TLCTC (Top Level Cyber Threat Clusters). This is a taxonomy of "how" and "where" a system is compromised (e.g., "Exploiting Server").

To bridge this gap, we cannot simply map 1-to-1. We need Context-Aware Mapping. A buffer overflow (CWE-120) is not just a bug; it is a potential #2 Exploiting Server event if found in an API, or a #3 Exploiting Client event if found in a browser component.

The 987-entry mapping does the heavy lifting

The previous draft of this article invented its own CWE→TLCTC dictionary. That was a mistake — a canonical mapping already exists in the TLCTC repository, and it is far richer than a hand-written table can be. The integration consumes it directly, never forks it.

mappings/mitre-cwe/tlctc-cwe.json carries 987 reviewed CWE entries. Each entry has a path-notation mapping (e.g. #2, #2 → #7, #2 → #7 | #3, or N/A), a human-readable rationale, a CVE reference list, a contextDependent flag, and a verdict — the confidence tier that tells the SAST pipeline how to treat the finding:

Verdict	Count	Pipeline behaviour
Allowed	756	Classify, tag, count in cluster summary.
Allowed-with-Review	16	Same as Allowed, but excluded from CI gating under `--strict-verdict`.
Discouraged	171	No tag. Surfaced in a collapsible “low-confidence” section. CWE-20 lives here.
Prohibited	44	Silently skipped. Category / View / Deprecated nodes — not concrete weaknesses.
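The verdict table above can be read as a small dispatch rule. The sketch below is illustrative only: the names `Action`, `action_for` and `gates_ci` are hypothetical and not taken from the integration's code, and raising on an unknown verdict is an assumption.

```python
from enum import Enum


class Action(Enum):
    CLASSIFY = "classify"        # tag the issue and count it in the cluster summary
    LOW_CONFIDENCE = "low"       # no tag, listed in the low-confidence section
    SKIP = "skip"                # dropped silently


# Verdict names come from the table; the dictionary itself is a hypothetical helper.
_ACTIONS = {
    "Allowed": Action.CLASSIFY,
    "Allowed-with-Review": Action.CLASSIFY,
    "Discouraged": Action.LOW_CONFIDENCE,
    "Prohibited": Action.SKIP,
}


def action_for(verdict: str) -> Action:
    """Map a verdict string to the pipeline behaviour described in the table."""
    try:
        return _ACTIONS[verdict]
    except KeyError:
        raise ValueError(f"unknown verdict: {verdict!r}") from None


def gates_ci(verdict: str, strict_verdict: bool) -> bool:
    """Whether a finding with this verdict may take part in CI gating."""
    if verdict == "Allowed":
        return True
    if verdict == "Allowed-with-Review":
        # Excluded from gating when --strict-verdict is set.
        return not strict_verdict
    return False
```

Keeping classification and gating as two separate questions mirrors the table: Allowed-with-Review findings are still tagged and counted, they just stop blocking the build under `--strict-verdict`.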

229 of the 987 entries are contextDependent: true — meaning the same CWE maps to different clusters depending on whether the vulnerable code is server-role or client-role (R-ROLE). The integration resolves these at scan time using configurable file-path globs, so a buffer overflow in your backend tags tlctc-02 and the same CWE in a WASM client tags tlctc-03 — without you maintaining a single line of mapping logic.
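As a rough illustration of what resolving an alternation means, the sketch below picks one branch of a mapping string once a role is known. It assumes that the server branch is the one starting with #2 and the client branch the one starting with #3, which matches the examples in this article but is not confirmed as the integration's actual rule. `pick_branch` and `ROLE_ENTRY_CLUSTER` are hypothetical names.

```python
# Assumption: a role is identified by the cluster its branch starts with.
ROLE_ENTRY_CLUSTER = {"server": "#2", "client": "#3"}


def pick_branch(mapping: str, role: str) -> str:
    """Return the branch of an 'A | B' mapping whose first cluster fits the role."""
    branches = [branch.strip() for branch in mapping.split("|")]
    if len(branches) == 1:
        return branches[0]  # not an alternation, nothing to resolve
    wanted = ROLE_ENTRY_CLUSTER[role]
    for branch in branches:
        first_step = branch.split("→")[0].strip()
        if first_step == wanted:
            return branch
    raise ValueError(f"no branch of {mapping!r} starts with {wanted}")
```

For example, `pick_branch("#2 | #3", "client")` returns `"#3"`, and a mapping without an alternation is passed through unchanged.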

Why no “TLCTC-XX.YY” enumeration?

An earlier draft proposed a TLCTC-XX.YY identifier with the .YY reserved for sub-categorization. The integration drops it. There is no source-of-truth for the suffix, the canonical mapping never uses one, SAST findings carry no lifecycle state to anchor an operational ID against, and SonarQube's tag namespace forbids the # character anyway. Instead the integration uses two stable forms: canonical #N notation in every human-facing report (matching the whitepaper and the mapping file) and a normalised tlctc-NN form (lowercase, zero-padded) for SonarQube tags. Two encodings, one cluster identity, zero invented ID space.
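The two encodings are mechanical to convert between. A minimal sketch, assuming cluster numbers have at most two digits; the function names are hypothetical and not taken from the integration:

```python
import re


def to_sonar_tag(cluster: str) -> str:
    """Convert canonical '#N' notation to the lowercase, zero-padded tag form."""
    match = re.fullmatch(r"#(\d{1,2})", cluster.strip())
    if match is None:
        raise ValueError(f"not a canonical cluster id: {cluster!r}")
    return f"tlctc-{int(match.group(1)):02d}"


def from_sonar_tag(tag: str) -> str:
    """Convert a 'tlctc-NN' tag back to canonical '#N' notation."""
    match = re.fullmatch(r"tlctc-(\d{2})", tag)
    if match is None:
        raise ValueError(f"not a TLCTC tag: {tag!r}")
    return f"#{int(match.group(1))}"
```

Because the conversion is lossless in both directions, reports can keep using `#N` while SonarQube only ever sees the tag form.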

The decision flow

With the canonical mapping doing the lookup, the integration's own logic shrinks to a tiny, testable kernel: parse the mapping string into a typed AST, resolve any ambiguous branch against the file path, emit reports. The figure below shows the conceptual flow, and the Python kernel follows it step for step.

Figure 1: Context-Aware Mapping Logic Decision flow

Visualising the decision flow for dynamic tagging.
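The first step of that flow, parsing the path notation into a typed structure, might look like the sketch below. It handles only the forms quoted in this article (`#2`, `#2 → #7`, `#2 → #7 | #3`, `N/A`). The class and function names are hypothetical, and the real parser may accept more syntax than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequence:
    steps: tuple[int, ...]          # (2, 7) for "#2 → #7"


@dataclass(frozen=True)
class Mapping:
    branches: tuple[Sequence, ...]  # empty for "N/A"

    @property
    def is_alternation(self) -> bool:
        return len(self.branches) > 1


def parse_mapping(text: str) -> Mapping:
    """Parse path notation such as '#2 → #7 | #3' into branches of cluster steps."""
    text = text.strip()
    if text == "N/A":
        return Mapping(branches=())
    branches = []
    for raw_branch in text.split("|"):
        steps = []
        for raw_step in raw_branch.split("→"):
            token = raw_step.strip()
            if not (token.startswith("#") and token[1:].isdecimal()):
                raise ValueError(f"bad cluster token: {token!r}")
            steps.append(int(token[1:]))
        branches.append(Sequence(tuple(steps)))
    return Mapping(tuple(branches))
```

Frozen dataclasses keep the parsed result hashable and comparable, which makes the parser easy to cover with plain equality assertions.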

Implementation: a Python sidecar against the Web API

The integration is not a SonarQube Java plugin. It is a CLI that runs in your CI step, calls /api/issues/search, joins each issue's CWE references against the canonical mapping, and writes JSON / Markdown / SARIF reports. Stdlib only: urllib, argparse, tomllib, fnmatch. No pip install, no Maven, no plugin restart cycle. It also runs against SonarCloud, where custom Java plugins cannot be installed.
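To make the stdlib-only claim concrete, here is a minimal sketch of paging through /api/issues/search with urllib. It is not the integration's code. The query parameters (`componentKeys`, `p`, `ps`, `pullRequest`) and the `paging` block in the response follow the SonarQube Web API as I understand it, but parameter names vary between server versions, so check them against your instance. All function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional


def search_url(base_url: str, project_key: str, page: int,
               page_size: int = 100, pull_request: Optional[str] = None) -> str:
    """Build the issues/search URL for one page of results."""
    params = {"componentKeys": project_key, "p": page, "ps": page_size}
    if pull_request:
        params["pullRequest"] = pull_request
    return f"{base_url.rstrip('/')}/api/issues/search?{urllib.parse.urlencode(params)}"


def http_get_json(url: str, token: str) -> dict:
    """GET a URL, sending the token as Basic-auth username with an empty password."""
    credentials = base64.b64encode(f"{token}:".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {credentials}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_issues(fetch: Callable[[int], dict]) -> Iterator[dict]:
    """Yield issues page by page until the reported total has been covered."""
    page = 1
    while True:
        payload = fetch(page)
        issues = payload.get("issues", [])
        yield from issues
        paging = payload.get("paging", {})
        if not issues or page * paging.get("pageSize", 0) >= paging.get("total", 0):
            return
        page += 1
```

Wiring the pieces together is a one-liner such as `iter_issues(lambda p: http_get_json(search_url(base, key, p), token))`. Passing the fetcher in as a callable keeps the paging logic testable without a running server.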

CI step

python -m cli classify \
    --sonar-url "$SONAR_URL" --token "$SONAR_TOKEN" \
    --project-key your.project.key \
    --pull-request "$PR_NUMBER" \
    --out-md    tlctc-pr-comment.md \
    --out-sarif tlctc.sarif \
    --fail-on-cluster '#7,#10'

The CLI pulls issues, applies the verdict filter, runs R-ROLE resolution on alternations, and writes the requested reports (here Markdown and SARIF). The SARIF goes to GitHub code scanning. The Markdown becomes a PR comment. The exit code gates the merge when findings land in clusters you've nominated as blocking (here #7 and #10).
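The gating step can be pictured as a set intersection. The sketch below makes two assumptions that the article does not confirm: a finding "lands in" a cluster when that cluster appears anywhere in its resolved path, and a blocked build is signalled with exit code 1. The function names are hypothetical.

```python
def parse_fail_on(spec: str) -> set[str]:
    """Split a value such as '#7,#10' into a set of cluster ids."""
    return {item.strip() for item in spec.split(",") if item.strip()}


def exit_code(finding_paths: list[str], fail_on: set[str]) -> int:
    """Return 1 if any resolved path touches a blocking cluster, else 0."""
    for path in finding_paths:
        clusters = {step.strip() for step in path.split("→")}
        if clusters & fail_on:
            return 1
    return 0
```

Under these assumptions a finding resolved to `#2 → #7` blocks a build run with `--fail-on-cluster '#7,#10'`, while a plain `#2` finding does not.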

tlctc-pr-comment.md — excerpt

## TLCTC SAST Report
_Project: `demo` · TLCTC v2.1_

**Findings classified:** 6 | **Low-confidence:** 1 | **Total issues seen:** 7

### Cluster Exposure

| Cluster | Name | Count |
|---|---|---:|
| `#2`  | Exploiting Server   | 3 |
| `#3`  | Exploiting Client   | 1 |
| `#4`  | Identity Theft      | 1 |
| `#10` | Supply Chain Attack | 1 |

### `#2` Exploiting Server — 3 finding(s)

- **ISSUE-002** — `src/main/java/.../JspRenderer.java`
  - CWE-79 → `#2 → #7` (Allowed) — component path matched server glob '**/src/main/java/**'

### `#3` Exploiting Client — 1 finding(s)

- **ISSUE-003** — `src/main/webapp/.../CommentList.tsx`
  - CWE-79 → `#3` (Allowed) — component path matched client glob '**/*.tsx'

Note ISSUE-002 and ISSUE-003: the same CWE-79 resolves to #2 → #7 on the server file and #3 on the client file. That is R-ROLE doing its job, and it is the single biggest argument against hand-built CWE→cluster tables. Any tool that maps CWE-79 to a single cluster misclassifies every finding that sits on the other side of the server/client split.

R-ROLE: how server vs client gets resolved

Context dependency is configured in TOML (or in JSON on Python 3.10, which predates the stdlib tomllib module). Two glob lists, one default role, and an algorithm that checks the client globs first because they are typically more specific (a .tsx file inside a src/main/java/ tree is rare, but when it appears it should win):

tlctc-sonar.toml

[tlctc-sonar]
mapping_file = "../../mappings/mitre-cwe/tlctc-cwe.json"
default_role = "server"

[role.server]
globs = [
  "**/src/main/java/**", "**/server/**", "**/api/**",
  "**/handlers/**", "**/routes/**", "**/*Controller.java",
]

[role.client]
globs = [
  "**/src/main/webapp/**", "**/frontend/**",
  "**/*.tsx", "**/*.jsx", "**/*.vue",
]

Every classification carries its resolution trace into the report: which glob matched, which role was inferred, and which branch of the alternation was taken. If R-ROLE picks the wrong branch — and it sometimes will — the Markdown shows you exactly why, so you tune the globs instead of overriding individual findings.
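The client-first check and the resolution trace can be sketched in a few lines with `fnmatch`. This is an illustration under assumptions, not the integration's kernel: `infer_role` is a hypothetical name, the glob lists are shortened copies of the config above, and the trace wording simply imitates the report excerpt. Note that `fnmatch` lets `*` cross `/` and that a leading `**/` needs at least one directory in front of the match, so the real matcher may behave differently on top-level paths.

```python
from fnmatch import fnmatchcase

SERVER_GLOBS = ["**/src/main/java/**", "**/server/**", "**/api/**"]
CLIENT_GLOBS = ["**/src/main/webapp/**", "**/frontend/**", "**/*.tsx"]


def infer_role(path: str, client_globs: list[str], server_globs: list[str],
               default_role: str = "server") -> tuple[str, str]:
    """Return (role, trace); client globs are checked before server globs."""
    for role, globs in (("client", client_globs), ("server", server_globs)):
        for pattern in globs:
            # fnmatchcase avoids the OS-dependent case folding of fnmatch().
            if fnmatchcase(path, pattern):
                return role, f"component path matched {role} glob '{pattern}'"
    return default_role, f"no glob matched, default role '{default_role}'"
```

With these lists, `app/src/main/java/ui/Widget.tsx` resolves to the client role even though it also sits under a server glob, which is the precedence the paragraph above describes.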

Mapping landmarks (from the canonical file)

These are excerpts — the source of truth is mappings/mitre-cwe/tlctc-cwe.json (987 entries) and its decision tree. The integration consumes the file directly, so this table is documentation, not code.

CWE category	Sample CWE	Mapping	Verdict	Reading
Access control	CWE-285, CWE-639	#1	Allowed	Abuse of Functions — designed logic, not a coding flaw.
SQL injection	CWE-89	#2	Allowed	Exfiltration via the server's data layer. SQLi alone does not execute foreign code.
XSS	CWE-79	#2 → #7 | #3	Allowed (context-dependent)	Server-rendered XSS = `#2 → #7`; DOM XSS = `#3`. R-ROLE picks the branch.
Deserialization (RCE)	CWE-502	#2 → #7 | #3 → #7	Allowed (context-dependent)	Both branches end in `#7`: untrusted data resolves into executable form on either side.
Hardcoded credential	CWE-798	#4	Allowed	R-CRED — credential application is always `#4`, regardless of how the credential was obtained.
Memory error (role-dep.)	CWE-119, CWE-787	#2 | #3	Allowed (context-dependent)	One of the 229 R-ROLE entries. File path decides.
Supply chain (untrusted include)	CWE-829, CWE-494	#10 → #7	Allowed	R-SUPPLY — `#10` placed at the Trust Acceptance Event; the chained `→ #7` records execution.
Generic input validation	CWE-20	N/A	Discouraged	Umbrella CWE — root cause of multiple clusters depending on the abuse. Surfaces in the low-confidence section.

Get the integration

The pack lives at integrations/sonarqube/ in the TLCTC repository. Two tiers ship together:

The CLI sidecar — cli/, Python 3.11+ recommended (3.10 supported with a JSON config). Read-only by default; --apply-tags opts into write-back via /api/issues/set_tags.
The declarative starter — declarative/, zero Python. A tag-import CSV (with cluster descriptions and definition links), a quality profile that uplifts severity on R-EXEC / R-CRED / supply-chain CWEs, and a portfolio dashboard JSON. Use the two tiers together: declarative at platform setup, CLI in CI.
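For orientation, a write-back call to /api/issues/set_tags could be prepared as in the sketch below. It assumes the endpoint takes `issue` and `tags` form parameters and replaces the issue's whole tag set, which is why existing tags are merged in first; verify both points against your server's Web API documentation. `build_set_tags_request` is a hypothetical helper, and authentication headers are left out.

```python
import urllib.parse
import urllib.request


def build_set_tags_request(base_url: str, issue_key: str,
                           existing_tags: list[str], new_tag: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST that adds new_tag to an issue's tags."""
    # Merge rather than overwrite, assuming set_tags replaces the full tag list.
    tags = sorted(set(existing_tags) | {new_tag})
    body = urllib.parse.urlencode({"issue": issue_key, "tags": ",".join(tags)}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/issues/set_tags", data=body, method="POST"
    )
```

Building the request separately from sending it keeps the read-only default easy to enforce: nothing reaches the server unless the caller explicitly passes the request to `urllib.request.urlopen`.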

Try it offline

git clone https://github.com/Barnes70/TLCTC.git
cd TLCTC/integrations/sonarqube

# Validate the config + mapping integrity
python -m cli validate --config examples/tlctc-sonar.json

# Classify against the canned fixture (no SonarQube needed)
python -m cli classify \
    --config       examples/tlctc-sonar.json \
    --sonar-url    "file://$PWD/examples/sample-issues-response.json" \
    --token        x --project-key demo \
    --out-md       /tmp/report.md \
    --out-sarif    /tmp/report.sarif

# Run the 40 unit tests
python -m unittest discover tests

Then read deploy.md for the CI wiring and test-cases.md for the seven test cases (R-ROLE, R-EXEC, R-CRED, R-SUPPLY, Axiom III verdict filter, Axiom VI single-cluster, and the canonical-mapping sequence parser).

Conclusion

Three things changed between the draft of this article and the version you are reading. First, the integration moved from a Java sketch to a real Python sidecar that runs against both self-hosted SonarQube and SonarCloud. Second, the invented TLCTC-XX.YY ID space was dropped — the canonical mapping never used it and SAST findings have no lifecycle state to anchor an operational ID against. Third, the in-blog CWE table shrank to landmarks, because there is now exactly one source of truth: tlctc-cwe.json, 987 reviewed entries, verdict-tiered, R-ROLE-aware.

SAST output now translates — deterministically, with provenance, on every PR — into the same cluster vocabulary your CISO uses on the risk register. That is the whole point of TLCTC at the SAST boundary: not to add another scoring system on top of CVSS, but to make the engineering team's findings legible to the people accountable for cluster-level exposure.

References

Kreinz, B. TLCTC White Paper V2.1: cluster definitions, axioms, R-ROLE / R-EXEC / R-CRED / R-SUPPLY rules.
TLCTC SonarQube integration: pack, deploy runbook, test cases. github.com/Barnes70/TLCTC, integrations/sonarqube.
Canonical CWE → TLCTC mapping: 987 reviewed entries, audit history, verdict system. github.com/Barnes70/TLCTC, mappings/mitre-cwe.
CWE List, MITRE Corporation. cwe.mitre.org.
SARIF 2.1.0 specification, OASIS. docs.oasis-open.org/sarif/sarif/v2.1.0.