● Research Showcase

Hybrid Vulnerability Detection:
ML + Policy-as-Code

CodeBERT (ML) and Semgrep (PaC) are combined using score normalization and a minimum policy weight, then evaluated on the DiverseVul C/C++ dataset.

ML-Only (CodeBERT)

Probabilistic detection

Outputs a continuous confidence score learned from 260k+ labeled functions

Catches semantic patterns

Detects subtle issues such as timing side-channels or logic errors without explicit rules

Low precision on its own

Precision = 0.127, F1 = 0.186 on the test set; the policy channel is needed to improve signal quality

PaC-Only (Semgrep)

Deterministic rules

Every match is explicit and auditable; rules target top CWEs (CWE-119, 416, 476, 190…)

Higher precision, very low recall

Precision = 0.206 (above ML-only's 0.127) but recall = 0.033; rules fire on only a small fraction of the test set

Misses semantic bugs

Timing attacks, logic errors, and non-trivial use-after-free (UAF) patterns are invisible to rules
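The two cards above report different metric pairs (precision and F1 for ML-only, precision and recall for PaC-only), which makes them hard to compare directly. The sketch below fills in the missing figure for each channel from the standard F1 definition, F1 = 2PR / (P + R). The function names are illustrative, and the derived values are approximate because they are computed from the rounded numbers shown on this page, not from the underlying evaluation data.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 is not achievable at this precision")
    return f1 * precision / denom


# ML-only: precision and F1 are reported, so recall is derived.
ml_recall = recall_from_f1(precision=0.127, f1=0.186)

# PaC-only: precision and recall are reported, so F1 is derived.
pac_f1 = f1_score(precision=0.206, recall=0.033)

print(f"ML-only recall (derived): {ml_recall:.3f}")  # 0.347
print(f"PaC-only F1 (derived):    {pac_f1:.3f}")     # 0.057
```

Under these rounded inputs, ML-only recovers roughly a third of the vulnerable functions at low precision, while PaC-only has a much lower F1 because its recall is so small. The ML recall estimate is sensitive to rounding, since the denominator 2P - F1 is small.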

Hybrid Pipeline

Input Code: C/C++ function (DiverseVul)

→ ML (CodeBERT): P(vulnerable) → [0,1]

→ PaC (Semgrep): Findings → score [0,1]

→ Hybrid Fusion: α·norm(ML) + β·norm(PaC)

→ Decision: Approve / Review / Block

Tuned Config

α (ML weight) 0.75

β (PaC weight) 0.25

t_block 0.40

t_review 0.10

min_pac_weight 0.20

Both scores are normalized to [0,1] using the validation-set min/max before fusion, which puts the two channels on a comparable scale.
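The normalization, fusion, and decision steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the names `HybridConfig`, `min_max_normalize`, `hybrid_risk`, and `decide` are hypothetical, out-of-range scores are assumed to be clipped, and the thresholds are assumed to be inclusive lower bounds. `min_pac_weight` is left out because this page does not say how it is applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridConfig:
    """Tuned values listed above; field names are illustrative."""
    alpha: float = 0.75     # ML weight
    beta: float = 0.25      # PaC weight
    t_block: float = 0.40   # risk at or above this -> Block (assumed inclusive)
    t_review: float = 0.10  # risk at or above this -> Review (assumed inclusive)


def min_max_normalize(raw: float, lo: float, hi: float) -> float:
    """Scale a raw score into [0, 1] using validation-set min (lo) and max (hi).

    Values outside [lo, hi] are clipped (an assumption).
    """
    if hi <= lo:
        return 0.0
    return min(1.0, max(0.0, (raw - lo) / (hi - lo)))


def hybrid_risk(ml_norm: float, pac_norm: float,
                cfg: HybridConfig = HybridConfig()) -> float:
    """Weighted sum: alpha * norm(ML) + beta * norm(PaC)."""
    return cfg.alpha * ml_norm + cfg.beta * pac_norm


def decide(risk: float, cfg: HybridConfig = HybridConfig()) -> str:
    """Map a fused risk score to Approve / Review / Block."""
    if risk >= cfg.t_block:
        return "Block"
    if risk >= cfg.t_review:
        return "Review"
    return "Approve"


# Hypothetical raw scores and validation ranges, for illustration only.
ml_norm = min_max_normalize(raw=0.62, lo=0.05, hi=0.95)
pac_norm = min_max_normalize(raw=0.0, lo=0.0, hi=1.0)
print(decide(hybrid_risk(ml_norm, pac_norm)))
```

Under this reading of the formula, a maximal PaC score with no ML signal gives a risk of 0.25, which lands in Review and not Block, while a maximal ML score alone (0.75) is enough to Block.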

Code Scenarios — Where Hybrid Wins

All Scenarios at a Glance

Illustrative scores derived from real Phase 3 config (α=0.75, β=0.25, t_block=0.4, t_review=0.1, normalization applied).

Scenario	CWE	ML raw	ML decision	PaC score	PaC decision	Hybrid risk	Hybrid decision	Story

Live Analysis — Try It

API URL: enter a local API URL to enable live analysis. Run the API from the repo with: USE_REAL_ML=1 uvicorn api.main:app --reload

Note: Live analysis with real ML only works when using a local URL (e.g. http://localhost:8000) or a GPU-backed deployment. Other URLs will use heuristic ML only.

Accepted files: .c, .cpp, .h, .hpp; up to 20 files, 512 KB each. Folders are also accepted.
