IoT Honeypot Digital Twin with Adaptive AI Defenses
1st Given Name Surname
dept. name of organization (of Aff.)
name of organization (of Aff.)
City, Country
email address
Abstract—We present a comprehensive methodology for building a Digital Twin of an Internet-of-Things (IoT) environment that
integrates high-interaction honeypots, streaming telemetry, and
adaptive AI/ML defenses. The twin continuously mirrors network
and device states, attracts adversaries via deception, learns
attacker tactics, and recommends safe mitigations. Core components include (i) graph- and sequence-based anomaly detection,
(ii) attacker stage/intent prediction aligned to MITRE ATT&CK,
(iii) a cost-aware mitigation and deception planner with safety
constraints, and (iv) what-if simulation inside the twin prior to
enforcement. We detail the architecture, data model, algorithms,
evaluation metrics, and a reproducible implementation blueprint
suitable for academic and applied research.
[Fig. 1 block labels: Real IoT Env; Honeypots (CCTV/MQTT/etc.); Telemetry Ingest (pcap/Zeek/MQTT logs)]
I. SCOPE & THREAT MODEL
Environment. Small to medium IoT deployment: IP cameras, smart plugs, sensor hubs (MQTT/CoAP/HTTP), a gateway, DNS/DHCP, cloud APIs.
Adversary. External botnet (e.g., Mirai-family), opportunistic
scanners, credential stuffers, and rogue internal nodes.
Goals. (G1) Early detection; (G2) Attacker engagement &
intelligence via honeypots; (G3) Risk-limiting mitigations with
minimal disruption; (G4) Safe policy trials in the twin before
live changes.
II. SYSTEM OVERVIEW
Fig. 1 depicts the pipeline: Real Env ↔ Digital Twin
synchronization; honeypots; ingest; feature store; detection;
intent prediction; planner; enforcement hooks; and explainability/audit.
III. DATA MODEL & FEATURES
A. Telemetry Sources
• Network: pcap/NetFlow/IPFIX, Zeek/Suricata logs (flows, DNS, HTTP).
• Protocol-aware: MQTT (topic, QoS, retained flags, payload length stats), CoAP/HTTP summaries.
• Host (where feasible): CPU/mem, process list deltas.
• Honeypot events: connection attempts, credentials tried, commands, binaries dropped, TTP signatures.
Fig. 1. Architecture of the IoT Honeypot Digital Twin with Adaptive AI Defenses.
B. Feature Engineering (window Δ = 60 s)
Per-device: bytes/packets in/out, unique peers, port entropy, inter-arrival CV, SYN/FIN/RST ratios, DNS query rate, NXDOMAIN fraction.
Protocol: MQTT pub/sub rates, topic fan-out, retained count, QoS distribution, payload size stats, actuator command frequency.
Graph/topology: degree drift, new edge types, shortest path to WAN, broker betweenness.
Threat intel: hits on bad IP/domain feeds; CVE exposure from CPE strings.
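As a rough illustration of the per-device features, the sketch below computes port entropy, inter-arrival CV, and NXDOMAIN fraction for one window using only the Python standard library. The record layout (keys `ts`, `peer`, `dport`, `bytes_out`) and the function names are assumptions made for this example, not a schema defined in this paper.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def port_entropy(dst_ports):
    """Shannon entropy (bits) of the destination-port distribution."""
    counts = Counter(dst_ports)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def interarrival_cv(timestamps):
    """Coefficient of variation (std / mean) of inter-arrival gaps."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0
    return pstdev(gaps) / mean(gaps)

def nxdomain_fraction(dns_rcodes):
    """Share of DNS responses whose rcode is NXDOMAIN."""
    if not dns_rcodes:
        return 0.0
    return sum(1 for r in dns_rcodes if r == "NXDOMAIN") / len(dns_rcodes)

def window_features(flows, dns_rcodes):
    """Feature vector for one device and one window.

    flows: list of dicts with hypothetical keys ts, peer, dport, bytes_out.
    """
    return {
        "bytes_out": sum(f["bytes_out"] for f in flows),
        "unique_peers": len({f["peer"] for f in flows}),
        "port_entropy": port_entropy([f["dport"] for f in flows]),
        "interarrival_cv": interarrival_cv([f["ts"] for f in flows]),
        "nxdomain_fraction": nxdomain_fraction(dns_rcodes),
    }

# Example window: a device probing four distinct ports on two peers.
example = window_features(
    [
        {"ts": 0.0, "peer": "10.0.0.5", "dport": 23, "bytes_out": 60},
        {"ts": 1.0, "peer": "10.0.0.5", "dport": 2323, "bytes_out": 60},
        {"ts": 2.0, "peer": "10.0.0.6", "dport": 80, "bytes_out": 60},
        {"ts": 3.0, "peer": "10.0.0.6", "dport": 8080, "bytes_out": 60},
    ],
    ["NOERROR", "NXDOMAIN"],
)
```

A scanning device tends to show high port entropy together with a low inter-arrival CV (regular probing), which is why the two features are useful in combination.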
IV. DIGITAL TWIN MECHANICS
State Mirror. The twin maintains a typed graph $G_t = (V_t, E_t)$ of devices, services, brokers, and honeypots with attributes (firmware, zones, ACLs).
Sync. Streaming ETL maps live telemetry → twin state. Drift
detectors flag inconsistencies.
What-if. Policies (ACLs, VLANs, deception moves) are trialed in the twin; only safe, reversible actions are auto-enforced.
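A minimal sketch of the state mirror, drift check, and what-if trial follows. The `Twin` class, the edge representation `(src, dst, service)`, and the helpers `what_if_block` and `drift` are illustrative assumptions; a real twin would carry richer attributes and policy semantics.

```python
import copy
from collections import deque

class Twin:
    """Toy typed graph: nodes carry attributes, edges carry a service label."""

    def __init__(self):
        self.nodes = {}     # name -> attribute dict
        self.edges = set()  # (src, dst, service)

    def add_node(self, name, **attrs):
        self.nodes[name] = attrs

    def add_edge(self, src, dst, service):
        self.edges.add((src, dst, service))

    def reachable(self, start):
        """Nodes reachable from start over directed edges (BFS)."""
        seen, queue = {start}, deque([start])
        while queue:
            cur = queue.popleft()
            for src, dst, _ in self.edges:
                if src == cur and dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
        return seen

def what_if_block(twin, src, dst):
    """Return a copy of the twin with every src->dst edge removed (an ACL deny)."""
    trial = copy.deepcopy(twin)
    trial.edges = {e for e in trial.edges if not (e[0] == src and e[1] == dst)}
    return trial

def drift(twin, observed_edges):
    """Edges seen in live telemetry that the twin does not contain."""
    return set(observed_edges) - twin.edges

twin = Twin()
for name, kind in [("wan", "external"), ("gateway", "gateway"),
                   ("camera", "device"), ("broker", "service")]:
    twin.add_node(name, kind=kind)
twin.add_edge("wan", "gateway", "https")
twin.add_edge("gateway", "camera", "rtsp")
twin.add_edge("gateway", "broker", "mqtt")

# Trial the policy on a copy; the mirrored state itself is left untouched.
trial = what_if_block(twin, "gateway", "camera")
camera_exposed_after = "camera" in trial.reachable("wan")
new_edges = drift(twin, [("camera", "wan", "telnet")])
```

Working on a deep copy is what makes the trial reversible by construction: discarding `trial` undoes the policy.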
V. HONEYPOT & DECEPTION DESIGN
Honeypots. Mix of low-, medium-, and high-interaction:
Telnet/SSH (credential capture), HTTP (fake camera panels),
MQTT brokers/topics, and ICS-like services when relevant.
Attraction. Service/port shuffling, realistic banners, seeded
credentials, and shadow assets in ARP/DNS.
Adaptive Deception. Based on attacker stage/intent, adjust
honeypot fidelity, reveal decoy data, or shift services to increase engagement while protecting production assets.
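One simple way to drive adaptive deception is a rule table keyed by the most likely attacker stage. The mapping below is purely hypothetical (the stage-to-move assignments, the move names, and the confidence threshold are all assumptions for illustration); it only shows the shape such a policy could take.

```python
# Hypothetical stage -> deception move table; not prescribed by this paper.
DECEPTION_POLICY = {
    "RECON": "shuffle_services",
    "INITIAL_ACCESS": "raise_fidelity",
    "EXEC": "raise_fidelity",
    "PERSISTENCE": "reveal_decoy_data",
    "C2": "reveal_decoy_data",
    "IMPACT": "hold",
}

def choose_deception(stage_posterior, min_confidence=0.5):
    """Pick a deception move from a dict of stage -> probability.

    Returns "hold" (no change) when the top stage is not confident enough.
    """
    stage, prob = max(stage_posterior.items(), key=lambda kv: kv[1])
    if prob < min_confidence:
        return "hold"
    return DECEPTION_POLICY[stage]

move = choose_deception({"RECON": 0.7, "INITIAL_ACCESS": 0.2, "EXEC": 0.1})
```

Falling back to "hold" under low confidence keeps the honeypot from reshuffling on noisy predictions, which could otherwise tip off an attacker.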
VI. AI/ML COMPONENTS
A. Unsupervised Anomaly Detection
Sequence forecaster. A GRU/Transformer predicts the next-window vector $\hat{x}_{t+1}$; anomaly score $a^{\mathrm{seq}}_t = \lVert x_{t+1} - \hat{x}_{t+1} \rVert_1$.
Autoencoder. Role-specific AEs per device class (camera/sensor/actuator) with reconstruction loss $a^{\mathrm{ae}}_t = \lVert x_t - \tilde{x}_t \rVert_2$.
Graph anomaly. A GNN reconstructs edges/attributes; score $a^{\mathrm{g}}_t$ from adjacency/feature reconstruction error.
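The three scores reduce to norms of residuals once a model has produced its prediction or reconstruction. The sketch below assumes NumPy and uses a persistence forecast (predict the last window) as a stand-in for the GRU/Transformer; the Frobenius norm for the graph score is likewise an assumption, since the text does not fix the norm.

```python
import numpy as np

def seq_score(x_next, x_pred):
    """a^seq_t: L1 norm of the forecast residual x_{t+1} - x_hat_{t+1}."""
    return float(np.abs(np.asarray(x_next, float) - np.asarray(x_pred, float)).sum())

def ae_score(x, x_recon):
    """a^ae_t: L2 norm of the reconstruction residual x_t - x_tilde_t."""
    return float(np.linalg.norm(np.asarray(x, float) - np.asarray(x_recon, float)))

def graph_score(adj, adj_recon):
    """a^g_t: Frobenius norm of the adjacency reconstruction error (assumed norm)."""
    diff = np.asarray(adj, float) - np.asarray(adj_recon, float)
    return float(np.linalg.norm(diff, ord="fro"))

def persistence_forecast(history):
    """Stand-in forecaster (assumption): the next window equals the last one."""
    return np.asarray(history[-1], float)

history = [[10.0, 2.0, 0.1], [11.0, 2.0, 0.1]]
x_next = [40.0, 9.0, 0.1]  # sudden jump in the first two features
score = seq_score(x_next, persistence_forecast(history))
```

In practice each raw score would be normalised per device class before fusion, because feature scales differ between cameras, sensors, and actuators.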
B. Supervised Signatures (optional)
Lightweight XGBoost/MLP for known families (scan, brute
force, DDoS, MQTT abuse), trained on public IoT datasets and
twin-generated traces.
C. Stage/Intent Prediction
We model attacker progression as stages $s_t \in \{\textsc{Recon}, \textsc{Initial Access}, \textsc{Exec}, \textsc{Persistence}, \textsc{C2}, \textsc{Impact}\}$ aligned with ATT&CK. A sequence model $f_\theta$ maps event windows to stage posteriors:
$$p(s_t \mid h_{1:t}) = f_\theta(h_{1:t}), \tag{1}$$
where $h_t$ concatenates detection scores, honeypot events, and graph features.
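To show what a stage posterior $p(s_t \mid h_{1:t})$ looks like in practice, the sketch below swaps the learned sequence model $f_\theta$ for a hand-specified hidden Markov forward filter. This is a deliberately simpler technique than the neural model described here: the left-to-right transition matrix, the `stay` probability, and the per-stage likelihoods are all assumptions for illustration.

```python
import numpy as np

STAGES = ["RECON", "INITIAL_ACCESS", "EXEC", "PERSISTENCE", "C2", "IMPACT"]

def make_transition(stay=0.7):
    """Left-to-right transition matrix: stay in a stage or advance by one."""
    n = len(STAGES)
    trans = np.zeros((n, n))
    for i in range(n - 1):
        trans[i, i] = stay
        trans[i, i + 1] = 1.0 - stay
    trans[-1, -1] = 1.0  # final stage is absorbing
    return trans

def forward_filter(likelihoods, trans, prior=None):
    """Return the filtered posterior p(s_t | h_{1:t}) for every step t.

    likelihoods: array (steps, stages); entry [t, s] is p(h_t | s_t = s),
    assumed strictly positive for at least one stage per step.
    """
    lik = np.asarray(likelihoods, dtype=float)
    n = lik.shape[1]
    belief = np.full(n, 1.0 / n) if prior is None else np.asarray(prior, float)
    posteriors = []
    for t, row in enumerate(lik):
        if t > 0:
            belief = belief @ trans  # predict one step ahead
        belief = belief * row        # weight by the evidence at step t
        belief = belief / belief.sum()
        posteriors.append(belief.copy())
    return np.array(posteriors)

# Two windows: scan-like evidence, then evidence of a successful login.
evidence = [
    [0.90, 0.02, 0.02, 0.02, 0.02, 0.02],
    [0.10, 0.80, 0.025, 0.025, 0.025, 0.025],
]
posteriors = forward_filter(evidence, make_transition())
most_likely = [STAGES[i] for i in posteriors.argmax(axis=1)]
```

A learned model replaces both the transition matrix and the likelihoods with parameters fitted from labelled honeypot sessions, but its output has the same form: one probability vector over stages per window.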
D. Calibration & Fusion
Calibrate scores via temperature scaling to obtain probabilities $\hat{p}_k$. Fuse via logistic meta-model or Bayesian averaging:
$$P_t(\mathrm{comp}) = \sigma\Big(\beta_0 + \sum_k \beta_k \hat{p}_k + \gamma^\top c_t\Big), \tag{2}$$
with context $c_t$ (exposure, controls, criticality).
VII. RISK & UTILITY
Define instantaneous device risk:
$$R_t(d) = P_t(\mathrm{comp} \mid x_t) \cdot I(d) \cdot E_t(d) \cdot (1 - C_t(d)), \tag{3}$$
where $I$ is impact, $E_t$ exposure, and $C_t$ current control effectiveness. For an action $a$ on set $S$, define utility:
$$U_t(a) = \sum_{d \in S} \big(R_t(d) - R_t^{(a)}(d)\big) - \lambda \cdot \mathrm{Cost}_t(a). \tag{4}$$
VIII. MITIGATION & DECEPTION PLANNER
Action set: isolate/quarantine, micro-segment (SDN/ACL), throttle, rotate creds, block IP/domain, MQTT ACL edits, disable feature, increase deception fidelity.
Safety. Hard constraints prohibit unsafe actions on critical assets; only reversible actions are auto-enforced.
Algorithm 1 Online Defense Loop with Twin-Gated Actions
1: Input: Streams $S$, twin $G_t$, detectors $D$, planner $\Pi$, safety constraints $\Omega$
2: while true do
3:   Ingest window; update features and $G_t$
4:   Compute anomaly scores $\{a_k\}$ and stage posteriors $p(s_t)$
5:   Fuse to get $P_t(\mathrm{comp})$ and compute $R_t(d)$ for all $d$
6:   Candidate actions $A \leftarrow \Pi.\textsc{Propose}(R_t, p(s_t), G_t)$
7:   for $a \in A$ do
8:     Simulate $a$ in twin $\to R_t^{(a)}$, check constraints $\Omega$
9:     Evaluate $U_t(a)$; discard if constraint-violating or $U_t(a) \le 0$
10:  end for
11:  Apply top-$K$ reversible actions; queue others for human approval
12:  Log decisions, attributions, and outcomes (for offline learning)
13: end while
IX. EXPERIMENTAL DESIGN
A. Testbeds
Physical lab: a few COTS devices (IP camera, smart plug, sensor hub). Emulation: containers/VMs to scale endpoints and services. Honeypots: Telnet/SSH/HTTP/RTSP, MQTT decoys, optional ICS decoy.
B. Scenarios
1) Benign diurnal usage, firmware updates, background noise.
2) Mirai-like scan & brute-force leading to C2 beacons and DDoS.
3) MQTT abuse: credential stuffing, topic hijack, retain-bomb, replay.
4) Exfiltration: DNS tunneling or periodic HTTP beacons.
5) Low-and-slow lateral exploration (new edges, community drift).
C. Datasets (training/augmentation)
Combine public IoT intrusion datasets with twin-generated traces; split temporally for evaluation. (Examples: IoT intrusion corpora with flows/MQTT, plus your honeypot logs.)
X. METRICS
Detection. AUROC, AUPRC, F1@FAR=1%, Mean Time To Detect (MTTD).
Calibration. Brier score, Expected Calibration Error (ECE).
Deception effectiveness. Diversion rate (% attacks engaging honeypot), dwell time on decoys, intel yield (unique creds/binaries/TTPs).
Planning benefit. Risk reduction $\Delta R$, Mean Time To Contain (MTTC), service disruption minutes, false isolation rate.
Twin fidelity. Distributional similarity between live and twin (e.g., MMD or KL over flow features).
Overhead. Latency added by policies, CPU/memory footprint.
XI. IMPLEMENTATION BLUEPRINT
Ingest. Zeek/Suricata → Kafka; MQTT broker logs → Kafka.
Storage. Timeseries (TimescaleDB/InfluxDB); Graph (Neo4j); pcaps in object store.
Models. PyTorch/Lightning; MLflow for experiments; SHAP for attributions.
Twin/Orchestration. Docker Compose or Kubernetes (with Calico for network policy); SDN via Open vSwitch/Ryu for micro-segmentation.
Honeypots. Mix of low/high-interaction services (Telnet/SSH/HTTP/RTSP/MQTT) instrumented to emit rich events.
Dashboard. Grafana for telemetry; custom web UI for risk heatmaps, stage timelines, and what-if controls.
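To make the blueprint more concrete, the following sketch strings together the fusion rule (2), the device risk (3), the action utility (4), and the filter-and-rank steps of Algorithm 1 in plain Python. All weights, costs, and simulated post-action risks are made-up illustrative values, and the `Action` type, the `simulate` and `is_safe` callbacks, and the function names are assumptions rather than a fixed interface.

```python
import math
from dataclasses import dataclass

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def temperature_scale(logit, temperature):
    """Calibrated probability from a raw detector logit (temperature > 0)."""
    return sigmoid(logit / temperature)

def fuse(p_hat, beta0, beta, gamma, context):
    """Eq. (2): logistic meta-model over calibrated scores and context c_t."""
    z = beta0 + sum(b * p for b, p in zip(beta, p_hat))
    z += sum(g * c for g, c in zip(gamma, context))
    return sigmoid(z)

def risk(p_comp, impact, exposure, control):
    """Eq. (3): R_t(d) = P_t(comp) * I(d) * E_t(d) * (1 - C_t(d))."""
    return p_comp * impact * exposure * (1.0 - control)

@dataclass(frozen=True)
class Action:
    name: str
    targets: tuple  # devices in the set S the action touches
    cost: float
    reversible: bool

def utility(action, risk_now, risk_after, lam):
    """Eq. (4): summed risk reduction over S minus lam * cost."""
    gain = sum(risk_now[d] - risk_after[d] for d in action.targets)
    return gain - lam * action.cost

def select_actions(actions, risk_now, simulate, is_safe, lam=1.0, k=1):
    """Simulate each action, drop unsafe or non-beneficial ones, rank the rest.

    Returns (auto, queued): top-k reversible actions to apply automatically
    and the remaining beneficial actions, which wait for human approval.
    """
    scored = []
    for action in actions:
        risk_after = simulate(action)  # what-if run inside the twin
        if not is_safe(action, risk_after):
            continue
        u = utility(action, risk_now, risk_after, lam)
        if u > 0:
            scored.append((u, action))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    auto = [a for _, a in scored if a.reversible][:k]
    queued = [a for _, a in scored if a not in auto]
    return auto, queued

# Illustrative run for a single camera with made-up numbers.
risk_now = {"cam1": risk(p_comp=0.8, impact=1.0, exposure=0.5, control=0.0)}
actions = [
    Action("quarantine", ("cam1",), cost=0.1, reversible=True),
    Action("factory_reset", ("cam1",), cost=0.2, reversible=False),
    Action("throttle", ("cam1",), cost=0.5, reversible=True),
]
simulated = {  # hypothetical post-action risks returned by the twin
    "quarantine": {"cam1": 0.04},
    "factory_reset": {"cam1": 0.0},
    "throttle": {"cam1": 0.3},
}
auto, queued = select_actions(
    actions, risk_now,
    simulate=lambda a: simulated[a.name],
    is_safe=lambda a, after: True,
)
```

In this run the throttle action is discarded because its cost exceeds the risk it removes, the quarantine is applied automatically, and the irreversible factory reset is held for an analyst.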
XII. EVALUATION PROTOCOL
Splits. Train/val/test by time to avoid leakage.
Baselines. Thresholding on z-scores; classical IDS signatures
only; unsupervised-only (no fusion); planner off (alert-only).
Ablations. −Graph features; −Protocol features; −Honeypot
signals; −Calibration; −Twin gating.
Robustness. Packet loss up to 20%, clock skew, partial visibility (dark assets).
User Study. Analysts triage with/without twin-assisted recommendations; measure triage time and accuracy.
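The time-based split and the two calibration metrics can be sketched with NumPy as follows. The split fractions, the equal-width binning, and the function names are assumptions for illustration; the ECE shown is the binary variant computed on the predicted positive-class probability.

```python
import numpy as np

def temporal_split(timestamps, train_frac=0.6, val_frac=0.2):
    """Index arrays for train/val/test, ordered by time to avoid leakage."""
    order = np.argsort(np.asarray(timestamps), kind="stable")
    n = len(order)
    i = int(n * train_frac)
    j = i + int(n * val_frac)
    return order[:i], order[i:j], order[j:]

def brier_score(p, y):
    """Mean squared difference between predicted probability and label."""
    p, y = np.asarray(p, float), np.asarray(y, float)
    return float(np.mean((p - y) ** 2))

def expected_calibration_error(p, y, n_bins=10):
    """Binary ECE: bin-weighted gap between mean prediction and positive rate."""
    p, y = np.asarray(p, float), np.asarray(y, float)
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(y[mask].mean() - p[mask].mean())
    return float(ece)

timestamps = [9, 0, 8, 1, 7, 2, 6, 3, 5, 4]
train_idx, val_idx, test_idx = temporal_split(timestamps)

probs = [0.9, 0.9, 0.1, 0.1]
labels = [1, 1, 0, 0]
brier = brier_score(probs, labels)
ece = expected_calibration_error(probs, labels)
```

Because the split sorts by timestamp first, every training sample precedes every validation sample, which in turn precedes every test sample, regardless of the order in which records were stored.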
XIII. ETHICS, SAFETY, & LIMITATIONS
Limit outbound attack propagation (egress filtering), clearly
segregate honeypots, and never store sensitive payloads in the
clear. Ensure deception does not disrupt essential services.
Limitations include emulation gaps, dataset bias, and transferability to large-scale deployments.
XIV. REPRODUCIBILITY CHECKLIST
• IaC scripts for all components; fixed seeds; pinned versions.
• Released configs for scenarios and attack replays.
• Logged decisions and outcomes for offline analysis.
• Open-sourced notebooks for metric computation and plots.
XV. MILESTONES (12 WEEKS)
1) W1–2: Ingest + feature store + basic honeypots.
2) W3–4: Seq/AE detectors; calibration.
3) W5–6: Graph anomaly; stage/intent model.
4) W7–8: Planner + twin what-if + safety checks.
5) W9–10: Scenarios, baselines, ablations.
6) W11–12: User study, paper writing, artifact packaging.
ACKNOWLEDGMENTS
(Optional) Acknowledge lab resources and collaborators.