Catching Shadow AI in the Network: DNS, Egress, and Browser Telemetry with Wazuh • Nadim Saliby

The previous post in this series — Detecting Rogue MCP Servers and Shadow AI Agents on Endpoints with Wazuh — covered the agentic-coding side of shadow AI: rogue MCP servers, indirect prompt injection, the node *-mcp.js process tree on a developer’s laptop. That detection pack catches the IDE side.

It doesn’t catch the browser side, and the browser side is where most of the actual shadow AI is. ChatGPT in a tab. Claude.ai bookmarked. A Chrome extension that “summarizes the page” and quietly POSTs every page body to an LLM provider. A Firefox extension that injects a sidebar with https://*.anthropic.com/* in its host_permissions. A native messaging host that bridges a browser extension to a local helper binary. None of those leave a fingerprint on the IDE process tree.

What they do leave is a fingerprint on the network: a DNS query for api.openai.com, a TLS connection to 160.79.104.10:443 from a chromium (or python3, or curl) process, a new file written under ~/.config/google-chrome/Default/Extensions/, a JSON manifest with host_permissions: ["https://api.openai.com/*"]. Each of those is observable from a single Wazuh agent on the endpoint, with no kernel module, no eBPF, and no traffic mirror.

I built a Wazuh rule pack and a reproducible Docker lab that operationalize that observation: four custom decoders, 15 custom rules, and a one-shot docker compose up lab that captures 185 alerts across all 15 rule IDs when you fire the attack scenarios end to end. Everything is in a public GitHub repository. Clone it, drop the rules in your manager, fire the triggers, watch alerts land.

This post is a technical deep-dive into the threat model, the lab architecture, the decoder + rule design, and the live validation results — including what the red-team agent broke and what the blue-team agent flagged before this shipped.

The Problem

The previous post’s threat model was: what does an MCP-aware coding agent do on the endpoint? This post’s threat model is: what does a developer’s browser do on the endpoint? — because in 2026, almost all of them are talking to an LLM provider, almost none of those endpoints have any policy declaring the activity is approved, and the perimeter logs are usually too coarse to attribute the traffic back to a specific extension or a specific tab.

Concretely:

Browser extension installs are user-initiated and silent. Chrome’s chrome.google.com/webstore, Firefox’s addons.mozilla.org, Edge Add-ons, Brave add-ons, sideloaded ZIPs from a GitHub release: the Chromium-family browsers write a manifest under ~/.config/<browser>/Default/Extensions/<id>/<version>/manifest.json, and Firefox drops the add-on under ~/.mozilla/firefox/.../extensions/. There is no central registry of installed extensions per fleet, no review queue inside the corporate MDM, and nothing in the browser surface that says “this extension just gained host_permissions for an LLM provider.”
Native messaging hosts are persistence. A native messaging host JSON under ~/.config/google-chrome/NativeMessagingHosts/<name>.json lets an extension talk to a local binary outside the browser sandbox. Real password managers and corporate SSO clients use the mechanism legitimately; so do exfil bridges that keep a TLS connection to an LLM provider open across browser restarts.
DNS is the cheapest tell. Every browser tab that loads chat.openai.com issues a DNS query. Every extension that calls fetch('https://api.openai.com/v1/chat/completions') issues a DNS query. Endpoints that resolve cloudflare-dns.com or dns.google are usually about to issue a DNS-over-HTTPS request that bypasses this whole pipeline — and that lookup itself is a useful signal.
Egress is the second-cheapest tell, with process attribution. ss -tnp snapshots the established TCP table, attributes each socket to the owning process by comm, and surfaces “developer’s python3 script just opened a TLS session to 162.159.140.245” the same way it surfaces “developer’s chromium did”. For shadow-AI fleet inventory the difference between “this is the browser” and “this is a CLI script” is the difference between normal and paging.
The blast radius scales with the org. A single extension picked up from a Discord recommendation and installed once on a senior engineer’s laptop lands on the machine with access to the most source code in the org.

Stock Wazuh has none of this. There is no dnsmasq decoder shipped, no ss periodic snapshot ingester that survives the Wazuh 4.10 <full_command> framing change, no rule for “Chrome extension declared LLM host_permissions”, no chain rule for “extension install + LLM provider DNS within 5 min”.

This post fixes that.

Threat Model

I categorize the network-side shadow-AI surface into four classes. The detections in this post target all four.

Class	What it looks like	Detection goal
A — Direct LLM API egress	A developer pastes `curl https://api.openai.com/v1/chat/completions` into a terminal; or a CI script POSTs source code to Claude for review	DNS lookup for `api.<provider>.com`, then TLS connect to a known provider IP from a non-browser/non-IDE process
B — Browser tab use	Developer has `claude.ai` / `chat.openai.com` / `gemini.google.com` open in a tab and pastes proprietary code in for “explain this for me”	DNS lookups + TLS connects from `chromium` / `firefox`. Single-tab activity is informational (fleet inventory); a burst across multiple providers in 60 s is paging
C — Browser extension install	Developer installs an extension that declares `host_permissions: ["https://api.openai.com/*", ...]` and quietly POSTs page bodies to an LLM provider	FIM new file under `~/.config/<browser>/Extensions/`, browser-driver event with the manifest payload, native-messaging host with an AI-themed name
D — DoH bypass attempt	Endpoint or extension explicitly tries to use DNS-over-HTTPS so the local resolver never sees the LLM provider lookup	DNS lookup for `cloudflare-dns.com`, `nextdns.io`, `dns.google`, etc.

The headline detection — what makes the pack worth shipping — is the chain rule that joins C → A/B: a browser extension manifest is written, then within five minutes a DNS lookup for an LLM provider lands. Each event is innocuous in isolation. The join is the activation footprint of a freshly installed shadow-AI extension.

Architecture: One Compose, Two Containers

The repository ships a two-container Docker Compose stack. One docker compose up -d --build and you have:

                ┌─────────────────────────────────────┐
                │   waznet-manager                    │
                │   wazuh/wazuh-manager:4.10.0        │
                │   • bind-mounts                     │
                │     wazuh/decoders/                 │
                │     wazuh/rules/                    │
                │   • authd password enabled          │
                │   • alerts → /var/ossec/logs        │
                └────────────┬────────────────────────┘
                             │ 1514/udp (events)
                             │ 1515/tcp (authd)
                             ▼
                ┌─────────────────────────────────────┐
                │   waznet-endpoint                   │
                │   ubuntu:22.04 + Wazuh agent 4.10.0 │
                │                                     │
                │   • dnsmasq on 127.0.0.1:53         │
                │     query log → dns-queries.log     │
                │   • egress-watcher.sh (ss -tnp,     │
                │     3-second snapshot daemon)       │
                │   • listener-watcher.sh (ss -lntp)  │
                │   • browser-driver.py (Python —     │
                │     replaces Chrome/Firefox)        │
                │   • 6 attack trigger scripts        │
                └─────────────────────────────────────┘

Both containers come up clean from a docker compose down -v. The endpoint’s agent self-enrolls against the bundled manager via agent-auth against authd on port 1515, password-based (wazuh/manager/authd.pass). To deploy against an existing manager, set WAZUH_MANAGER and only build the endpoint service.

There’s no GUI Chromium or Firefox in the container — those are GUI applications that don’t run cleanly headless. I drive the browser-side activity with a Python browser-driver that uses prctl(PR_SET_NAME, "chromium") to set its kernel comm to whatever browser name the scenario claims to be, then performs the same observable behaviors a real browser tab would: DNS lookups via socket.gethostbyname, real TLS connects via the ssl module that hold open long enough for the egress watcher to snapshot them, manifest writes under the real Chrome and Firefox extension paths, native messaging host registrations under the well-known directories. The detection logic targets the symptoms (process comm, dnsmasq query log, on-disk manifest content), not the specific browser version. A real Chrome talking to api.openai.com produces the same evidence the driver does.

The endpoint is deliberately not run as --privileged. It does need CAP_SYS_PTRACE so ss -tnp can attribute sockets to the owning process — without it, the users:(("comm",pid=...)) column comes up empty and every process-attribution rule silently fails. That is the kind of trap that bites in default-namespaced Docker on a real deployment, and it is documented in the lab’s compose file with the rationale inline so nobody else has to debug it from scratch.
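A quick preflight for this trap is to inspect the effective capability mask before trusting an empty users:(...) column. The sketch below is not part of the lab: has_capability and current_cap_eff are hypothetical helper names, and the only facts it relies on are the CapEff line in /proc/self/status and CAP_SYS_PTRACE being capability number 19 on Linux.

```python
CAP_SYS_PTRACE = 19  # capability number from <linux/capability.h>


def has_capability(cap_eff_hex: str, cap: int) -> bool:
    """Return True if bit `cap` is set in a CapEff hex mask."""
    return bool(int(cap_eff_hex, 16) & (1 << cap))


def current_cap_eff(status_path: str = "/proc/self/status") -> str:
    """Read the effective capability mask of the current process (Linux only)."""
    with open(status_path) as fh:
        for line in fh:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff line not found")


# Example masks: the first lacks bit 19, the second has it set.
print(has_capability("00000000a80425fb", CAP_SYS_PTRACE))
print(has_capability("00000000a80c25fb", CAP_SYS_PTRACE))
```

Running has_capability(current_cap_eff(), CAP_SYS_PTRACE) inside the endpoint container before starting the watchers would turn a silent attribution failure into a loud one.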

Detection Vectors: Four Sources, Four Decoders

Four Wazuh data sources feed the rule pack, plus stock FIM. Each gets its own decoder, with a strict naming namespace so its fields never collide with stock decoders.

Source	Decoder	What it sees
`syslog /var/log/lab/dns-queries.log`	`dnsmasq` + `dnsmasq-query` + `dnsmasq-reply`	Every DNS query and every reply through the local resolver
`syslog /var/log/lab/browser-driver.log`	`browser-driver` + `browser-driver-fields`	Browser-side events — extension installs, native host registrations, simulated browses
`syslog /var/log/lab/egress-watch.log`	`egress-watch` + `egress-watch-conn`	One row per established TCP connection, every 3 s, with `proc=` attribution
`syslog /var/log/lab/listener-watch.log`	`listener-watch` + `listener-watch-listen`	One row per listening socket, every 15 s, with `proc=` attribution
Stock Wazuh FIM	(existing rule 554)	New / changed files under `~/.config/google-chrome/Default/Extensions/`, `~/.mozilla/firefox/.../extensions/`, `~/.config/google-chrome/NativeMessagingHosts/`, `~/.mozilla/native-messaging-hosts/`

The decoder file is short and built to be read top-to-bottom. Every <regex> is type="pcre2" because OSSEC-flavor regex on Wazuh 4.10 doesn’t accept brace literals like \{ or \} and chokes on the JSON-payload captures the browser-driver decoder needs. PCRE2 is supported on Wazuh ≥ 4.4 and is strictly more capable for the kind of structured-text shaping the lab emits.

dnsmasq, with the missing hostname injected

The dnsmasq decoder is the one that took the most plumbing. dnsmasq’s file-mode logging writes:

May  5 09:43:10 dnsmasq[20]: query[A] api.openai.com from 127.0.0.1

— with no hostname between the timestamp and the program name. Wazuh’s syslog pre-decoder expects:

May  5 09:43:10 waznet-endpoint dnsmasq[20]: query[A] api.openai.com from 127.0.0.1

dnsmasq won’t produce that shape in either file mode (no hostname) or stderr mode (no timestamp because -k closes parent stdio to /dev/null). The lab solves this with a small Python sidecar that tails the raw dnsmasq log and rewrites each line with the hostname injected, into the file the Wazuh agent ingests. A previous attempt with a tail | awk pipe failed silently: the mawk that Ubuntu ships as awk doesn’t honor POSIX {n} interval quantifiers (--re-interval is a gawk option, not a mawk one), and stdbuf -oL doesn’t actually unbuffer awk’s writes when stdout is a regular file. The Python sidecar is forty lines and just works.
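The core of that rewrite fits in one function. This is a minimal sketch of the idea, not the lab’s actual sidecar: inject_hostname and the regex are my own illustration, and the tailing loop and output file handling are left out.

```python
import re

# Syslog-style timestamp followed directly by the dnsmasq program tag (no hostname).
_RAW = re.compile(
    r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (dnsmasq\[\d+\]: .*)$"
)


def inject_hostname(line: str, hostname: str) -> str:
    """Insert `hostname` between the timestamp and the program name."""
    line = line.rstrip("\n")
    m = _RAW.match(line)
    if m is None:
        return line  # unrecognized or already-rewritten lines pass through untouched
    return f"{m.group(1)} {hostname} {m.group(2)}"


raw = "May  5 09:43:10 dnsmasq[20]: query[A] api.openai.com from 127.0.0.1"
print(inject_hostname(raw, "waznet-endpoint"))
```

Because the pattern requires dnsmasq[ immediately after the timestamp, running a line through the function twice does not inject the hostname twice.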

The egress and listener watchers, bypassing the Wazuh 4.10 `<full_command>` framing trap

The previous post documented a regression: Wazuh agent 4.10 wraps each periodic <full_command> snapshot as ONE multi-line event (full_log = "ossec: output: 'ps_full':\n<row>\n<row>..."). Per-row decoders that worked on 4.4 don’t claim the new framing, and several detection rules in that pack silently missed events as a result. The fix the previous post recommended was: skip <full_command> entirely and use sidecar shell scripts that emit one syslog-shaped line per row. This pack does that.

egress-watcher.sh runs in a 3-second loop, runs ss -Htnp state established, and emits one line per connection:

May 05 09:50:32 waznet-endpoint egress-watch[22]: event=conn local=172.20.0.3:44390 peer=162.159.140.245:443 proc=chromium pid=1043

The egress-watch-conn decoder pulls out egress_local_addr, egress_local_port, egress_peer_addr, egress_peer_port, egress_proc, egress_pid as named fields. Rules then narrow with <field> regex.
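To make the field extraction concrete, here is a Python approximation of what the decoder does to one egress-watch row. The regex is my own sketch rather than the decoder’s PCRE2 from the repository; only the field names and the sample line come from the text above.

```python
import re
from typing import Dict, Optional

_CONN = re.compile(
    r"event=conn "
    r"local=(?P<egress_local_addr>\S+):(?P<egress_local_port>\d+) "
    r"peer=(?P<egress_peer_addr>\S+):(?P<egress_peer_port>\d+) "
    r"proc=(?P<egress_proc>\S+) pid=(?P<egress_pid>\d+)"
)


def parse_conn(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of one egress-watch row, or None if it is not a conn event."""
    m = _CONN.search(line)
    return m.groupdict() if m else None


sample = (
    "May 05 09:50:32 waznet-endpoint egress-watch[22]: event=conn "
    "local=172.20.0.3:44390 peer=162.159.140.245:443 proc=chromium pid=1043"
)
print(parse_conn(sample))
```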

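A 3-second poll cadence is guaranteed to catch only connections that stay established for 3 s or longer. Shorter TLS sessions (RTT plus handshake alone is usually >300 ms; a single-RTT request typically lasts ≥1 s) are caught only when a poll happens to land inside them, so a 1-second session is seen roughly one time in three. Sub-second flows mostly evade. The lab’s docs flag this and point at eBPF or conntrack as the real-world fix, but neither was needed to reach the 15-rule coverage target.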
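The coverage of a fixed poll interval can be put in numbers. The model below is my own back-of-envelope assumption, not something measured in the lab: a connection that starts at a uniformly random point in the poll cycle and stays established for d seconds is snapshotted with probability min(1, d / interval).

```python
def snapshot_probability(duration_s: float, interval_s: float = 3.0) -> float:
    """Chance that at least one poll lands while the connection is established,
    assuming the connection starts at a uniformly random phase of the poll cycle."""
    if duration_s <= 0 or interval_s <= 0:
        raise ValueError("duration and interval must be positive")
    return min(1.0, duration_s / interval_s)


for d in (0.3, 1.0, 3.0, 10.0):
    print(f"{d:>5.1f} s session -> {snapshot_probability(d):.2f}")
```

Shortening the interval raises coverage linearly, at the cost of more ss invocations and more anchor events on the manager.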

The browser-driver, with `prctl(PR_SET_NAME)` for honest comm attribution

The browser-driver is a Python script that supports four scenarios — install-extension, register-native-host, simulate-browse, burst-llm-dns. The clever part is set_proc_name, which calls prctl(PR_SET_NAME, name) via ctypes so that ss -tnp and ps see the process as chromium or firefox rather than python3. Without this, the egress watcher’s process-attribution rules can’t distinguish the driver’s connections from any other Python process.

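import ctypes

def set_proc_name(name: str) -> None:
    """Set the kernel comm (what ss -p and ps show); Linux keeps at most 15 bytes."""
    PR_SET_NAME = 15  # from <linux/prctl.h>
    libc = ctypes.CDLL(None, use_errno=True)
    buf = ctypes.create_string_buffer(name.encode("utf-8"))
    if libc.prctl(PR_SET_NAME, buf, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_NAME) failed")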

This is also what an attacker would do to evade a process-attribution rule that enumerates bad process names — and one of the things the red-team review caught, which the rule pack now handles by inverting the logic (more on that below).

The 15 Rules: Every Detection That Matters

All fifteen rules live in the 100300–100340 ID range — well clear of stock Wazuh and clear of the previous pack’s 100200–100240 range. The full file is wazuh/rules/local_rules.xml. I’ll walk through every rule by category, with the rationale.

Anchors (100300, 100301, 100302, 100303)

Every dnsmasq query is anchored at level 3. Every browser-driver, egress-watch, and listener-watch event is similarly anchored. Subsequent rules narrow with <if_sid> plus a more specific match.

<rule id="100300" level="3">
  <decoded_as>dnsmasq</decoded_as>
  <field name="dns_qtype" type="pcre2">.</field>
  <description>DNS lookup observed: $(dns_qtype) $(dns_qname) from $(dns_client)</description>
  <group>dns,</group>
</rule>

The <field name="dns_qtype" type="pcre2">.</field> constraint matters. The dnsmasq-query sub-decoder sets dns_qtype; the dnsmasq-reply sub-decoder does not. Without the constraint, the anchor fires on reply lines too, and the $(dns_qtype) $(dns_qname) from $(dns_client) description renders as “DNS lookup observed: from ” — three blanks, no values. Gating on dns_qtype keeps the anchor query-only, and the burst rule downstream avoids double-counting every lookup.
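The effect of the gate is easy to see in a small Python imitation of the two sub-decoders. The regexes and the dns_answer field name are illustrative assumptions of mine; only dns_qtype, dns_qname, and dns_client are field names taken from the pack.

```python
import re
from typing import Dict

_QUERY = re.compile(
    r"dnsmasq\[\d+\]: query\[(?P<dns_qtype>[A-Za-z0-9]+)\] "
    r"(?P<dns_qname>\S+) from (?P<dns_client>\S+)"
)
# Reply lines carry no query type, so no dns_qtype field is ever set for them.
_REPLY = re.compile(r"dnsmasq\[\d+\]: reply (?P<dns_qname>\S+) is (?P<dns_answer>\S+)")


def decode(line: str) -> Dict[str, str]:
    """Return decoded fields for a query or reply line, or {} for anything else."""
    for pattern in (_QUERY, _REPLY):
        m = pattern.search(line)
        if m:
            return m.groupdict()
    return {}


def anchor_fires(fields: Dict[str, str]) -> bool:
    """Mimic the dns_qtype gate: the anchor only fires when the field is non-empty."""
    return bool(fields.get("dns_qtype"))


query = "May  5 09:43:10 waznet-endpoint dnsmasq[20]: query[A] api.openai.com from 127.0.0.1"
reply = "May  5 09:43:10 waznet-endpoint dnsmasq[20]: reply api.openai.com is 162.159.140.245"
print(anchor_fires(decode(query)), anchor_fires(decode(reply)))
```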

DNS-side detection (100310, 100311, 100312)

100310 fires on every DNS query for a known LLM provider host. Level 6 — informational, the building block for fleet inventory (“which endpoints talk to which providers”).

<rule id="100310" level="6">
  <if_sid>100300</if_sid>
  <field name="dns_qname" type="pcre2">(^|\.)(openai\.com|anthropic\.com|claude\.ai|generativelanguage\.googleapis\.com|gemini\.google\.com|cohere\.ai|mistral\.ai|perplexity\.ai|copilot\.microsoft\.com|api\.x\.ai|api\.deepseek\.com)$</field>
  <description>DNS query for LLM provider domain: $(dns_qname) (client=$(dns_client))</description>
  <mitre><id>T1071.004</id></mitre>
  <group>shadow_ai,dns,llm_provider,</group>
</rule>

The (^|\.) prefix anchor is what stops a look-alike such as fakeopenai.com from matching, and the trailing $ is what stops api.openai.com.attacker.test. The blue-team review verified this against both probes; both correctly fall through to the level-3 anchor only.
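The same pattern can be exercised outside Wazuh. The snippet below runs the rule’s regex through Python’s re module, which treats this particular pattern the same way PCRE2 does; is_llm_host is a hypothetical helper name.

```python
import re

# Pattern copied from rule 100310.
LLM_HOST = re.compile(
    r"(^|\.)(openai\.com|anthropic\.com|claude\.ai|generativelanguage\.googleapis\.com"
    r"|gemini\.google\.com|cohere\.ai|mistral\.ai|perplexity\.ai|copilot\.microsoft\.com"
    r"|api\.x\.ai|api\.deepseek\.com)$"
)


def is_llm_host(qname: str) -> bool:
    """True when qname is a listed provider domain or a subdomain of one."""
    return LLM_HOST.search(qname) is not None


for name in ("api.openai.com", "claude.ai", "fakeopenai.com", "api.openai.com.attacker.test"):
    print(name, is_llm_host(name))
```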

100311 fires on DNS queries for known DoH-bypass resolver hostnames — cloudflare-dns.com, nextdns.io, dns.google, dns.adguard.com, mozilla.cloudflare-dns.com, chrome.cloudflare-dns.com, and a dozen others. The list is structural: any endpoint resolving these is typically about to issue DNS-over-HTTPS, which bypasses the local resolver and destroys this whole pipeline’s visibility. The red-team review caught that an earlier 6-host literal allow-list missed NextDNS, AdGuard, and CleanBrowsing variants; the current list is broader.

100312 is the burst rule: ≥3 LLM provider DNS lookups within 60 s. Anchored on 100300 rather than on 100310, with the LLM-host filter re-applied inline. The reason for not chaining off 100310 is subtle but important: Wazuh emits only the highest-severity matching rule per event in alerts.json, and once 100340 (the chain rule, level 14) is hot, every 100310 fire is published as a 100340 instead. Chains via <if_matched_sid>100310</if_matched_sid> start under-counting once that supersession kicks in. Anchoring on 100300 plus an inline LLM-host filter sidesteps the trap; the dns_qtype filter (also enforced on 100300) ensures we count only query[A] lines, not the matching reply lines that would otherwise double-count every lookup.
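The counting idea behind the burst rule is a sliding window. The sketch below illustrates that idea only; Wazuh’s frequency engine has its own counting and reset behavior, which this does not try to reproduce, and burst_alerts is a hypothetical name.

```python
from collections import deque
from typing import Iterable, List


def burst_alerts(timestamps: Iterable[float], threshold: int = 3,
                 window_s: float = 60.0) -> List[float]:
    """Return the timestamp of every event that completes a run of at least
    `threshold` events inside a trailing window of `window_s` seconds."""
    recent: deque = deque()
    hits: List[float] = []
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that have aged out of the trailing window.
        while ts - recent[0] > window_s:
            recent.popleft()
        if len(recent) >= threshold:
            hits.append(ts)
    return hits


# Timestamps are seconds; the input is assumed to be LLM-provider query lines only.
print(burst_alerts([0, 10, 20]))
print(burst_alerts([0, 40, 80]))
```

Feeding it query and reply timestamps together would double the count, which is the double-counting the dns_qtype gate avoids.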

Egress-side detection (100320, 100321, 100322)

100320 fires on egress to a known LLM provider IP from any process. Level 8 — the building block for the more specific rules.

The IP list is necessarily approximate. Anthropic uses its own /24s (160.79.104.0/24, 160.79.105.0/24); OpenAI fronts via Cloudflare (172.64.0.0/13, 162.159.0.0/16 broadly); Google Gemini answers from any GFE prefix (142.250.0.0/15, 142.251.0.0/16, 172.217.0.0/16, 34.96.0.0/16); Cohere from AWS (18.244.0.0/16); Mistral from GCP; Perplexity from Cloudflare again. Refresh quarterly. Using shared CDN ranges means false positives on non-AI traffic to the same CDNs — for production use, replace with a CDB list of curated IP ranges resolved from your provider hostnames.
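As a sketch of what a curated lookup looks like outside the rule regex, the prefixes quoted above can be checked with Python’s ipaddress module. The labels and classify_peer are my own naming, and 142.251.0.0/16 is left out only because 142.250.0.0/15 already contains it.

```python
import ipaddress
from typing import Optional

# Prefixes quoted in the text; the "_shared" suffix marks ranges that also carry non-AI traffic.
PREFIXES = {
    "anthropic": ["160.79.104.0/24", "160.79.105.0/24"],
    "cloudflare_shared": ["172.64.0.0/13", "162.159.0.0/16"],
    "google_gfe_shared": ["142.250.0.0/15", "172.217.0.0/16", "34.96.0.0/16"],
    "aws_shared": ["18.244.0.0/16"],
}
_NETWORKS = [
    (label, ipaddress.ip_network(cidr))
    for label, cidrs in PREFIXES.items()
    for cidr in cidrs
]


def classify_peer(addr: str) -> Optional[str]:
    """Return the first label whose prefix contains `addr`, or None."""
    ip = ipaddress.ip_address(addr)
    for label, net in _NETWORKS:
        if ip in net:
            return label
    return None


for peer in ("160.79.104.10", "162.159.140.245", "8.8.8.8"):
    print(peer, classify_peer(peer))
```

A hit on a shared label is a hint rather than proof, which is the false-positive caveat above.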

100321 is the headline egress rule:

<rule id="100321" level="12">
  <if_sid>100320</if_sid>
  <!-- comm is capped at 15 bytes, so chromium-browser is reported as chromium-browse -->
  <field name="egress_proc" negate="yes" type="pcre2">^(chromium|chromium-browser?|chrome|google-chrome|firefox|firefox-esr|brave|brave-browser|microsoft-edge|edge|edge-beta|opera|safari|webkit|epiphany|vivaldi|code|cursor|jetbrains-.*|idea|pycharm|webstorm|goland|nvim|vim|emacs|insomnia|postman|warp|wezterm|kitty)$</field>
  <description>Non-browser/IDE process talking to LLM provider IP: $(egress_proc) (pid=$(egress_pid)) -> $(egress_peer_addr):$(egress_peer_port)</description>
  <mitre><id>T1071.001</id><id>T1567</id></mitre>
  <group>shadow_ai,egress,llm_provider,attack,</group>
</rule>

This is inverted from the obvious shape. The first iteration enumerated bad process names — ^(python3?|node|curl|wget|ruby|perl|java|go)$ — and fired on those. The red-team agent broke it in seconds: prctl(PR_SET_NAME, "systemd-resolve") and the rule went silent. Renaming the connecting process to brave did the same. So the rule was rewritten as an allow-list of expected browser and IDE comms with negate="yes". Anything outside the allow-list talking to an LLM IP is suspicious by default, and the only way to evade is to rename to one of a handful of known-good comms — which is far more conspicuous in ps output and far less defensible to incident response.
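The difference between the two shapes shows up in a few lines of Python. The block-list is the first-iteration pattern quoted above; the allow-list here is deliberately abbreviated for illustration, and both function names are hypothetical.

```python
import re

# First-iteration pattern: alert only on these comms.
BLOCK_LIST = re.compile(r"^(python3?|node|curl|wget|ruby|perl|java|go)$")
# Abbreviated stand-in for the allow-list in 100321: alert on everything else.
ALLOW_LIST = re.compile(r"^(chromium|chrome|firefox|brave|code|cursor)$")


def blocklist_alerts(comm: str) -> bool:
    """Original logic: fire when the comm is a known-bad name."""
    return BLOCK_LIST.search(comm) is not None


def allowlist_alerts(comm: str) -> bool:
    """Inverted logic (negate="yes"): fire when the comm is not an expected one."""
    return ALLOW_LIST.search(comm) is None


for comm in ("python3", "systemd-resolve", "brave"):
    print(comm, blocklist_alerts(comm), allowlist_alerts(comm))
```

A rename to systemd-resolve slips past the block-list but not the allow-list; a rename to brave slips past both, which is the residual evasion described above.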

100322 is the sustained-egress rule: ≥3 connections to LLM provider IPs within 120 s. Frequency rule, anchored on 100320. Single connection is informational; sustained activity is paging.

Browser-side detection (100330, 100331, 100332, 100333)

100330 fires on FIM new-file events under any of the four watched directories — Chrome extensions, Firefox extensions, Chrome native-messaging hosts, Firefox native-messaging hosts. The rule narrows by literal path match on the FIM event’s full_log:

<rule id="100330" level="9">
  <if_sid>554</if_sid>
  <match type="pcre2">/(\.config/google-chrome/Default/Extensions|\.mozilla/firefox/[^/]+/extensions|\.config/google-chrome/NativeMessagingHosts|\.mozilla/native-messaging-hosts)/</match>
  <description>New file under a browser extension / native-messaging dir: $(file)</description>
  <mitre><id>T1176</id></mitre>
  <group>shadow_ai,browser,fim,</group>
</rule>

Path match rather than <field name="syscheck.tags"> because the dotted-field syntax for syscheck array fields varies between Wazuh versions; the path is stable. There is also a subtle ossec.conf trap that costs hours if you don’t know about it: <directories> paths must be on the same line as the opening tag. A multi-line <directories> body silently registers as an empty path (visible as Monitoring path: '' in ossec.log) and the entire FIM block does nothing. The lab’s ossec.conf has the rationale inline so nobody else has to debug it.

100331 fires on browser-driver events whose payload is an extension_install declaring an LLM provider host in either host_permissions (MV3) or permissions (MV2). Level 13. The early version matched only the MV3 key; the red team caught it instantly with an MV2-style manifest and the rule was widened.

100332 fires on native_host_register events whose payload contains an AI/LLM keyword in the name, description, or path field. Crucially, the keyword judgement is performed at rule time, not on a "ai_keyword": true boolean the driver decided. Doing it driver-side meant a real attacker only had to pick a name without the magic substrings — com.x.ml_helper, com.x.copilot_assistant, com.x.bridge — to slip through. The rule’s PCRE2 covers a much wider keyword set: ai, llm, gpt, chatgpt, claude, gemini, openai, anthropic, copilot, assistant, bridge, helper, ml_host, mistral, perplexity, cohere, deepseek.
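Moving the judgement to rule time amounts to scanning the manifest fields directly. The sketch below uses plain case-insensitive substring matching over the keyword list above, which is an assumption about the rule’s shape rather than a copy of its PCRE2; ai_keyword_hits is a hypothetical name.

```python
import re
from typing import Dict, List

KEYWORDS = [
    "ai", "llm", "gpt", "chatgpt", "claude", "gemini", "openai", "anthropic",
    "copilot", "assistant", "bridge", "helper", "ml_host", "mistral",
    "perplexity", "cohere", "deepseek",
]
# Longest keywords first so "chatgpt" wins over "gpt" at the same position.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True)),
    re.IGNORECASE,
)


def ai_keyword_hits(manifest: Dict[str, str]) -> List[str]:
    """Return the sorted, de-duplicated keywords found in name, description, or path."""
    hits = set()
    for field in ("name", "description", "path"):
        for match in _PATTERN.findall(manifest.get(field, "")):
            hits.add(match.lower())
    return sorted(hits)


print(ai_keyword_hits({"name": "com.x.ml_helper", "description": "", "path": "/opt/x/host"}))
```

Bare substring matching on a two-letter keyword like ai is noisy (it also matches "mail" or "maintain"), so a production pattern would likely want token boundaries around the short keywords.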

100333 is the wildcard-permission catcher. Any extension that declares <all_urls> or an https://*/* wildcard in its host permissions can talk to any LLM provider tomorrow without being recognizable today by the explicit-host detector (100331). It is set to medium severity (level 10) because legitimate extensions also use wildcards (password managers, dev tools), but it is worth surfacing on a developer endpoint.
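The manifest checks behind 100331 and 100333 can be sketched together. This is an illustration under stated assumptions, not the rules’ actual PCRE2: manifest_findings is a hypothetical name, the provider list is copied from 100310, and the *://*/* and http://*/* entries in the wildcard set are my additions beyond the two patterns named above.

```python
import re
from typing import Any, Dict, Set

_PROVIDERS = (
    r"(openai\.com|anthropic\.com|claude\.ai|generativelanguage\.googleapis\.com"
    r"|gemini\.google\.com|cohere\.ai|mistral\.ai|perplexity\.ai|copilot\.microsoft\.com"
    r"|api\.x\.ai|api\.deepseek\.com)"
)
# A provider domain must follow "//" or "." so look-alike hosts do not match.
_LLM_GRANT = re.compile(r"(//|\.)" + _PROVIDERS + r"(/|$)")
_WILDCARDS = {"<all_urls>", "https://*/*", "http://*/*", "*://*/*"}


def manifest_findings(manifest: Dict[str, Any]) -> Set[str]:
    """Inspect both the MV3 key (host_permissions) and the MV2 key (permissions)."""
    grants = list(manifest.get("host_permissions", [])) + list(manifest.get("permissions", []))
    findings: Set[str] = set()
    for grant in grants:
        if not isinstance(grant, str):
            continue
        if grant in _WILDCARDS:
            findings.add("wildcard")
        elif _LLM_GRANT.search(grant):
            findings.add("llm_host")
    return findings


print(manifest_findings({"manifest_version": 2, "permissions": ["tabs", "https://*.anthropic.com/*"]}))
```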

The chain rule (100340)

This is the headline detection.

<rule id="100340" level="14" frequency="2" timeframe="300">
  <if_matched_sid>100331</if_matched_sid>
  <if_sid>100300</if_sid>
  <field name="dns_qtype" type="pcre2">.</field>
  <field name="dns_qname" type="pcre2">(^|\.)(openai\.com|anthropic\.com|claude\.ai|generativelanguage\.googleapis\.com|gemini\.google\.com|cohere\.ai|mistral\.ai|perplexity\.ai|copilot\.microsoft\.com|api\.x\.ai|api\.deepseek\.com)$</field>
  <description>Browser shadow-AI chain: extension install + LLM-provider DNS within 5 min ($(dns_qname))</description>
  <mitre><id>T1176</id><id>T1071.004</id></mitre>
  <group>shadow_ai,chain,attack,</group>
</rule>

A browser extension with LLM host_permissions was installed (100331). Within five minutes, dnsmasq logs a query for one of the LLM hosts. Each step in isolation is innocuous; the join is the activation footprint of an actively-used shadow-AI extension. Chained on 100300 plus inline LLM-host filter for the same supersession-resilience reasons as 100312.
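Stripped of Wazuh’s syntax, the join is a time-window match between two event streams. The sketch below shows that logic only; chain_hits is a hypothetical name, timestamps are seconds, and the DNS events are assumed to be already filtered to LLM-provider queries.

```python
from typing import Iterable, List, Tuple


def chain_hits(install_times: Iterable[float],
               dns_events: Iterable[Tuple[float, str]],
               window_s: float = 300.0) -> List[Tuple[float, str]]:
    """Return the DNS events that land within `window_s` seconds after any
    extension-install event (the 100331 side of the join)."""
    installs = list(install_times)
    hits = []
    for ts, qname in dns_events:
        if any(0 <= ts - t <= window_s for t in installs):
            hits.append((ts, qname))
    return hits


events = [(50.0, "api.openai.com"), (130.0, "api.openai.com"), (500.0, "claude.ai")]
print(chain_hits([100.0], events))
```

A lookup before the install, or one more than five minutes after it, stays at the level-6 inventory rule instead of escalating to the chain.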

The trigger script triggers/06-browser-shadow-ai-chain.sh writes an LLM-permissioned manifest, fires a DNS burst against the same providers, and opens a TLS session to one of them — all within 30 seconds. 100340 fires reliably.

Live Validation: 185 Alerts, All 15 Rules

triggers/00-fire-all.sh end-to-end against the bundled manager on a clean state produces 185 alerts in rule.id ∈ [100300, 100340] in a single run, across all 15 distinct rule IDs:

Rule	Lvl	Fires	Trigger	What
100300	3	4	01	Anchor — baseline DNS query
100301	3	48	02–06	Anchor — every browser-driver event
100302	3	62	03/06	Anchor — every egress-watch row
100303	3	12	any	Anchor — every listener-watch row
100310	6	18	02/06	DNS query for LLM provider domain
100311	12	4	05	DoH-bypass resolver lookup
100312	12	1	02	DNS burst — ≥3 LLM lookups in 60 s
100320	8	3	06	Egress to LLM provider IP
100321	12	12	03	Non-browser/IDE process to LLM provider IP
100322	14	1	03	Sustained egress — ≥3 LLM connects in 120 s
100330	9	8	04/06	FIM new file under browser-ext / NMH dir
100331	13	2	04/06	Extension declares LLM `host_permissions`
100332	12	1	04	Native messaging host registered with AI keyword
100333	10	1	04	Extension declares wildcard `host_permissions`
100340	14	8	06	Chain — extension install + LLM DNS within 5 min

A representative 100340 alert (the headline chain):

{
  "timestamp": "2026-05-05T11:07:05.255+0000",
  "rule": {
    "level": 14,
    "description": "Browser shadow-AI chain: extension install + LLM-provider DNS within 5 min (api.openai.com)",
    "id": "100340",
    "mitre": {
      "id": ["T1176", "T1071.004"],
      "tactic": ["Persistence", "Command and Control"],
      "technique": ["Browser Extensions", "DNS"]
    },
    "frequency": 2,
    "groups": ["shadow_ai", "network", "shadow_ai", "chain", "attack"]
  },
  "agent": {"id": "001", "name": "waznet-endpoint"},
  "decoder": {"name": "dnsmasq"},
  "data": {"dns_qtype": "A", "dns_qname": "api.openai.com", "dns_client": "127.0.0.1"},
  "full_log": "May  5 11:07:04 waznet-endpoint dnsmasq[21]: query[A] api.openai.com from 127.0.0.1",
  "location": "/var/log/lab/dns-queries.log"
}

Tail /var/ossec/logs/alerts/alerts.json on the manager while the trigger script runs to capture the same set on your replay.
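To compare a replay against the table above, a per-rule tally is enough. This is a small helper sketch of my own (tally_rule_ids is not part of the repository); it assumes the one-JSON-object-per-line layout of alerts.json and the rule.id field shown in the sample alert.

```python
import json
from collections import Counter
from typing import Iterable


def tally_rule_ids(lines: Iterable[str], lo: int = 100300, hi: int = 100340) -> Counter:
    """Count alerts per rule ID for IDs in [lo, hi], skipping unparsable lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            alert = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a line that was still being written
        rule_id = str(alert.get("rule", {}).get("id", ""))
        if rule_id.isdigit() and lo <= int(rule_id) <= hi:
            counts[rule_id] += 1
    return counts


sample = [
    '{"rule": {"id": "100340", "level": 14}}',
    '{"rule": {"id": "100310", "level": 6}}',
    '{"rule": {"id": "554", "level": 5}}',
    '{"rule": {"id": "100340", "level": 14}}',
]
print(sorted(tally_rule_ids(sample).items()))
```

Against a real run, pass an open file handle on a copy of alerts.json instead of the sample list.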

Live on the Wazuh dashboard

Reproduced end-to-end on Wazuh 4.14.5 (manager + indexer + dashboard) — all 15 rule IDs fire. Filtering the Threat Hunting module to rule.groups:network shows the alert volume, the level-12-or-above count, and the MITRE ATT&CK spread the pack maps to — DNS (T1071.004), Browser Extensions (T1176), Web Protocols (T1071.001), Exfiltration Over Web (T1567), and Protocol Tunneling (T1572):

Wazuh Threat Hunting dashboard filtered to the network shadow-AI rule pack

Dropping the level-3 anchors (rule.groups:network and not rule.level:3) surfaces the substantive detections — DNS-over-HTTPS bypass, non-browser/IDE egress to provider IPs, sustained egress, extension installs declaring LLM host_permissions, AI-themed native messaging hosts, and the cross-source chain:

Wazuh events table showing the network shadow-AI detections firing

Red-Team and Blue-Team Pass

I ran two specialized agents against the lab before publishing this post — one red-team, one blue-team. Both produced concrete findings that shaped the final rule pack.

The red-team agent ran the canonical triggers and then tried 12 evasion variants. Evasions that worked against the first iteration of the rule pack:

Plain HTTP on port 80 to LLM IPs evaded 100320, which was pinned to ^443$ on the peer port (rule was rewritten to drop the port pin).
prctl(PR_SET_NAME, "systemd-resolve") evaded 100321’s “non-browser process” block-list (rule was inverted to an allow-list of expected browsers/IDEs).
MV2-style permissions keys evaded 100331’s host_permissions-only match (regex was widened).
Native messaging host names without the magic substrings evaded 100332’s driver-side "ai_keyword": true check (judgement moved into the rule via PCRE2 over name/description/path).
DoH providers not in the original 6-host literal list (NextDNS, AdGuard, CleanBrowsing, Cloudflare’s chrome. and security. subdomains) evaded 100311 (list was broadened, with structural reasoning about why each was added).

After the fixes, the blue-team agent did an independent pass — coverage verification, field-extraction checks, MITRE-ID sanity, false-positive probing, and wazuh-logtest verification of every decoder. The blue-team flagged two follow-ups: the 100300 anchor’s blank description on reply/forwarded lines (fixed by gating on dns_qtype), and the original 100312 semantic ambiguity (resolved by anchoring on 100300 plus inline filter rather than chaining off 100310). The current pack passes the blue-team checklist with one acknowledged caveat — the 104.16.0.0/12 Cloudflare prefix in the LLM IP list is necessarily aliased with non-AI Cloudflare traffic, and the rule comment flags it as something to replace with a CDB list in production.

What the red team confirmed was not evadable in the current pack:

Anchored regex on (^|\.)openai\.com$ correctly rejects fakeopenai.com, notopenai.com, api.openai.com.attacker.test.
Process-allow-list inversion in 100321 catches a prctl rename to any comm outside the allow-list; renaming to an allow-listed browser or IDE comm still passes.
Wildcard host permissions are caught by 100333 even when no LLM hostname is in the manifest.
Native messaging AI keywords are caught at rule time, so renaming the manifest doesn’t help.

What is still evadable, and documented as such in the lab’s known-limitations section:

True DoH bypass (e.g. curl --doh-url) where the LLM hostname never traverses the local resolver — only the egress-side rules can catch the eventual TLS connection, and that only works if the destination IP is in the curated list.
Sub-second TLS sessions that complete inside one ss -tnp poll interval — the lab notes this and points at eBPF (bcc tcpconnect) or conntrack as the production fix.
IP-only operation (no DNS lookup at all) evades every DNS-keyed rule (100310, 100312, and the 100340 chain); only the egress-side rules still see the connection. This is a structural limit of the chosen telemetry shape.

Reproduction

git clone https://github.com/nadimjsaliby/wazuh-network-shadow-ai.git
cd wazuh-network-shadow-ai

docker compose down -v          # clean baseline
docker compose up -d --build    # boots manager + endpoint, agent auto-enrolls

# wait for the agent to show Active on the manager
until docker exec waznet-manager /var/ossec/bin/agent_control -l \
        | grep -q 'waznet-endpoint.*Active'; do sleep 2; done

# fire every trigger
docker exec -u labuser -it waznet-endpoint bash /opt/lab/triggers/00-fire-all.sh

# observe alerts
docker exec waznet-manager bash -c \
  "tail -F /var/ossec/logs/alerts/alerts.json" \
  | jq -c 'select(.rule.id|tonumber>=100300 and tonumber<=100399)'

To deploy the rules against an existing Wazuh manager instead of the bundled one:

scp wazuh/decoders/local_decoder.xml manager:/var/ossec/etc/decoders/
scp wazuh/rules/local_rules.xml      manager:/var/ossec/etc/rules/
ssh manager '/var/ossec/bin/wazuh-control restart'

WAZUH_MANAGER=<your-manager> docker compose up -d --build endpoint

You’ll also need to ship something equivalent to dnsmasq + the egress watcher to the agent endpoints — the rules expect those log streams. For deployments without a local resolver, swap the dnsmasq decoder anchor for whatever DNS source you do have (auditd, Zeek, unbound, BIND query log) and adjust the program_name. The rest of the rules consume the named fields the decoder produces, not the specific log format.

Why Wazuh

I picked Wazuh as the engine for this work for the same reasons I picked it for the previous rogue-MCP post: it’s the only open-source platform that combines the four data sources this detection logic needs in a single agent binary, with cross-rule correlation at the manager tier and no separate stream-processing layer.

Specifically:

Localfile / syslog ingestion with custom decoder support, so dnsmasq’s query log, the browser-driver’s structured events, and the egress watcher’s per-connection rows can all feed the rule engine without a sidecar shipper.
File Integrity Monitoring that fires on the well-known browser extension and native-messaging-host paths, which 100330 narrows with a path match on top of stock rule 554.
Cross-rule correlation via <if_sid> / <if_matched_sid> and timeframe joins, so 100340 (extension install + LLM DNS within 5 min) and 100312 (DNS burst within 60 s) and 100322 (sustained egress within 120 s) are all expressible without a stream processor or a separate SIEM correlation tier.
Manager-side rule processing, so the entire rule pack lives on one host and can be reloaded with a single wazuh-control restart on the manager, without touching the endpoints.

Falco and Sysmon-on-Linux do runtime detection well but neither has the localfile ingestion path that the dnsmasq query log and the browser-driver trace need. Auditd is overkill for this threat model and demands privileged container operations the realistic deployment shape can’t grant. Custom shippers feeding Elastic or Splunk would close the gap, but at the cost of running another agent on the endpoint and another pipeline in the SIEM. Wazuh does it with one daemon and one configuration file.

This pack is the network-side companion to wazuh-shadow-ai. The two together cover the agentic-coding side and the browser side of shadow AI on developer endpoints, with no overlap and no gaps in the surface area I could find. Drop both rule files into your manager, run both labs end-to-end against your environment, and the next time someone asks “are we monitoring shadow AI?”, you have a concrete answer with alerts to back it up.

Resources

Wazuh — the open source security platform — unified XDR and SIEM protection for endpoints and cloud workloads
Wazuh Ambassadors Program — the community program this work was produced for
Repository + lab — nadimjsaliby/wazuh-network-shadow-ai — clone and docker compose up
Companion endpoint pack — nadimjsaliby/wazuh-shadow-ai — rogue MCP / IDE-side shadow AI
Companion blog post — Detecting Rogue MCP Servers and Shadow AI Agents on Endpoints with Wazuh
Wazuh Documentation — Custom Rules and Decoders
Wazuh Documentation — File Integrity Monitoring
Wazuh Documentation — Localfile log collection
dnsmasq man page — log-queries / log-facility
Chrome Extension Manifest V3 — host_permissions
Chrome Native Messaging — host manifest
MITRE ATT&CK — T1071.001 Application Layer Protocol: Web Protocols
MITRE ATT&CK — T1071.004 Application Layer Protocol: DNS
MITRE ATT&CK — T1176 Browser Extensions
MITRE ATT&CK — T1567 Exfiltration Over Web Service
MITRE ATT&CK — T1572 Protocol Tunneling
OWASP — LLM Application Top 10 (2025)

This post was produced for the Wazuh Ambassadors Program. Wazuh is a free, open source security platform.