Wazuh on K8s: 7 Frameworks, Auto-Remediation, One Chart • Nadim Saliby

Most teams running Wazuh on Kubernetes stitch together five separate tools to get compliance coverage that still has gaps. One tool scans CIS benchmarks. Another handles admission policies. A third runs vulnerability checks. Remediation is manual. Reports are spreadsheets. None of them share context, and when an auditor asks whether a single kubelet misconfiguration also fails NIST, PCI, and HIPAA controls — nobody can answer without hours of cross-referencing.

I built a single Helm chart that wraps a standard Wazuh agent deployment and adds the missing Kubernetes-native pieces on top (admission-time enforcement, automated remediation, Prometheus-native observability, and scheduled compliance reports), so one helm install covers the full prevent → detect → fix → prove lifecycle. In total that is 167 SCA checks across 7 frameworks, MITRE ATT&CK-mapped runtime detection, an admission webhook, Prometheus metrics, a Grafana dashboard, and audit-ready compliance reports.

This post is a technical deep-dive. Because Wazuh is the engine doing most of the heavy lifting, I want to be explicit up front about what’s native Wazuh and what this chart adds on top — that boundary matters if you’re trying to understand which parts you’d get out of the box versus what the chart contributes.

What’s native Wazuh vs. what this chart adds

Capability	Origin
SCA scanning engine (evaluating policy YAML against hosts/containers)	Wazuh (native)
FIM (File Integrity Monitoring)	Wazuh (native)
Vulnerability detection	Wazuh (native)
Rootcheck, log collection, syscheck	Wazuh (native)
Active response framework (Wazuh’s built-in remediation hooks)	Wazuh (native)
`compliance:` field in SCA YAML (cross-reference per check)	Wazuh (native)
Agent ↔ Manager architecture, dashboard, API	Wazuh (native)
MITRE ATT&CK tagging on rules/alerts	Wazuh (native, via `mitre` field)
Bundled SCA policy file content — 167 checks across 7 frameworks	This chart
Bundled runtime threat policy YAML with MITRE-mapped detections	This chart (policy content fed to Wazuh’s SCA engine)
`ValidatingWebhookConfiguration` + admission policy engine	This chart — not part of Wazuh
Auto-remediation `CronJob` with host-level file/sysctl/SSH/auditd fixes	This chart (broader scope than Wazuh’s active response)
Compliance reporting `CronJob` → JSON / HTML / CSV → S3 / email	This chart
Prometheus metrics exporter sidecar + PrometheusRule alerts	This chart (Wazuh doesn’t ship a Prometheus exporter)
Grafana dashboard	This chart
K8s self-hardening (NetworkPolicy, PDB, seccomp, RBAC, cert-manager)	This chart
Helm packaging, values schema, lifecycle hooks	This chart

Short version: Wazuh’s SCA engine, FIM, vulnerability scanner, active response, and MITRE-tagged alerting are doing the detection and compliance work. This chart packages bundled policy content for 7 frameworks, plus the K8s-native pieces Wazuh doesn’t ship (admission webhook, Prometheus exporter, reporting CronJob, remediation CronJob, self-hardening manifests) into a single deploy.

The problem with the current state

Here’s what most teams are running today:

A Wazuh or Falco DaemonSet for detection
OPA/Gatekeeper or Kyverno for admission policies
A separate CIS scanner (kube-bench) as a CronJob
Manual remediation or Ansible playbooks triggered by humans
Compliance reports generated in spreadsheets by hand

These tools don’t share context. A CIS check that fails on the kubelet doesn’t automatically map to the NIST 800-53 control it satisfies. The admission webhook doesn’t know what the SCA scanner found. The remediation is always manual.

The result: compliance drift, audit fatigue, and a false sense of security.

Architecture: one chart, full lifecycle

                  ┌──────────────────────────┐
    Deploy ──────►│   Admission Webhook      │ ─── PREVENT  (chart-added)
                  │   Block before it runs   │
                  └────────────┬─────────────┘
                               │
                  ┌────────────▼─────────────┐
    Runtime ─────►│   Wazuh Agent DaemonSet  │ ─── DETECT   (Wazuh-native)
                  │   SCA + FIM + Vuln + RT  │
                  └────────────┬─────────────┘
                               │
                  ┌────────────▼─────────────┐
    CronJob ─────►│   Auto-Remediation       │ ─── FIX      (chart-added,
                  │   File perms, kernel,    │              complements Wazuh
                  │   SSH, auditd, modules   │              active response)
                  └────────────┬─────────────┘
                               │
                  ┌────────────▼─────────────┐
    Scheduled ───►│   Compliance Reports     │ ─── PROVE    (chart-added)
                  │   JSON / HTML / CSV      │
                  │   S3 upload + email      │
                  └──────────────────────────┘

Prevent (chart-added). A ValidatingWebhookConfiguration intercepts every pod, deployment, statefulset, daemonset, job, and cronjob at admission time. It blocks privileged containers, host namespace access, privilege escalation, :latest tags, missing required labels, and unauthorized registries. This is not a Wazuh feature — it’s a standard Kubernetes admission webhook this chart deploys alongside the Wazuh agents.

Detect (Wazuh-native, with chart-bundled policies). Wazuh agents run Wazuh’s SCA engine against seven policy files simultaneously. The SCA engine, the scheduling, the alert pipeline back to the manager — all native Wazuh. What the chart contributes here is the policy content: 167 checks written in Wazuh’s SCA YAML format, with compliance: cross-references that Wazuh evaluates and tags on every finding.

Fix (chart-added CronJob). Wazuh has a built-in active response framework for triggering reactions to alerts (block IPs, kill processes, etc.). This chart adds a separate, broader-scope CronJob that runs every 6 hours (configurable) and remediates infrastructure findings the active response framework isn’t designed for: file permissions, kernel sysctl, SSH hardening, unused kernel modules, auditd rules. It starts in dry-run mode by default.

Prove (chart-added CronJob). A weekly CronJob queries the Wazuh manager API (native), aggregates findings into JSON, HTML, and CSV reports, and ships them to S3 or email. The query is against Wazuh’s API; the report shaping, multi-framework breakdown, and delivery are chart logic.
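To make the report-shaping step concrete, here is a minimal sketch of turning per-policy SCA summaries into CSV. It is not the chart's actual report code. The function name `policies_to_csv` is hypothetical, and the field names are assumed to match the per-policy summary shape shown in the API excerpt later in this post.

```python
import csv
import io

# Columns kept in the report; assumed to match the SCA summary fields.
FIELDS = ["policy_id", "total_checks", "pass", "fail", "invalid", "score"]


def policies_to_csv(affected_items):
    """Render per-policy SCA summaries as CSV text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in affected_items:
        writer.writerow(item)  # extra keys such as "name" are ignored
    return buf.getvalue()


sample = [
    {"policy_id": "nist_800_53_k8s", "name": "NIST 800-53 Rev5",
     "total_checks": 24, "pass": 2, "fail": 2, "invalid": 20, "score": 50},
]
print(policies_to_csv(sample))
```

The same list of dictionaries could be passed to `json.dumps` for the JSON report, which is one reason to keep the aggregation step separate from the rendering step.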

Seven frameworks, one scan

This is where the compliance: field in Wazuh’s SCA YAML does the heavy lifting. The mechanism is native Wazuh; the content (which controls map to which checks) is what the chart bundles.

Framework	Controls	What it checks
CIS Kubernetes v1.8.0	31 (L1 + L2)	API server config, kubelet hardening, etcd security, RBAC, network policies
CIS Linux v2.0.0	36 (L1 + L2)	Filesystem, network params, SSH, logging, file permissions, password policy
NIST 800-53 Rev5	24	AC, AU, CM, IA, SC, SI control families mapped to K8s and OS checks
PCI-DSS v4.0	20	Network segmentation, encryption at rest/transit, access control, FIM, audit trails
HIPAA §164.312	16	Access control, audit controls, integrity, authentication, transmission security
SOC2 Type II	18	CC6–CC8 trust criteria, availability, change management
Runtime Threats	22	MITRE ATT&CK mapped: cryptomining, container escape, reverse shells, persistence

Total: 167 checks per scan cycle.

The critical detail is cross-framework mapping via the native compliance: field. Take kubelet anonymous authentication:

# Excerpt from the chart's bundled NIST 800-53 policy file,
# fed to Wazuh's SCA engine, which evaluates the rules and emits the
# compliance metadata on every resulting finding.
- id: 50400
  title: "IA-2: kubelet anonymous auth disabled"
  description: "The kubelet should reject anonymous requests..."
  rationale: "Anonymous authentication of the kubelet API permits any unauthorized caller..."
  remediation: "Set --anonymous-auth=false in kubelet config and restart."
  compliance:
    - cis_kubernetes: ["4.2.1"]
    - cis: ["4.2.1"]
    - nist_800_53: ["IA-2"]
    - pci_dss: ["2.2.1"]
    - hipaa: ["164.312(a)(2)(i)"]
    - soc2: ["CC6.1"]
  condition: any
  rules:
    - 'f:/var/lib/kubelet/config.yaml -> r:authentication: -> r:anonymous: -> r:enabled: false'

When this check fails, Wazuh’s SCA engine emits a single finding with all six compliance tags attached. One scan, one alert, six compliance keys covered (two of them CIS). That’s not custom mapping logic; it’s Wazuh’s native SCA behavior, fed with policy content that has the cross-references encoded.
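A quick way to see why this matters for audits is to invert the mapping: given checks with compliance blocks, build an index from (framework, control) to check IDs. The sketch below uses the compliance block from check 50400 above. The function name `index_by_control` is hypothetical and this is not code from the chart.

```python
from collections import defaultdict

# Compliance block copied from check 50400 above, in the
# list-of-single-key-dicts shape the SCA YAML uses.
CHECK_50400 = {
    "id": 50400,
    "compliance": [
        {"cis_kubernetes": ["4.2.1"]},
        {"cis": ["4.2.1"]},
        {"nist_800_53": ["IA-2"]},
        {"pci_dss": ["2.2.1"]},
        {"hipaa": ["164.312(a)(2)(i)"]},
        {"soc2": ["CC6.1"]},
    ],
}


def index_by_control(checks):
    """Map (framework, control) -> sorted list of check ids covering it."""
    index = defaultdict(set)
    for check in checks:
        for entry in check.get("compliance", []):
            for framework, controls in entry.items():
                for control in controls:
                    index[(framework, control)].add(check["id"])
    return {key: sorted(ids) for key, ids in index.items()}


index = index_by_control([CHECK_50400])
print(index[("nist_800_53", "IA-2")])
print(len(index), "framework/control pairs covered by one check")
```

An auditor's question such as "which checks back NIST IA-2?" then becomes a dictionary lookup instead of a spreadsheet search.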

Here’s what the live data looks like. The snippet below is the actual JSON returned by GET /sca/{agent_id} against a Wazuh manager (4.14.5) running this chart on a single-node K8s cluster, showing four of the bundled policies loaded simultaneously on one agent:

{
  "data": {
    "affected_items": [
      {
        "policy_id": "runtime_threat_detection",
        "name": "Runtime Threat Detection — Kubernetes Behavioral Analysis",
        "total_checks": 21, "pass": 5, "fail": 0, "invalid": 16,
        "score": 100,
        "start_scan": "2026-06-02T17:38:39+00:00",
        "end_scan":   "2026-06-02T17:38:39+00:00"
      },
      {
        "policy_id": "nist_800_53_k8s",
        "name": "NIST 800-53 Rev5 — Kubernetes Technical Controls",
        "total_checks": 24, "pass": 2, "fail": 2, "invalid": 20,
        "score": 50
      },
      {
        "policy_id": "cis_linux_l1",
        "name": "CIS Distribution Independent Linux Benchmark v2.0.0 - Level 1",
        "total_checks": 26, "pass": 0, "fail": 2, "invalid": 24,
        "score": 0
      },
      {
        "policy_id": "cis_amazon_linux_2023",
        "name": "CIS Benchmark for Amazon Linux 2023 Benchmark v1.0.0.",
        "total_checks": 183, "pass": 50, "fail": 45, "invalid": 88,
        "score": 52
      }
    ]
  }
}

That’s four chart-bundled policies evaluated by the same Wazuh SCA engine, in the same scan cycle, against the same agent. The invalid count comes mostly from checks gated on sca.remote_commands=1 (more on that below). The score and per-policy totals are exactly what the Wazuh dashboard renders in Configuration Assessment → Inventory:

SCA inventory view — all four chart-bundled policies loaded on the lab agent, with the exact pass/fail/invalid/score numbers from the API response above
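The score column in the response above is consistent with a simple rule: passed checks divided by passed plus failed, rounded down, with invalid checks left out. The sketch below encodes that rule. It is inferred from the four rows shown here, not taken from Wazuh's source, and the behavior when nothing was evaluated is my own assumption.

```python
def sca_score(passed, failed):
    """Percentage of evaluated checks that passed, rounded down.

    Invalid / not-applicable checks are excluded. Returning None when
    nothing was evaluated is an assumption made for this sketch.
    """
    evaluated = passed + failed
    if evaluated == 0:
        return None
    return (100 * passed) // evaluated


# (pass, fail) pairs from the API response above.
for policy, passed, failed in [
    ("runtime_threat_detection", 5, 0),
    ("nist_800_53_k8s", 2, 2),
    ("cis_linux_l1", 0, 2),
    ("cis_amazon_linux_2023", 50, 45),
]:
    print(policy, sca_score(passed, failed))
```

One practical consequence: a policy with many invalid checks can still show a score of 100, so the score should be read next to the invalid count.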

Runtime threat detection: MITRE ATT&CK mapped

Wazuh’s SCA engine isn’t only for compliance baselines — it can evaluate any rules-based policy. This chart ships a runtime threats policy file (chart-added content) that uses the same SCA engine (Wazuh-native) to look for indicators of compromise, with each check tagged with the relevant MITRE ATT&CK technique.

The detection mechanism — process inspection (p:), port checks (c:+regex), command execution rules (c:), file inspection (f:) — is all native to the Wazuh SCA YAML format. What’s chart-added is the curated set of 22 detection rules and their MITRE mappings.

Below is what the policy and a real evaluated check look like. The first block is the source YAML for check 90100 shipped by the chart; the second block is the exact JSON the Wazuh API returns for that check after a scan against the live lab agent — note that Wazuh’s SCA engine has lifted the MITRE technique into the compliance array on the finding, and reproduced every individual p: rule with its match type. That’s the native engine doing the work; the chart provided the content.

# policies/runtime/runtime_threat_detection.yml — bundled by the chart
- id: 90100
  title: "Cryptominer process detection — xmrig"
  description: "Detect running xmrig or known cryptocurrency mining processes."
  rationale: "Cryptojacking is the most common attack in Kubernetes. MITRE ATT&CK T1496."
  remediation: "Kill the mining process. Investigate the container image and deployment source. Rotate compromised credentials."
  compliance:
    - mitre_attack: ["T1496"]
  condition: none
  rules:
    - "p:xmrig"
    - "p:minerd"
    - "p:cpuminer"
    - "p:minergate"
    - "p:ethminer"
    - "p:cgminer"
    - "p:bfgminer"
    - "p:nbminer"
    - "p:t-rex"
    - "p:gminer"
    - "p:lolminer"

// GET /sca/{agent_id}/checks/runtime_threat_detection?q=id=90100
{
  "policy_id": "runtime_threat_detection",
  "id": 90100,
  "title": "Cryptominer process detection — xmrig",
  "description": "Detect running xmrig or known cryptocurrency mining processes.",
  "rationale": "Cryptojacking is the most common attack in Kubernetes. MITRE ATT&CK T1496.",
  "remediation": "Kill the mining process. Investigate the container image and deployment source. Rotate compromised credentials.",
  "process": "xmrig,minerd,cpuminer,minergate,ethminer,cgminer,bfgminer,nbminer,t-rex,gminer,lolminer",
  "condition": "none",
  "result": "passed",
  "compliance": [
    { "key": "mitre_attack", "value": "T1496" }
  ],
  "rules": [
    { "rule": "p:xmrig",    "type": "process" },
    { "rule": "p:minerd",   "type": "process" },
    { "rule": "p:cpuminer", "type": "process" },
    { "rule": "p:minergate", "type": "process" }
    /* ... 7 more rules elided ... */
  ]
}

If a miner process were present at scan time, the only field that changes is "result": "failed" — and a wazuh-rule event also flows through the alert pipeline with the same MITRE tag attached.
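The `condition` field decides how the individual rule results combine into one check result. Here is a minimal sketch of the three values used in this post (`all`, `any`, `none`). The function name and the process set are illustrative, and real process matching is done by the Wazuh agent, not by code like this.

```python
def check_result(condition, rule_matches):
    """Combine per-rule booleans into 'passed' or 'failed'."""
    if condition == "all":
        ok = all(rule_matches)
    elif condition == "any":
        ok = any(rule_matches)
    elif condition == "none":
        ok = not any(rule_matches)
    else:
        raise ValueError(f"unknown condition: {condition}")
    return "passed" if ok else "failed"


MINERS = ["xmrig", "minerd", "cpuminer", "minergate", "ethminer", "cgminer",
          "bfgminer", "nbminer", "t-rex", "gminer", "lolminer"]


def miner_check(running_processes):
    """Check 90100: passes only when none of the miner names is running."""
    matches = [name in running_processes for name in MINERS]
    return check_result("none", matches)


print(miner_check({"kubelet", "containerd"}))
print(miner_check({"kubelet", "xmrig"}))
```

With `condition: none`, a single matching process is enough to flip the check to failed, which is why detection-style checks use it while baseline checks tend to use `all` or `any`.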

The full check coverage and an operational caveat

The runtime policy contains 22 checks spanning eight groups of attacker techniques. After a scan, the agent reports one of three outcomes per check:

Outcome	What it means
`passed`	The check’s `condition` is satisfied (e.g., `condition: none` and no matching process was found)
`failed`	The check matched — an indicator of compromise is present
`not applicable`	The check is gated on a runtime feature the agent isn’t allowed to use

The 22 checks break down by detection primitive:

Primitive	Checks	Examples
`p:` (process inspection)	5	xmrig miners, `nsenter`, reverse-shell tools (`nc -e`, `socat TCP`), `nmap`/`masscan`, DNS-tunnel tools (`iodine`, `dnscat`)
`c:` (command execution + regex)	17	stratum-port detection via `ss -tnp`, host-mount checks via `mount`, cgroup escape via `find /sys/fs/cgroup`, SA-token theft via `grep -l serviceaccount`, cloud-metadata calls via `ss`, cron/systemd persistence via `find -mmin -60`, log-tampering via `find -name *.log -empty`

Operational caveat, and this matters. The 17 c: checks evaluate by running shell commands inside the agent and matching their output against a regex. Wazuh disables this by default (sca.remote_commands=0 in the agent’s internal_options.conf) because letting a remote-managed policy execute arbitrary shell on every node is an attack surface in itself. With the default setting, a live agent will return checks like this in the API:

{
  "policy_id": "runtime_threat_detection",
  "id": 90201,
  "title": "Container escape — Host mount abuse",
  "command": "mount",
  "condition": "none",
  "result": "not applicable",
  "reason": "Ignoring check for running command 'mount'. The internal option 'sca.remote_commands' is disabled",
  "compliance": [ { "key": "mitre_attack", "value": "T1611" } ],
  "rules": [
    { "rule": "c:mount -> r:containerd.sock", "type": "command" },
    { "rule": "c:mount -> r:docker.sock",     "type": "command" }
  ]
}

To enable the full 22-check coverage, the operator has to opt in by adding sca.remote_commands=1 to /var/ossec/etc/local_internal_options.conf on the agent (or via the chart’s agent.internalOptions override). The chart bundles the policy with the c: checks present but does not enable command execution for you; that’s a per-cluster security decision the operator has to make consciously. The 5 process-only checks (p:xmrig, p:nsenter, p:nc -e, p:nmap, p:iodine) work without the flag.

This is exactly the kind of native-Wazuh behavior that’s worth understanding before you deploy: Wazuh ships the safe default; the chart ships content that takes advantage of the unsafe-but-more-powerful option if you turn it on. Either way, the engine is Wazuh’s.
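Before deploying, it can help to predict which checks will come back as not applicable. The sketch below classifies rules by their prefix and flags any check that needs command execution while the option is off. The helper names are hypothetical, and the gating logic is a simplification of what the agent does.

```python
# Rule prefixes in Wazuh's SCA rule syntax.
RULE_TYPES = {
    "f:": "file",
    "d:": "directory",
    "p:": "process",
    "c:": "command",
    "r:": "registry",
}


def rule_type(rule):
    """Classify a rule by prefix, ignoring a leading 'not '."""
    body = rule.strip()
    if body.startswith("not "):
        body = body[4:].lstrip()
    return RULE_TYPES.get(body[:2], "unknown")


def expected_outcome(rules, remote_commands_enabled):
    """'not applicable' if a command rule is blocked, else 'evaluated'."""
    needs_commands = any(rule_type(r) == "command" for r in rules)
    if needs_commands and not remote_commands_enabled:
        return "not applicable"
    return "evaluated"


host_mount = ["c:mount -> r:containerd.sock", "c:mount -> r:docker.sock"]
print(expected_outcome(host_mount, remote_commands_enabled=False))
print(expected_outcome(["p:xmrig", "p:minerd"], remote_commands_enabled=False))
```

Running something like this over a policy file gives a rough count of how much coverage depends on the opt-in before any agent is touched.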

What lives in the policy

By MITRE technique:

T1496 — Resource Hijacking (cryptomining): 11 miner process names, stratum protocol ports (3333, 4444, 8333, 14444, 45700), CPU-saturation indicator
T1611 — Container escape: nsenter process, host-mount socket abuse, cgroup release_agent (CVE-2022-0492 style), privileged container detection
T1059 — Reverse shells: shell processes with socket redirects, ncat/nc with -e, socat TCP
T1552 — Credential access: SA-token reads from /proc, cloud-metadata endpoint, SSH-key scanning
T1053 / T1543 / T1554 — Persistence: recently created cron jobs, modified system binaries, new systemd service files
T1046 / T1613 — Reconnaissance: nmap/masscan/zmap processes, kubectl SA-token enumeration
T1048 — Exfiltration: DNS-tunnel tools (iodine, dnscat, dns2tcp), large outbound connections
T1070 — Defense evasion: truncated logs, cleared bash history

When any of these fire, the resulting Wazuh alert carries the MITRE technique ID natively — Wazuh’s rule engine attaches mitre.id, mitre.tactic, and mitre.technique automatically, so the alert is correlatable in Wazuh’s dashboard, in the API, and in any downstream SIEM.

The next screenshot is the Configuration Assessment view for the runtime_threat_detection policy on the live lab agent — Wazuh has loaded the chart-supplied YAML and is rendering all 21 checks with their evaluation results. Note the 5 passed / 0 failed / 16 not applicable counts (matching the API response above), and the mix of p:-rule checks (Passed — the rule evaluated and didn’t match) versus c:-rule checks (Not applicable — gated by sca.remote_commands):

Runtime Threat Detection policy view — 21 chart-supplied checks listed by ID with their MITRE-mapped titles, target rules, and per-check evaluation outcomes

Expanding the cryptominer row exposes the full check metadata Wazuh has parsed out of the chart’s YAML — the rationale, the remediation, the MITRE technique compliance tag, and every individual p: rule the engine evaluated:

Cryptominer check 90100 expanded — Wazuh shows the rationale, remediation, MITRE T1496 compliance tag, and all 11 p: rules from the chart's runtime_threat_detection.yml

What a real finding looks like end-to-end

Here are three actual failing checks captured from the lab agent, one each from the NIST 800-53, CIS Amazon Linux, and CIS Linux policies. These are not constructed examples; they’re the raw API response from GET /sca/{agent_id}/checks/{policy_id}?result=failed against the running deployment, lightly trimmed for length.

NIST 800-53 — IA-5: Password expiration not configured

{
  "policy_id": "nist_800_53_k8s",
  "id": 50401,
  "title": "IA-5: Authenticator Management — Password expiration configured",
  "description": "Ensure password maximum age is set to 365 days or less.",
  "rationale": "NIST IA-5 governs authenticator lifecycle management.",
  "remediation": "Set PASS_MAX_DAYS 365 in /etc/login.defs.",
  "file": "/etc/login.defs",
  "condition": "all",
  "result": "failed",
  "compliance": [
    { "key": "cis",         "value": "5.5.1.1" },
    { "key": "nist_800_53", "value": "IA-5" }
  ],
  "rules": [
    {
      "rule": "f:/etc/login.defs -> n:PASS_MAX_DAYS\\s+(\\d+) compare <= 365",
      "type": "file"
    }
  ]
}

Same underlying check, evaluated once, tagged with both CIS 5.5.1.1 and NIST 800-53 IA-5. That’s the cross-framework story the chart’s compliance metadata enables — an auditor asking “show me your NIST IA-5 evidence” and an auditor asking “show me your CIS L1 password policy evidence” both land on this same finding.
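The rule in that finding is a numeric comparison: capture a number with a regex, then compare it to a limit. A rough sketch of the idea follows, with Python's `re` standing in for Wazuh's own regex engine. The function name and the sample file contents are illustrative, and the real engine's matching rules may differ in detail.

```python
import operator
import re

OPS = {
    "<=": operator.le, ">=": operator.ge,
    "<": operator.lt, ">": operator.gt,
    "==": operator.eq, "!=": operator.ne,
}


def numeric_rule_holds(text, pattern, op, limit):
    """True if some line matches and its captured number satisfies op."""
    regex = re.compile(pattern)
    for line in text.splitlines():
        match = regex.search(line)
        if match and OPS[op](int(match.group(1)), limit):
            return True
    return False


# Illustrative /etc/login.defs contents, not captured from the lab node.
login_defs = "PASS_MAX_DAYS\t99999\nPASS_MIN_DAYS\t0\n"

print(numeric_rule_holds(login_defs, r"PASS_MAX_DAYS\s+(\d+)", "<=", 365))
print(numeric_rule_holds(login_defs, r"PASS_MIN_DAYS\s+(\d+)", ">=", 1))
```

Both calls return False for this sample file, which corresponds to a failed check under `condition: all`.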

CIS Amazon Linux 2023 — AIDE intrusion detection missing

{
  "policy_id": "cis_amazon_linux_2023",
  "id": 31030,
  "title": "Ensure AIDE is installed.",
  "description": "Advanced Intrusion Detection Environment (AIDE) is a intrusion detection tool that uses predefined rules to check the integrity of files and directories in the Linux operating system...",
  "remediation": "Run the following command to install AIDE: # dnf install aide ...",
  "command": "rpm -q aide",
  "condition": "all",
  "result": "failed",
  "compliance": [
    { "key": "cis",                "value": "1.3.1" },
    { "key": "cis_csc_v8",         "value": "3.14" },
    { "key": "cmmc_v2.0",          "value": "AC.L2-3.1.7" },
    { "key": "hipaa",              "value": "164.312(b),164.312(c)(1),164.312(c)(2)" },
    { "key": "iso_27001-2013",     "value": "A.12.4.3" },
    { "key": "mitre_techniques",   "value": "T1565,T1565.001" },
    { "key": "nist_sp_800-53",     "value": "AU-2" },
    { "key": "pci_dss_v3.2.1",     "value": "10.2.1,11.5" },
    { "key": "pci_dss_v4.0",       "value": "10.2.1,10.2.1.1" },
    { "key": "soc_2",              "value": "CC6.1" }
  ],
  "rules": [
    { "rule": "c:rpm -q aide -> r:aide-", "type": "command" }
  ]
}

Ten frameworks satisfied (or failed, in this case) by one check running once. This is what the “one scan, many frameworks” claim looks like in practice — and again, the engine doing the tagging is Wazuh’s native SCA module; the chart provided the YAML.

CIS Linux L1 — Password minimum days

{
  "policy_id": "cis_linux_l1",
  "id": 30511,
  "title": "Ensure minimum days between password changes is configured",
  "file": "/etc/login.defs",
  "remediation": "Set PASS_MIN_DAYS 1 in /etc/login.defs.",
  "condition": "all",
  "result": "failed",
  "compliance": [
    { "key": "cis",        "value": "5.5.1.2" },
    { "key": "cis_csc_v8", "value": "5.2"    }
  ],
  "rules": [
    {
      "rule": "f:/etc/login.defs -> n:PASS_MIN_DAYS\\s+(\\d+) compare >= 1",
      "type": "file"
    }
  ]
}

Notice that this check uses the same f: file-inspection primitive as the NIST IA-5 check above, and the same compliance: field shape as the runtime policy. Compliance scanning and runtime threat detection share the exact same Wazuh SCA engine; only the policy content differs.

Wazuh’s built-in framework dashboards pick up the compliance: tags directly. Here’s the NIST 800-53 dashboard for the lab agent, populated entirely by the chart-bundled NIST policy:

NIST 800-53 dashboard scoped to the lab agent — 508 alerts in the last 24 hours, broken down by control family (CM.1 dominates, with AC, AU, IA, SC controls also represented)

And the MITRE ATT&CK dashboard, populated by the mitre_attack tags the runtime policy emits:

MITRE ATT&CK dashboard for the lab agent — alerts grouped by tactic (Defense Evasion) and technique (Disable or Modify Tools), with the same agent.id=002 filter the API queries use

The admission webhook: shift-left enforcement

This is the part with no Wazuh involvement — it’s a standard Kubernetes ValidatingWebhookConfiguration this chart deploys alongside the agents. Detection (Wazuh’s job) is reactive; this webhook is proactive — it prevents non-compliant workloads from running in the first place.

It’s a separate HA deployment (default 2 replicas with topology spread constraints) with its own ServiceAccount, RBAC, NetworkPolicy, PDB, and cert-manager TLS certificate.

The policy engine is configured through the following settings:

{
  "blockPrivileged": true,
  "blockHostNetwork": true,
  "blockHostPID": true,
  "blockHostIPC": true,
  "requireRunAsNonRoot": true,
  "blockPrivilegeEscalation": true,
  "blockLatestTag": true,
  "requireImageDigest": false,
  "requiredLabels": ["app.kubernetes.io/name", "app.kubernetes.io/version"],
  "blockedImageRegistries": [],
  "allowedImageRegistries": []
}

The webhook is self-hardened: non-root (UID 65534), read-only root filesystem, all capabilities dropped, seccomp RuntimeDefault, NetworkPolicy restricting traffic to only the API server. Failure policy defaults to Ignore (fail-open) so a webhook outage doesn’t block deployments — switchable to Fail for strict environments.

The exemption system is critical for production. The chart’s own namespace and service account are automatically exempted, along with kube-system, kube-public, and kube-node-lease. You can’t accidentally lock yourself out.
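To show how settings like these turn into admit/deny decisions, here is a small sketch of a policy evaluator. It is not the chart's webhook code. The setting names come from the JSON above, while the function names, message wording, and the handling of untagged images are assumptions. The chart's own namespace exemption is left out for brevity.

```python
# Subset of the settings shown above; values mirror the defaults listed.
POLICY = {
    "blockPrivileged": True,
    "blockHostNetwork": True,
    "blockLatestTag": True,
    "requiredLabels": ["app.kubernetes.io/name", "app.kubernetes.io/version"],
}
EXEMPT_NAMESPACES = {"kube-system", "kube-public", "kube-node-lease"}


def uses_latest_tag(image):
    """True when the image is untagged or tagged :latest; digests are pinned."""
    if "@" in image:
        return False
    last_segment = image.rsplit("/", 1)[-1]  # skip registry host:port
    if ":" not in last_segment:
        return True  # assumption: no tag is treated like :latest
    return last_segment.rsplit(":", 1)[1] == "latest"


def violations(pod, namespace, policy=POLICY):
    """Return reasons to deny the pod; an empty list means admit."""
    if namespace in EXEMPT_NAMESPACES:
        return []
    spec = pod.get("spec", {})
    labels = pod.get("metadata", {}).get("labels", {})
    found = []
    if policy["blockHostNetwork"] and spec.get("hostNetwork"):
        found.append("hostNetwork is not allowed")
    for container in spec.get("containers", []):
        security = container.get("securityContext", {})
        if policy["blockPrivileged"] and security.get("privileged"):
            found.append(f"container {container['name']} is privileged")
        if policy["blockLatestTag"] and uses_latest_tag(container["image"]):
            found.append(f"container {container['name']} uses a :latest image")
    for label in policy["requiredLabels"]:
        if label not in labels:
            found.append(f"missing required label {label}")
    return found


pod = {
    "metadata": {"labels": {"app.kubernetes.io/name": "web"}},
    "spec": {"containers": [
        {"name": "web", "image": "nginx:latest",
         "securityContext": {"privileged": True}},
    ]},
}
for reason in violations(pod, "default"):
    print(reason)
```

Checking the exemption list first is what keeps system namespaces deployable even when every other rule is switched on.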

Auto-remediation: from detection to action

Wazuh ships active response, which lets you bind scripts to alerts and trigger them automatically. It’s powerful, but it’s optimized for event-driven, alert-by-alert responses (block IP, kill process, restart service).

This chart adds a separate, broader-scope CronJob that periodically applies host-level hardening that’s not a natural fit for active-response semantics. The two complement each other; the CronJob doesn’t replace active response.

The CronJob runs as privileged with host filesystem access and remediates:

File permissions — /etc/passwd to 644, /etc/shadow to 640, /etc/group to 644, /etc/gshadow to 640. For Kubernetes nodes, also enforces 600 permissions and root:root ownership on kube-apiserver.yaml, kube-controller-manager.yaml, kube-scheduler.yaml, etcd.yaml.

Kernel parameters — sysctl hardening, persisted to /etc/sysctl.conf:

net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.default.send_redirects=0
net.ipv4.conf.all.accept_source_route=0
net.ipv4.conf.default.accept_source_route=0
net.ipv4.conf.all.accept_redirects=0
net.ipv4.conf.default.accept_redirects=0
net.ipv4.tcp_syncookies=1
net.ipv6.conf.all.accept_ra=0
net.ipv6.conf.default.accept_ra=0

SSH hardening — PermitRootLogin no, PermitEmptyPasswords no, MaxAuthTries 4, ClientAliveInterval 300, ClientAliveCountMax 3, LoginGraceTime 60.

Kernel modules — disables cramfs, squashfs, udf via /etc/modprobe.d/cis-hardening.conf.

Auditd rules — watch rules on /etc/passwd, /etc/shadow, /etc/group, /etc/gshadow.
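Each of these remediation categories boils down to comparing a desired state with what the node reports and acting only on the difference. A minimal sketch for the sysctl case follows. The function names are hypothetical, the desired values are a subset of the list above, and the "current" values are made up for illustration.

```python
# Subset of the kernel parameters listed above.
DESIRED = {
    "net.ipv4.conf.all.send_redirects": "0",
    "net.ipv4.tcp_syncookies": "1",
    "net.ipv6.conf.all.accept_ra": "0",
}


def sysctl_path(key, root="/proc/sys"):
    """Translate a dotted sysctl key into its /proc/sys path."""
    return root + "/" + key.replace(".", "/")


def drift(desired, current):
    """Return {key: (current_value, desired_value)} for mismatched keys."""
    return {
        key: (current.get(key), want)
        for key, want in desired.items()
        if current.get(key) != want
    }


# Illustrative node state, not captured from a real host.
current = {
    "net.ipv4.conf.all.send_redirects": "1",
    "net.ipv4.tcp_syncookies": "1",
}
for key, (have, want) in drift(DESIRED, current).items():
    print(f"{sysctl_path(key)}: {have} -> {want}")
```

Computing the drift first also gives the per-node change count for free, since it is just the size of the returned dictionary.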

Dry-run mode is essential. On first deployment, it logs every change it would make without touching anything:

[DRY-RUN] Would execute: chmod 640 /host/etc/shadow
[DRY-RUN] Would execute: sysctl -w net.ipv4.conf.all.send_redirects=0
[DRY-RUN] Would execute: echo 'PermitRootLogin no' >> /host/etc/ssh/sshd_config

When you’re ready to go live, set autoRemediation.dryRun to false. With notifications enabled, the CronJob sends a Slack notification with a count of changes made per node.
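The dry-run switch is easy to picture as a thin wrapper around command execution. The sketch below mirrors the log format shown above. The function name `run_fix` and the `[APPLIED]` prefix are hypothetical, and the chart's real remediation script may be structured quite differently.

```python
import shlex
import subprocess


def run_fix(argv, dry_run=True):
    """Log the command in dry-run mode; execute it otherwise.

    Returns the log line so callers can count or forward changes.
    """
    rendered = shlex.join(argv)
    if dry_run:
        return f"[DRY-RUN] Would execute: {rendered}"
    subprocess.run(argv, check=True)  # raises if the command fails
    return f"[APPLIED] {rendered}"  # hypothetical prefix for live mode


print(run_fix(["chmod", "640", "/host/etc/shadow"]))
print(run_fix(["sysctl", "-w", "net.ipv4.conf.all.send_redirects=0"]))
```

Defaulting `dry_run` to True means a caller has to opt in to making changes, which matches the chart's stated default.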

Observability: Prometheus + Grafana

Wazuh has its own dashboard for browsing alerts and SCA results, and a REST API for programmatic access. What it doesn’t ship is a Prometheus exporter — so this chart adds a small metrics sidecar to each agent pod that scrapes the local agent’s status and queries the Wazuh manager API for per-node SCA / vulnerability / alert counts.

Six metrics exposed:

Metric	Type	Description
`wazuh_agent_up`	gauge	Is the agent process running (0/1)
`wazuh_sca_checks_passed`	gauge	Number of SCA checks currently passing
`wazuh_sca_checks_failed`	gauge	Number of SCA checks currently failing
`wazuh_fim_events_total`	counter	Total file integrity change events
`wazuh_vulnerabilities_detected`	gauge	Current vulnerability count
`wazuh_alerts_total`	counter	Total alerts generated
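For readers who have not written an exporter, this is roughly what serving those metrics involves: rendering name, type, and value in the Prometheus text exposition format. The sketch below is not the chart's sidecar. The `node` label and the sample values are assumptions made for illustration.

```python
# Metric names and types from the table above.
METRIC_TYPES = {
    "wazuh_agent_up": "gauge",
    "wazuh_sca_checks_passed": "gauge",
    "wazuh_sca_checks_failed": "gauge",
    "wazuh_fim_events_total": "counter",
    "wazuh_vulnerabilities_detected": "gauge",
    "wazuh_alerts_total": "counter",
}


def render(values, node):
    """Render known metrics in Prometheus text exposition format."""
    lines = []
    for name, metric_type in METRIC_TYPES.items():
        if name not in values:
            continue
        lines.append(f"# TYPE {name} {metric_type}")
        # The node label is a hypothetical choice for this sketch.
        lines.append(f'{name}{{node="{node}"}} {values[name]}')
    return "\n".join(lines) + "\n"


print(render({"wazuh_agent_up": 1, "wazuh_alerts_total": 42}, "node-a"))
```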

The chart ships a PrometheusRule with six alerts:

WazuhAgentDown — agent offline for 5+ minutes (critical)
WazuhHighSCAFailureRate — >30% of checks failing (warning)
WazuhCriticalSCAFailures — >50% of checks failing (critical)
WazuhVulnerabilitiesDetected — >50 vulnerabilities on a node (warning)
WazuhFIMSpikeDetected — unusual rate of file changes (warning)
WazuhAlertStorm — >50 alerts/sec indicating an active incident (critical)
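The two SCA alerts differ only in threshold, so the logic can be sketched in a few lines. The actual PromQL expressions are not shown in this post, so the definition of failure rate used here (failed divided by passed plus failed) is an assumption, and the function returns only the most severe matching alert, whereas Prometheus would evaluate each rule independently.

```python
def sca_alert(passed, failed):
    """Return the most severe SCA alert name for these counts, or None."""
    total = passed + failed
    if total == 0:
        return None
    rate = failed / total
    if rate > 0.5:
        return "WazuhCriticalSCAFailures"
    if rate > 0.3:
        return "WazuhHighSCAFailureRate"
    return None


# Counts taken from the lab agent's API response earlier in the post.
print(sca_alert(50, 45))
print(sca_alert(0, 2))
print(sca_alert(5, 0))
```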

The Grafana dashboard is auto-discovered via sidecar label and shows: agent status, compliance score gauge, SCA pass/fail per node, vulnerability trends, FIM event rate, alert rate with threshold highlighting.

For an at-a-glance correlation across alert sources, the Threat Hunting view aggregates everything the chart produces — SCA findings, ossec rules, rootcheck — and breaks them down by PCI-DSS requirement, rule group, and severity. The screenshot below is from the same lab agent the API snippets above were captured against (511 alerts in the last 24 h, top-5 rule group dominated by sca):

Threat Hunting overview scoped to the lab agent — 511 total alerts, top alert groups dominated by SCA + ossec + rootcheck, top PCI-DSS requirements 2.2 / 10.6.1 / 10.2.6 visualized

A Grafana dashboard sourced from this chart’s Prometheus exporter would render the same data in tile form — the screenshot above is the native Wazuh view; the chart’s Grafana panels (not pictured here) are a Prometheus-fed re-render of the same metrics.

Self-hardening: the chart secures itself

A security chart that isn’t itself hardened is a joke. The K8s-native hardening below is chart-added — Wazuh ships an agent, not Kubernetes deployment manifests:

NetworkPolicy — Agent pods can only reach the Wazuh manager, DNS, and the Kubernetes API. Webhook pods only accept traffic from the API server.
PodDisruptionBudget — Maintains 50% agent availability during rolling updates and node drains.
Seccomp — RuntimeDefault profile on all pods.
Secret management — Registration passwords stored in Kubernetes Secrets with helm.sh/resource-policy: keep. Supports external secret references.
Config checksums — DaemonSet pods auto-restart when ConfigMaps change. No manual rollout needed.
cert-manager integration — Webhook TLS and optional agent↔manager mTLS via cert-manager Certificates with ECDSA P-256 keys.
Manager HA — supports multiple manager endpoints with automatic failover (the failover list is configured here; the failover mechanism itself is in the Wazuh agent).
Values schema validation — JSON Schema catches misconfiguration before helm install runs.
Priority class — Agents run as system-node-critical so they’re the last thing evicted under resource pressure.
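The config-checksum item in the list above relies on a common Helm pattern: hash the rendered config and put the digest in a pod-template annotation, so any config change alters the pod template and triggers a rollout. A sketch of the hashing step follows. The annotation key `checksum/config` is the conventional name and is assumed here, not confirmed from the chart.

```python
import hashlib


def config_checksum(rendered_config):
    """SHA-256 hex digest of the rendered ConfigMap content."""
    return hashlib.sha256(rendered_config.encode("utf-8")).hexdigest()


def pod_annotations(rendered_config):
    """Annotations to place on the DaemonSet pod template (assumed key)."""
    return {"checksum/config": config_checksum(rendered_config)}


before = pod_annotations("sca:\n  interval: 12h\n")
after = pod_annotations("sca:\n  interval: 6h\n")
print(before["checksum/config"] != after["checksum/config"])
```

Because the digest is deterministic, re-running an upgrade with unchanged values leaves the annotation alone and no pods restart.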

Deploying it

Minimal deployment with CIS + NIST enabled by default:

helm install wazuh-hardening ./wazuh-k8s-hardening \
  --namespace wazuh-system --create-namespace \
  --set manager.host=wazuh-manager.wazuh.svc.cluster.local \
  --set manager.registrationPassword=YOUR_PASSWORD

Once it’s up, the agent self-enrolls with the manager and appears in the Endpoints view — for the single-node lab this looks like:

Wazuh Endpoints page — the chart's DaemonSet has self-enrolled the agent (k8s-lab-templar, ID 002) into the kubernetes group, status Active, version 4.14.5, OS detected as Amazon Linux 2023

Full enterprise deployment:

global:
  clusterName: "prod-us-east-1"
  environment: "production"
  organization: "Your Org"

manager:
  host: "wazuh-manager.wazuh.svc.cluster.local"
  existingSecret: "wazuh-auth"
  failover:
    enabled: true
    hosts:
      - host: "wazuh-manager-2.wazuh.svc.cluster.local"

compliance:
  cisKubernetes:
    profile: "L2"
  cisLinux:
    profile: "L2"
  nist80053:
    enabled: true
  pciDss:
    enabled: true
  hipaa:
    enabled: true
  soc2:
    enabled: true

admissionWebhook:
  enabled: true
  failurePolicy: "Fail"

autoRemediation:
  enabled: true
  dryRun: false
  notifications:
    enabled: true
    slackWebhookUrl: "https://hooks.slack.com/services/..."

Why Wazuh

I chose Wazuh as the engine because it’s the only open-source platform I’m aware of that runs SCA, FIM, vulnerability detection, log collection, rootcheck, and active response from a single agent binary, with a manager that aggregates findings across every node into one queryable surface. Falco does runtime detection well but doesn’t do compliance scanning. kube-bench does CIS but nothing else. OPA does admission but doesn’t touch host-level hardening.

Wazuh’s SCA engine accepts custom YAML policy files with the compliance: field built in — which is the native capability that makes multi-framework cross-reference possible without writing a custom mapping engine. The MITRE ATT&CK fields on rules and alerts are also native — this chart just feeds the SCA engine policy content that uses them.

This chart extends Wazuh’s reach into the areas it doesn’t ship by default: admission-time enforcement, automated infrastructure remediation, Prometheus-native observability, scheduled multi-framework reporting, and K8s-native self-hardening. It’s a packaging story around Wazuh, not a replacement for Wazuh — the detection, scanning, and alerting engines are all Wazuh-native, and the chart’s value is bundling production-grade policy content, K8s integration, and tooling that complements them.

Part of the Wazuh Ambassadors Program

I built and published this as a member of the Wazuh Ambassadors Program — the community initiative that supports people building and sharing open-source security work on top of Wazuh. If this write-up or the chart was useful and you want to explore Wazuh yourself, you can reference my ambassador code AM-5597-622.