Kmsg channel closed #1174

@kerus1024

After deploying or rolling out the node-problem-detector deployment, I encounter a "Kmsg channel closed" error on some nodes.
As a result, kernel monitor based metrics are not collected on those nodes.
The affected nodes are not under any significant load, and there are essentially no kernel log messages being generated.

Environment

  • Ubuntu Jammy (kernel 5.15.0-x)
  • Ubuntu Noble (kernel 6.8.0-x)

Log

I1030 06:41:19.605611       1 log_watchers.go:40] Use log watcher of plugin "kmsg"
<REDACTED>
I1030 06:41:19.605732       1 log_watchers.go:40] Use log watcher of plugin "filelog"
I1030 06:41:19.606512       1 k8s_exporter.go:54] Waiting for kube-apiserver to be ready (timeout 5m0s)...
I1030 06:41:19.614361       1 node_problem_detector.go:63] K8s exporter started.
I1030 06:41:19.614493       1 node_problem_detector.go:67] Prometheus exporter started.
I1030 06:41:19.614504       1 log_monitor.go:111] Start log monitor /custom-config/additional-filelog.json
I1030 06:41:19.614541       1 log_watcher.go:80] Start watching filelog
I1030 06:41:19.614549       1 log_monitor.go:111] Start log monitor /config/kernel-monitor.json
I1030 06:41:19.614613       1 log_monitor.go:236] Initialize condition generated: []
I1030 06:41:19.615573       1 log_monitor.go:111] Start log monitor /config/docker-monitor.json
I1030 06:41:19.615599       1 log_monitor.go:236] Initialize condition generated: [{Type:KernelDeadlock Status:False Transition:2025-10-30 06:41:19.615589877 +0000 UTC m=+0.055785295 Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status:False Transition:2025-10-30 06:41:19.61558997 +0000 UTC m=+0.055785381 Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}]
E1030 06:41:19.615656       1 log_watcher_linux.go:105] Kmsg channel closed
E1030 06:41:19.615696       1 log_monitor.go:137] Log channel closed: /config/kernel-monitor.json
I1030 06:41:19.619274       1 log_watcher.go:80] Start watching journald
I1030 06:41:19.619292       1 log_monitor.go:111] Start log monitor /config/systemd-monitor.json
I1030 06:41:19.619325       1 log_monitor.go:236] Initialize condition generated: [{Type:CorruptDockerOverlay2 Status:False Transition:2025-10-30 06:41:19.619318108 +0000 UTC m=+0.059513518 Reason:NoCorruptDockerOverlay2 Message:docker overlay2 is functioning properly}]
I1030 06:41:19.621986       1 log_watcher.go:80] Start watching journald
I1030 06:41:19.622011       1 log_monitor.go:111] Start log monitor /custom-config/additional.json
I1030 06:41:19.622120       1 log_monitor.go:236] Initialize condition generated: []
I1030 06:41:19.623024       1 problem_detector.go:76] Problem detector started
I1030 06:41:19.623053       1 log_monitor.go:236] Initialize condition generated: []
E1030 06:41:19.623115       1 log_watcher_linux.go:105] Kmsg channel closed
E1030 06:41:19.623138       1 log_monitor.go:137] Log channel closed: /custom-config/additional.json
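Because the error only appears on some nodes, it can help to scan each pod's log for the "Log channel closed" line and note which monitor config lost its channel. The following is a minimal sketch, assuming the klog line layout seen in the log above; the function name `closed_monitors` and the regular expression are illustrative and not part of node-problem-detector.

```python
import re

# Matches error lines like:
#   E1030 06:41:19.615696       1 log_monitor.go:137] Log channel closed: /config/kernel-monitor.json
# Assumption: the layout is taken from the log in this issue and may differ across versions.
CLOSED_RE = re.compile(
    r"^E\d{4} (\S+)\s+\d+ log_monitor\.go:\d+\] Log channel closed: (\S+)$"
)


def closed_monitors(lines):
    """Return (timestamp, config_path) for every 'Log channel closed' line."""
    hits = []
    for line in lines:
        match = CLOSED_RE.match(line.strip())
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits
```

The function takes any iterable of log lines, so it can be fed the saved output of `kubectl logs` for each node-problem-detector pod to see which nodes are affected.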


Reproduce

On the affected nodes, cat /dev/kmsg prints a single record and then exits immediately with a broken pipe error:

root@hostname:~# cat /dev/kmsg
6,2047,25658836,-;microcode: CPU 171: patch_level=0x0a0011d5
cat: /dev/kmsg: Broken pipe
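For context, the kernel's /dev/kmsg ABI documentation states that read() returns EPIPE when records were overwritten in the ring buffer while the device was open, and that subsequent reads return available records again. cat treats any read error as fatal, which is consistent with the output above, although this does not explain why the overwrite happens on an idle node. The sketch below shows a reader that counts EPIPE and keeps going, as a way to check whether records are still readable afterwards. The names `read_records` and `dump_kmsg` are hypothetical, and this is not how node-problem-detector reads kmsg.

```python
import errno
import os


def read_records(read_fn, max_records=100, max_epipe=10):
    """Collect up to max_records kmsg records, tolerating EPIPE.

    read_fn is any zero-argument callable that returns one record as bytes,
    for example lambda: os.read(fd, 8192) on a non-blocking descriptor.
    Returns (records, epipe_count).
    """
    records = []
    epipe_count = 0
    while len(records) < max_records:
        try:
            data = read_fn()
        except BlockingIOError:
            break  # EAGAIN: nothing more buffered right now
        except OSError as exc:
            # EPIPE means records were overwritten; the next read resumes
            # at the next available record, so retry a bounded number of times.
            if exc.errno == errno.EPIPE and epipe_count < max_epipe:
                epipe_count += 1
                continue
            raise
        if not data:
            break
        records.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return records, epipe_count


def dump_kmsg(path="/dev/kmsg"):
    """Read whatever is currently buffered in /dev/kmsg (usually needs root)."""
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        return read_records(lambda: os.read(fd, 8192))
    finally:
        os.close(fd)
```

Calling `dump_kmsg()` as root on an affected node returns the records that could be read together with the number of EPIPE errors that were skipped. A non-zero count with further records following would suggest the device is still usable after the error.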

What I've tried

  • Running dmesg -C to clear the kernel ring buffer has no effect.
  • journalctl -kf works normally.
  • Running echo -n "kerustest" > /dev/kmsg sometimes resolves the broken pipe problem.
