on 02 Apr 2024 08:00 AM
This problem often arises when a pod attempts to terminate but has open mounts (volumes, secrets, etc.) associated with it. Given that the OneAgent container mounts the entire host's root filesystem, it can prevent Kubernetes from unmounting these resources, leading to the pod being stuck.
Currently we exclude paths matching the following regular expressions:
/var/lib/kubelet/plugins/kubernetes\.io/csi/pv/pvc-.*/globalmount
/var/lib/kubelet/pods/.*/volumes/kubernetes\.io~(downward-api|empty-dir|csi|secret)
/var/lib/kubelet/pods/.*/volume-subpaths/.*
/run/netns/.*
/run/containerd/io\.containerd\.grpc\.v1\.cri/sandboxes/.*
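For anyone who wants to check which paths these defaults cover, here is a minimal Python sketch. It joins the five expressions with `|`, the same way they appear combined in the log output further down. The use of a full-string match and the sample paths are assumptions for illustration; how OneAgent actually anchors the pattern is not documented here.

```python
import re

# The five default patterns quoted above, copied verbatim.
DEFAULT_PATTERNS = [
    r"/var/lib/kubelet/plugins/kubernetes\.io/csi/pv/pvc-.*/globalmount",
    r"/var/lib/kubelet/pods/.*/volumes/kubernetes\.io~(downward-api|empty-dir|csi|secret)",
    r"/var/lib/kubelet/pods/.*/volume-subpaths/.*",
    r"/run/netns/.*",
    r"/run/containerd/io\.containerd\.grpc\.v1\.cri/sandboxes/.*",
]

# Joined with "|" as in the "Found paths matching pattern" log line.
COMBINED = re.compile("|".join(DEFAULT_PATTERNS))


def is_covered(path):
    """Return True if the whole path matches one of the default patterns.

    Assumption: full-string matching. The real agent may match differently.
    """
    return COMBINED.fullmatch(path) is not None


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from a real node.
    samples = [
        "/run/netns/cni-1234",
        "/var/lib/kubelet/pods/abc/volume-subpaths/data/app/0",
        "/var/lib/kubelet/pods/abc/volumes/kubernetes.io~configmap",
    ]
    for sample in samples:
        print(sample, is_covered(sample))
```

The same helper can be used to try out a candidate additional pattern before deploying it, by appending it to `DEFAULT_PATTERNS`.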
If the default exclusions are insufficient for your setup, you can specify additional patterns for exclusion.
To do so, set the ONEAGENT_ADDITIONAL_UNMOUNT_PATTERN environment variable in the OneAgent container, or provide the pattern in the file /mnt/additional_unmount_pattern. This file should contain a single line of text with the regular expression for the paths you want to exclude.
Hello, we have an issue where pods are unable to start within the timeout specified for the startup probe.
Many log lines state that paths matching the .*/volume-subpaths/.* regex are being unmounted on OneAgent startup.
We have a support ticket open, but we are struggling to proceed.
15:48:41 Unmounting directories
15:48:42 Found paths matching pattern /var/lib/kubelet/plugins/kubernetes\.io/csi/pv/pvc-.*/globalmount|/var/lib/kubelet/pods/.*/volumes/kubernetes\.io~(downward-api|empty-dir|csi|secret)|/var/lib/kubelet/pods/.*/volume-subpaths/.*|/run/netns/.*|/run/containerd/io\.containerd\.grpc\.v1\.cri/sandboxes/.*:
Then we see:
15:50:16 15:50:15 Warning: /mnt/volume_storage_mount/host_root/home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/uuid/volume-subpaths/oneagent-share/discovery/10' failed, return code: 1, message: Error: command "umount /mnt/volume_storage_mount/host_root/home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/uuid/volume-subpaths/oneagent-share/discovery/10" does not match any entry on the allowlist or in additional unmount pattern
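One possible explanation, which is only a guess and not confirmed by the logs, is that the path is found by an unanchored search but rejected by a check that anchors the pattern at the host root. The sketch below compares the two behaviours for the path in the warning. `HOST_ROOT` is inferred from the logged path, and both helper functions are hypothetical, not OneAgent's actual logic.

```python
import re

# The volume-subpaths pattern from the default list.
SUBPATH_PATTERN = re.compile(r"/var/lib/kubelet/pods/.*/volume-subpaths/.*")

# Assumption: the host filesystem is mounted here, inferred from the log line.
HOST_ROOT = "/mnt/volume_storage_mount/host_root"

# The path from the warning above (with the "uuid" placeholder kept as logged).
FAILING_PATH = (
    "/mnt/volume_storage_mount/host_root/home/kubernetes/containerized_mounter"
    "/rootfs/var/lib/kubelet/pods/uuid/volume-subpaths/oneagent-share/discovery/10"
)


def matches_anywhere(path):
    """Unanchored search: the pattern may match any substring of the path."""
    return SUBPATH_PATTERN.search(path) is not None


def matches_from_host_root(path):
    """Anchored check: strip HOST_ROOT, then require a full-string match."""
    if not path.startswith(HOST_ROOT):
        return False
    relative = path[len(HOST_ROOT):]
    return SUBPATH_PATTERN.fullmatch(relative) is not None


if __name__ == "__main__":
    print("unanchored search:", matches_anywhere(FAILING_PATH))
    print("anchored at host root:", matches_from_host_root(FAILING_PATH))
```

If the anchored variant is closer to what the agent does, the extra /home/kubernetes/containerized_mounter/rootfs prefix would explain the mismatch, and an additional unmount pattern that allows for that prefix might be worth testing.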
Has anyone else seen this problem?