"nsenter -a" doesn't migrate the specified process into the target
cgroup (it really should). Thus the cgroup will remain in a cgroup
that is (due to cgroup ns) outside our visibility. The kernel will
report the cgroup path of such cgroups as starting with "/../". Detect
that and print a reasonably error message instead of trying to resolve
that.
cg_kill_kernel_sigkill() has a narrow use case, and currently
no code really reaches that branch. Let's detach it from
cg_kill_recursive() hence, and call it explicitly later
where appropriate.
When removing a cgroup, we always want to eliminate subcgroups
first, i.e. use cg_trim(). And cg_rmdir() (along with
CGROUP_REMOVE flag) is simply unused. Kill it.
Similar to the implementation of cgroup.kill in the kernel, let's
skip kernel threads in cg_kill_items() as trying to kill kernel
threads as an unprivileged process will fail with EPERM and doesn't
do anything when running privileged.
Opening pidfds for non thread group leaders only works from 6.9 onwards with PIDFD_THREAD. On
older kernels or without PIDFD_THREAD pidfd_open() fails with EINVAL. Since we might read non
thread group leader IDs from cgroup.threads, we introduce and set CGROUP_NO_PIDFD to avoid
trying open pidfd's for them and instead use the pid as is.
This is just like cg_is_delegate() but operates on an fd instead of a
cgroup path.
Sooner or later we should access cgroupfs mostly via fds rather than
paths, but we aren't there yet. But let's at least get started.
This setting indicates that the given unit wants to receive coredumps
for processes that crash within the cgroup of this unit. This setting
requires that Delegate= is also true, and therefore is only available
where Delegate= is available.
This will be used by systemd-coredump to support forwarding coredumps to
containers.
systemd's own cgroup hierarchy is special to us, we use it to actually
manage processes. Because of that many calls tha apply to cgroups are
only ever called with the SYSTEMD_CGROUP_CONTROLLER as controller
argument. Let's hence remove the argument altogether.
This in particular touches the kill and xattr routines.
This changes no behaviour, we just drop an argument that is always set
to the same value anyway.
This is preparation to eventually getting rid of the cgroupvs1, because
on cgroupvs2 the cgroup paths do not change for different controllers,
there's only a single hierarchy there.
path_simplify_full()/path_simplify() are changed to allow a NULL path, for
which a NULL is returned. Generally, callers have already asserted before that
the argument is nonnull. This way path_simplify_full()/path_simplify() and
path_simplify_alloc() behave consistently.
In sd-device.c, logging in device_set_syspath() is intentionally dropped: other
branches don't log.
In mount-tool.c, logging in parse_argv() is changed to log the user-specified
value, not the simplified string. In an error message, we should show the
actual argument we got, not some transformed version.
Let's clean up validation/escaping of cgroup names. i.e. split out code
that tests if name needs escaping. Return proper error codes, and extend
test a bit.
IN C23, thread_local is a reserved keyword and we shall therefore
do nothing to redefine it. glibc has it defined for older standard
version with the right conditions.
v2 by Yu Watanabe:
Move the definition to missing_threads.h like the way we define e.g.
missing syscalls or missing definitions, and include it by the users.
Co-authored-by: Yu Watanabe <watanabe.yu+github@gmail.com>
From a given cgroup path, cg_path_get_unit() allows to retrieve the
unit's name. Although, this removes the path to the unit's cgroup,
preventing the result to be used to fetch xattrs.
Introduce cg_path_get_unit_path() which provides the path to the unit's
cgroup. This function behave similarly to cg_path_get_unit() (checking
the validity and escaping the unit's name).
../src/basic/cgroup-util.c: In function ‘skip_session’:
../src/basic/cgroup-util.c:1241:32: error: incompatible types when returning type ‘_Bool’ but ‘const char *’ was expected
1241 | return false;
The name "def.h" originates from before the rule of "no needless abbreviations"
was established. Let's rename the file to clarify that it contains a collection
of various semi-related constants.
After sending a SIGKILL to a process, the process might disappear from
`cgroup.threads` but still show up in `cgroup.procs` and still remains in the
cgroup and cause migrating new processes to `Delegate=yes` cgroups to fail with
`-EBUSY`. This is especially likely for heavyweight processes that consume more
kernel CPU time to clean up.
Fix this by only returning 0 when both `cgroup.threads` and
`cgroup.procs` are empty.