Until now, using any form of seccomp while being unprivileged (User=)
resulted in systemd enabling no_new_privs.
There's no need for doing this because:
* We trust the filters we apply
* If User= is set and a process wants to apply a new seccomp filter, it
will need to set no_new_privs itself
An example of application that might want seccomp + !no_new_privs is a
program that wants to run as an unprivileged user but uses file
capabilities to start a web server on a privileged port while
benefitting from a restrictive seccomp profile.
We now keep the privileges needed to do seccomp before calling
enforce_user() and drop them after the seccomp filters are applied.
If the syscall filter doesn't allow the needed syscalls to drop the
privileges, we keep the previous behavior by enabling no_new_privs.
Sometimes it makes sense to hard kill a client if we die. Let's hence
add a third FORK_DEATHSIG flag for this purpose: FORK_DEATHSIG_SIGKILL.
To make things less confusing this also renames FORK_DEATHSIG to
FORK_DEATHSIG_SIGTERM to make clear it sends SIGTERM. We already had
FORK_DEATHSIG_SIGINT, hence this makes things nicely symmetric.
A bunch of users are switched over for FORK_DEATHSIG_SIGKILL where we
know it's safe to abort things abruptly. This should make some kernel
cases more robust, since we cannot get confused by signal masks or such.
While we are at it, also fix a bunch of bugs where we didn't take
FORK_DEATHSIG_SIGINT into account in safe_fork()
When a late error occurs in sd-executor, the cleanup-on-close of the
context structs happen, but at that time all FDs might have already
been closed via close_all_fds(), so a double-close happens. This
can be seen when DynamicUser is enabled, with a non-existing
WorkingDirectory.
Invalidate the FDs in the context structs if close_all_fds succeeds.
We use it for more than just pipe() arrays. For example also for
socketpair(). Hence let's give it a generic name.
Also add EBADF_TRIPLET to mirror this for things like
stdin/stdout/stderr arrays, which we use a bunch of times.
This is preparation for #28891, which adds a bunch more helpers around
"struct iovec", at which point this really deserves its own .c/.h file.
The idea is that we sooner or later can consider "struct iovec" as an
entirely generic mechanism to reference some binary blob, and is the
go-to type for this purpose whenever we need one.
Before the split, it made sense to assert, as checks were on setup.
But now these come from deserialization, and the fuzzer hits the
asserts, so simply return an error instead.
No functional changes, only moving code that is only needed in
exec_invoke, and adding new dependencies for seccomp/selinux/apparmor/pam
in meson for the sd-executor binary.