When pid 1 crashes, the getty unit for the console will happily keep
running which means we end up with two shells competing for the same
tty. Let's call vhangup on /dev/console to kill every other process
attached to the console before we spawn the crash shell. The getty
units have Restart=always but lucky for us, pid 1 just crashed in fire
and flames so it isn't actually able to restart the getty unit.
gcc15 has -Wunterminated-string-initialization in -Wextra and
warns about string constants that are not null terminated even though
the functions do do out of bounds access.
Silence the warnings by simply not providing an explicit size.
The s390x platform provides confidential VMs using the "Secure Execution"
technology, which is also referred to as "Protected Virtualization" or
just "prot virt" in Linux / QEMU.
This can be detected through a simple sysfs attribute.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We have different impls of detect_confidential_virtualization per
architecture. The detection is cached in the x86_64 impl, and as we
add support for more targets, we want to use caching for all. It thus
makes sense to split caching out into an architecture independent
method.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
cg_kill_kernel_sigkill() has a narrow use case, and currently
no code really reaches that branch. Let's detach it from
cg_kill_recursive() hence, and call it explicitly later
where appropriate.
When removing a cgroup, we always want to eliminate subcgroups
first, i.e. use cg_trim(). And cg_rmdir() (along with
CGROUP_REMOVE flag) is simply unused. Kill it.
All messages logged from exec_spawn() are attributed to the unit
and as such we should set the log level to the unit's max log level
for the duration of the function.
The original CVM detection logic for TDX assumes that the guest can see
the standard TDX CPUID leaf. This was true in Azure when this code was
originally written, however, current Azure now blocks that leaf in the
paravisor. Instead it is required to use the same Azure specific CPUID
leaf that is used for SEV-SNP detection, which reports the VM isolation
type.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Similar to the implementation of cgroup.kill in the kernel, let's
skip kernel threads in cg_kill_items() as trying to kill kernel
threads as an unprivileged process will fail with EPERM and doesn't
do anything when running privileged.
Currently inhibitors are bypassed unless an explicit request is made to
check for them, or even in that case when the requestor is root or the
same uid as the holder of the lock.
But in many cases this makes it impractical to rely on inhibitor locks.
For example, in Debian there are several convoluted and archaic
workarounds that divert systemctl/reboot to some hacky custom scripts
to try and enforce blocking accidental reboots, when it's not expected
that the requestor will remember to specify the command line option
to enable checking for active inhibitor locks.
Also in many cases one wants to ensure that locks taken by a user are
respected by actions initiated by that same user.
Change logind so that inhibitors checks are not skipped in these
cases, and systemctl so that locks are checked in order to show a
friendly error message rather than "permission denied".
Add new block-weak and delay-weak modes that keep the previous
behaviour unchanged.
Currently, IS_SYNTHETIC_ERRNO() evaluates to true for all negative errnos,
because of the two's-complement negative value representation.
Subsequently, ERRNO= is not logged for most of our own code.
Let's fix this, by formatting all synthetic errnos as positive.
Then, treat all negative values as non-synthetic.
While at it, mark the evaluation order explicitly, and remove
unneeded comment.
Fixes#33800
openat() will always resolve symlinks, except if O_NOFOLLOW is passed
or O_CREAT|O_EXCL is passed. This means that if a dangling symlink is
passed to openat_report_new(), the first call to openat() will always
fail with ENOENT and the second call to openat() will always fail with
EEXIST.
Let's catch this case explicitly and fallback to creating the file with
just O_CREAT and assume we're the ones that created the file. We can't
resolve the symlink with chase() because this function is itself called
by chase() so we could end up in weird recursive calls if we'd try to do
so.
We do this already in all string lookup tables. This way
it's guaranteed that iterators which ends with _NAMESPACE_TYPE_MAX
wouldn't overrun the array.
Various tty types come up with cols/rows not initialized (i.e. set to
zero). Let's detect these cases, and return a better error than EIO,
simply to make things easier to debug.
/dev/console is sometimes a symlink in container managers. Let's handle
that correctly, and resolve the symlink, and not consider the data from
/sys/ in that case.
It doesn't really make sense to have that in dev-setup.c, which is
mostly about setting up /dev/, creating device nodes and stuff.
let's move it to the other stuff that deals with /dev/console's
peculiarities.
Numerous fixes:
1. use vtnr_from_tty() to parse out VT number from tty path
2. open tty for write only when we want to output just ansi sequences
3. open tty in asynchronous mode, and apply a timeout, just to be safe
4. propagate error from writing (most callers ignore it anyway, might as
well pass it along correctly)
This is a lot of stuff, and sometimes quite wild, let's turn this into
its own header.
All stuff color-related that just generates sequences is now in
ansi-color.h (no .c file!), and everything more complex that
probes/ineracts with terminals remains in termina-util.[ch]