I find myself trying to log into a fresh ParticleOS VM started via
systemd-vmspawn all the time, but I don't know its CID. Let's show it on
the getty screen, to make it immediately visible.
We nowadays expose pidfdid at various places, e.g. envvars
and dbus properties. Also the sd_notify() MAINPID= message
has been complemented with MAINPIDFDID=. But acquiring
pidfdid is actually non-trivial especially considering
the 32-bit case, hence let's introduce a public helper
in sd-daemon specifically for that purpose.
This new tool looks for a three xattr on the root inode of a file system
that encode mount constraints of the file system. The tool is supposed
to be hooke into the mount logic and is supposed to protect against
misappropriating trusted file systems in unintended ways.
Consider the following scenario: we boot up on first boot and create a
tpm-locked pair of /var/ and /srv/ partitions via systemd-repart. An
attacker then offline modifies the partition table, exchanging the
metadata of the /var/ and /srv/ partition. So far we'd happily accept
that, honour the modified metadata and boot up. This could be used to
revert changes to /var/ or similar. And all that even though both
partitions are encrypted and locked to TPM!
With this new mechanism we can encode in the protected contents of the
file systems the ways it can be used: the partition type uuid, the
partition label and the intended mount point can be stored in xattrs,
and we can check them automatically on mount, and take action on
mismatch. (action would typically be immediate reboot).
This introduces a bunch of facilities:
1. The factory-reset.target unit that requests a factory reset is now
complemented by factory-reset-now.target that executes it at next
boot.
2. This latter is added to the initial transaction via the new trivial
systemd-factory-reset-generator.
3. A tool systemd-factory-reset has been added to query, request,
cancel, complete factory reset operations (via EFI variables). Two of
these are wrapped into units that are plugged into
factory-reset.target and factory-reset-now.target respectively. The
tool also provides a simple Varlink API.
This should make things a lot cleaner, and both be useful as explicit
implementation on UEFI, and as template + hookpoints for alternative
implementations on non-UEFI.
This reverts commit b6a2df6307.
The functionality is entirely redundant, we already have
sd_json_variant_filter() which does the same, and is in fact even more
powerful, since it takes a list instead of a single field to remove.
This is mostly just a friendly unit wrapper around "systemd-dissect
--attach".
This is useful so that we can automatically attach disk images as
block device at boot.
In the initrd we want to run as early as possible, before
any of the filesystems are set up, so that users can use
sysexts to customize kernel modules, firmware, etc. But
in the root fs it needs to run after /var/ has been set
up. Split the unit, and have an initrd-specific one that
runs very early.
In the initrd we want to run as early as possible, before
any of the filesystems are set up, so that users can use
confexts to customize fstab/veritytab/crypttab/etc. But
in the root fs it needs to run after /var/ has been set
up. Split the unit, and have an initrd-specific one that
runs very early.
Let's gather generic key/certificate operations in a new tool
systemd-keyutil instead of spreading them across various special
purpose tools.
Fixes#35087
We have this in a similar fashion for the other APIs libsystemd
provides. Add the same for sd-varlink. There isn't too much on it for
now, but at least it's a start.
Also link it up everywhere.
This commit introduces a build-time option to enable/disable sysupdated
separately from sysupdate. 'auto' translated to enabled by default in
developer builds.
Optional features allow distros to define sets of transfers that can
be enabled or disabled by the system administrator. This is useful for
situations where a distro may want to ship some resources version-locked
to the core OS, but many people have no need for the resource, such as:
development tools/compilers, drivers for specialized hardware, language
packs, etc
We also rename sysupdate.d/*.conf -> sysupdate.d/*.transfer, because
now there are more than one type of definition in sysupdate.d/. For
backwards compat, we still load *.conf files as long as no *.transfer
files are found and the *.conf files don't try to declare themselves
as part of any features
Fixes https://github.com/systemd/systemd/issues/33343
Fixes https://github.com/systemd/systemd/issues/33344
To create the sd_device object of a driver, the function
sd_device_new_from_subsystem_sysname() requires "drivers" for subsystem
and e.g. "pci:iwlwifi" for sysname. Similarly, sd_device_new_from_device_id()
also requires driver subsystem. However, we have never provided a
way to get the driver subsystem ("pci" for the previous example) from
an existing sd_device object.
Let's introduce a way to get driver subsystem.
This adds a small, socket-activated Varlink daemon that can delegate UID
ranges for user namespaces to clients asking for it.
The primary call is AllocateUserRange() where the user passes in an
uninitialized userns fd, which is then set up.
There are other calls that allow assigning a mount fd to a userns
allocated that way, to set up permissions for a cgroup subtree, and to
allocate a veth for such a user namespace.
Since the UID assignments are supposed to be transitive, i.e. not
permanent, care is taken to ensure that users cannot create inodes owned
by these UIDs, so that persistancy cannot be acquired. This is
implemented via a BPF-LSM module that ensures that any member of a
userns allocated that way cannot create files unless the mount it
operates on is owned by the userns itself, or is explicitly
allowelisted.
BPF LSM program with contributions from Alexei Starovoitov.
stale HibernateLocation EFI variable
Currently, if the HibernateLocation EFI variable exists,
but we failed to resume from it, the boot carries on
without clearing the stale variable. Therefore, the subsequent
boots would still be waiting for the device timeout,
unless the variable is purged manually.
There's no point to keep trying to resume after a successful
switch-root, because the hibernation image state
would have been invalidated by then. OTOH, we don't
want to clear the variable prematurely either,
i.e. in initrd, since if the resume device is the same
as root one, the boot won't succeed and the user might
be able to try resuming again. So, let's introduce a
unit that only runs after switch-root and clears the var.
Fixes#32021