This enables running something like
"mkosi box -- run0 --empower --same-root-dir -E PATH" to get an
empowered session as the current user within the "mkosi box" environment.
Currently the only supported integrity algorithm using HMAC is
`hmac-sha256`. Add `hmac-sha512` to the list of supported algorithms as
well.
Also add the `PHMAC` integrity algorithm to the list of supported
algorithms. The `PHMAC` algorithm is like the regular HMAC algorithm,
but it takes a wrapped key as input. A key for the `PHMAC` algorithm is
an opaque key blob, who's physical size has nothing to do with the
cryptographic size. Such a wrapped key can for example be a HSM
protected key. Currently PHMAC is only available for the s390x
architecture (Linux on IBM Z).
Support for PHMAC has just been added to the cryptsetup project via MR
https://gitlab.com/cryptsetup/cryptsetup/-/merge_requests/693 by commit
296eb39c60
To allow automatic opening of integrity protected volumes that use PHMAC
via `/etc/integritytab`, this change in systemd's integritysetup tool is
needed as well.
This adds a new `Hostname=` option to the [DHCPServerStaticLease]
section in .network files, allowing an administrator to assign a
specific hostname to a client receiving a static lease.
We automatically select the correct DHCP option to use based on the
format of the provided string:
- Single DNS labels are sent as Option 12.
- Names with multiple DNS labels are sent as Option 81 in wire format.
Fixes: #39634
Add the PHMAC integrity algorithm to the list of supported algorithms.
The PHMAC algorithm is like the regular HMAC algorithm, but it takes a wrapped key
as input. A key for the PHMAC algorithm is an opaque key blob, who's physical size
has nothing to do with the cryptographic size. Currently PHMAC is only available
for the s390x architecture.
When we use stdio-bridge via sd-bus to connect to a bus of a different
user, container or host, stdio-bridge should not log at error level but
at debug level as it's invoked by the sd-bus library and sd-bus should
generally not log above debug level.
We can't actually use the --quiet option yet as that would break connecting
to hosts running older versions of systemd but let's already add the option
now in preparation for a brighter future.
So far the idea was that the default is 'auto', and if appropriate, the
distribution will create /var/log/journal/ to tell journald to use persistent
mode. This doesn't work well with factory resets, because after a factory reset
obviously /var/log is gone. That old default was useful when journald was new
and people were reluctant to enable persistent mode and instead relied on
rsyslog and such for the persistent storage. But nowadays that is rarer, and
anyway various features like user journals only work with persistent storage,
so we want people to enable this by default. Add an option to flip the default
and distributions can opt in. The default default value remains unchanged.
(I also tested using tmpfiles to instead change this, since we already set
access mode for /var/log/journal through tmpfiles. Unfortunately, tmpfiles runs
too late, after journald has already started, so if tmpfiles creates the
directory, it'll only be used after a reboot. This probably could be made to
work by adding a new service to flush the journal, but that becomes complicated
and we lose the main advantage of simplicity.)
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1387796.
For some reason, the entity names configured in custom-entities.ent
used abbreviated names. This just creates unnecessary confusion, so update
to use the same name as the config dict.
Reword some surrounding sentences while at it.
Closes#3829
Alternative to #35417
I don't think the individual "WasOnDependencyCycle" attrs on units
are particularly helpful and comprehensible, as it's really about
the dep relationship between them. And as discussed, the dependency
cycle is not something persistent, rather local to the currently
loaded set of units and shall be reset with daemon-reload (see also
https://github.com/systemd/systemd/issues/35642#issuecomment-2591296586).
Hence, let's report system state as degraded and point users to
the involved transactions when ordering cycles are encountered instead.
Combined with log messages added in 6912eb315f
it should achieve the goal of making ordering cycles more observable,
while avoiding all sorts of subtle bookkeeping in the service manager.
The degraded state can be reset via the existing ResetFailed() manager-wide
method.
A --empower session is effectively root without being UID 0, so it
doesn't make sense to enforce polkit authentication in those. Let's
add the empower group, add --empower sessions to that group and ship
a polkit rule to skip authentication for all users in the empower
group.
(As a side-effect this will also allow users to add themselves to this
group outside of 'run0 --empower' to mimick NOPASSWD from sudo)
Aim of this patches set, is to add a new type SD_PATH_SEARCH_SYSCTL for
sd_path_lookup() and sd_path_lookup_strv(). This new type is used to get the
directories list used by systemd-sysctl:
- /etc/sysctl.d/
- /run/sysctl.d/
- /usr/local/lib/sysctl.d/
- /usr/lib/sysctl.d/
This implements the change in libsystemd, systemd-path, and systemd-sysctl.
Add the new type SD_PATH_SEARCH_SYSCTL to libsystemd.
With this new type sd_path_lookup() and sd_path_lookup_strv() will
return the paths used by systemd-sysctl(1) to search the .conf files:
/etc/sysctl.d/
/run/sysctl.d/
/usr/local/lib/sysctl.d/
/usr/lib/sysctl.d/
Refer to sysctl.d(5) man page.
Note: the old type SD_PATH_SYSCTL is still available, and returns the
last path (/usr/lib/sysctl.d/).
systemd.service(5)’s documentation of `ExecCondition=` uses “failed” with
respect to the unit active state.
In particular the unit won’t be considered failed when `ExecCondition=`’s
command exits with a status of 1 through 254 (inclusive). It will however, when
it exits with 255 or abnormally (e.g. timeout, killed by a signal, etc.).
The table “Defined $SERVICE_RESULT values” in systemd.exec(5) uses “failed”
however rather with respect to the condition.
Tests seem to have shown that, if the exit status of the `ExecCondition=`
command is one of 1 through 254 (inclusive), `$SERVICE_RESULT` will be
`exec-condition`, if it is 255, `$SERVICE_RESULT` will be `exit-code` (but
`$EXIT_CODE` and `$EXIT_STATUS` will be empty or unset), if it’s killed because
of `SIGKILL`, `$SERVICE_RESULT` will `signal` and if it times out,
`$SERVICE_RESULT` will be `timeout`.
This commit clarifies the table at least for the case of an exit status of 1
through 254 (inclusive).
The others (signal, timeout and 255 are probably also still ambiguous (e.g.
`signal` uses “A service process”, which could be considered as the actual
service process only).
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
When Type=notify-reload got introduced, it wasn't intended to be
mutually exclusive with ExecReload=. However, currently ExecReload=
is immediately forked off after the service main process is signaled,
leaving states in between essentially undefined. Given so broken
it is I doubt any sane user is using this setup, hence I took a stab
to rework everything:
1. Extensions are refreshed (unchanged)
2. ExecReload= is forked off without signaling the process
3a. If RELOADING=1 is sent during the ExecReload= invocation,
we'd refrain from signaling the process again, instead
just transition to SERVICE_RELOAD_NOTIFY directly and
wait for READY=1
3b. If not, signal the process after ExecReload= finishes
(from now on the same as Type=notify-reload w/o ExecReload=)
4. To accomodate the use case of performing post-reload tasks,
ExecReloadPost= is introduced which executes after READY=1
The new model greatly simplifies things, as no control processes
will be around in SERVICE_RELOAD_SIGNAL and SERVICE_RELOAD_NOTIFY
states.
See also: https://github.com/systemd/systemd/issues/37515#issuecomment-2891229652
This allows a service to reuse the user namespace created for an
existing service, similarly to NetworkNamespacePath=. The configuration
is the initial user namespace (e.g. ID mapping) is preserved.
This reworkds TPM2 based creds a bit. Instead of mapping the key type
"tpm2" directly to a TPM2 key without PK, let's map it to an "automatic"
key type that either picks PK or doesn't, depending on what's available.
That should make things easier to grok for people, as the nitty gritty
details of PK or not PK are made autmatic. Moreover it gives us more
leverage to change the TPM2 enrollment types later (for example, we
definitely want to start pinning SRK, and hook up pcrlock too, for
creds, which we currently don't).
This hence adds a new _CRED_AUTO_TPM2
pseudo-type we automatically maps to CRED_AES256_GCM_BY_TPM2_HMAC_WITH_PK
or CRED_AES256_GCM_BY_TPM2_HMAC depending if PK as available. Similar,
_CRED_AUTO_HOST_AND_TPM2 is added, which does the same for the
host/nonhost cred type.
This does not introduce any new type on the wire, it just changes how we
select the right key type.
To make the code more readable this also adds some categorization macros
for the keys, instead of repeating the list of key types at multiple
places.
Over the time, the functionality in ukify has grown. This should all be briefly
mentioned in the first section so the user does't have to read the whole page
to figure out what types of functionality are implemnted.
Also add an example of direct kernel boot. It's a nifty technology (and frankly
underutilized, considering how cool it is is).
Unfortunately qemu still default to BIOS boot, so for the direct kernel
boot with an efi file to be of any use, the complex param used to switch
to UEFI mode needs to be provided.
Also add some links to qemu and OVMF.
In btrfs-progs 6.15 it is planned to add a new parameter in mkfs.btrfs
--inode-flags, that can set attributes for subvolumes, directories, and
files.
The current supported attributes are "nodatacow", to disable CoW, and
"nodatasum", to disable the checksum.
This commit extend the "Subvolunes=" option to understand the
"nodatacow" flag for subvolums only.
If RepartOffline is enabled it will build the image without loopback
devices, using the correct --inode-flags parameters.
If RepartOffline is disabled it will use loopback devices and set the
btrfs attributes accordingly.
Signed-off-by: Alberto Planas <aplanas@suse.com>
Type `simple` explicitly mentions that invocation failures like a missing binary
or `User=` name won’t get detected – whereas type `exec` mentions that it does.
Type `oneshot` refers to being similar to `simple`, which could lead one to
assume it doesn’t detect such invocation failures either – it seems however it
does.
Indicate this my changing its wording to be similar to `exec`.
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
So far repart always required specification of a device node. And if
none was specified, then we'd fine the node backing the root fs. Let's
optionally allow that the device node is explicitly not specified (i.e.
specified as "-" or ""), in which case we'll just print the size of the
minimal image given the definitions.
RootDirectory= but via a open_tree() file descriptor. This allows
setting up the execution environment for a service by the client in a
mount namespace and then starting a transient unit in that execution
environment using the new property.
We also add --root-directory= and --same-root-dir= to systemd-run to
have it run services within the given root directory. As systemd-run
might be invoked from a different mount namespace than what systemd is
running in, systemd-run opens the given path with open_tree() and then
sends it to systemd using the new RootDirectoryFileDescriptor= property.
RootDirectory= but via a open_tree() file descriptor. This allows
setting up the execution environment for a service by the client in
a mount namespace and then starting a transient unit in that execution
environment using the new property.
We also add --root-directory= and --same-root-dir= to systemd-run to
have it run services within the given root directory. As systemd-run
might be invoked from a different mount namespace than what systemd is
running in, systemd-run opens the given path with open_tree() and then
sends it to systemd using the new RootDirectoryFileDescriptor= property.
--empower gives full privileges to a non-root user. Currently this
includes all capabilities but we leave the option open to add more
privileges via this option in the future.
Why is this useful? When running privileged development or debugging
commands from your home directory (think bpftrace, strace and such),
you want any files written by these tools to be owned by your current
user, and not by the root user. run0 --empower will allow you to run
all privileged operations (assuming the tools check for capabilities
and not UIDs), while any files written by the tools will still be owned
by the current user.
This creates a chicken-and-egg problem: we stuff the pcrlock policy into
a credential in the ESP, but credentials get measured into PCR 12, hence
PCR 12 is both input and output of the pcrlock logic, which makes
impossible to calculate.
Let's drop PCR 12 for now.
(We might want to pass the policy some other way one day, to avoid this,
but that's something for another day.)
Note that this still allows locking to PCR12 if people want to (for
example because they don't need this for the rootfs, and hence need no
cred passing via the ESP), this hence only changes the default, nothing
more.
Fixes: #33546
So even if a <term> section contains newlines, we get a reasonable
anchor link to it.
Before:
```
<dt id="
bind
UNIT
PATH
[PATH]
"><span class="term">
...
<a class="headerlink" title="Permalink to this term" href="#%0A%20%20%20%20%20%20%20%20%20%20%20%20bind%0A%20%20%20%20%20%20%20%20%20%20%20%20UNIT%0A%20%20%20%20%20%20%20%20%20%20%20%20PATH%0A%20%20%20%20%20%20%20%20%20%20%20%20[PATH]%0A%20%20%20%20%20%20%20%20%20%20">¶</a>
```
After:
```
<dt id="bind UNIT PATH [PATH]"><span class="term">
...
<a class="headerlink" title="Permalink to this term" href="#bind%20UNIT%20PATH%20[PATH]">¶</a>
```
Resolves: https://github.com/systemd/systemd/issues/39196
---
The reverts are not strictly necessary here (as already pointed out in
https://github.com/systemd/systemd/pull/39154#issuecomment-3360118164)
but they were helpful in checking if the fix works as expected. I can
drop them if needed.