Let's synthesize DNS RRs for leases handed out by our DHCP server. This
way local VMs can have resolvable hostnames locally.
This does not implement reverse lookups for now. We can add this
later in a similar fashion.
This introduces /run/systemd/resolve.hook/ as a new directory that local
(privileged) programs can bind a Varlink socket into. If they do they'll
get a method call for each attempted resolved lookup, which they can
then either process themselves (and generate new records for, or return
errors to block stuff) or let pass so that the regular resolution is
done.
There are primarily two use cases for this:
1. in machined we can add local resolution of machine names to their IP
addresses, similar in fashion to nss-mymachines, but working also if
the non-NSS interfaces to name resolution are used, i.e. the local
DNS responder. In fact, I think we should eventually remove
nss-mymachines from our tree, as soon as this code in resolved is
settled.
2. in networkd we can add local resolution of names specified in DHCP
leases we hand out.
But beyond that there should be many other uses, for example people
could write "dns firewalls" with this if they like where they
dynamically block certain names from resolution.
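As a rough illustration of the three outcomes a hook can produce (answer the lookup itself, return an error to block it, or let it pass on to regular resolution), here is a minimal sketch. It models only the decision logic, not the Varlink interface; `HookReply`, `handle_lookup`, `LOCAL_NAMES` and `BLOCKED_SUFFIXES` are hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Hypothetical data a hook might consult; not part of any real interface.
LOCAL_NAMES = {"vm1": ("192.0.2.10",)}
BLOCKED_SUFFIXES = ("ads.example",)


@dataclass(frozen=True)
class HookReply:
    action: str            # "answer", "block" or "pass"
    addresses: tuple = ()  # only filled in for "answer"


def handle_lookup(name: str) -> HookReply:
    """Decide what to do with one attempted lookup."""
    key = name.rstrip(".").lower()
    if key in LOCAL_NAMES:
        # Generate records ourselves, e.g. for a local machine name.
        return HookReply("answer", LOCAL_NAMES[key])
    if any(key == s or key.endswith("." + s) for s in BLOCKED_SUFFIXES):
        # "DNS firewall" case: refuse to resolve the name.
        return HookReply("block")
    # Anything else goes through the regular resolution.
    return HookReply("pass")
```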
Fixes: #8518
Reverts systemd/systemd#38680
After taking a closer look I'm not convinced by the approach, see below.
First of all, all other SD_PATH_SEARCH_* are either somewhat generic,
i.e. encode the common prefix for configurations, binaries, etc., or are
subdirectories under systemd/ hence in our own "domain". The
tmpfiles/sysctl/binfmt we don't prefix with "systemd" precisely because
the concept is generic and there're actually other impls of them. A
specific SD_PATH_SEARCH_SYSCTL doesn't fit into our existing scheme.
Instead something along the lines of "SEARCH_SYSTEM_CONFIGURATION" shall
be introduced, and consumers will just suffix
sysctl.d/tmpfiles.d/binfmt.d for the final result.
And secondly, I don't grok why systemd-sysctl now unnecessarily calls
into sd-path to obtain the fixed search path. None of our other tools do
that.
-----------
An alternate approach, SD_PATH_SYSTEM_SEARCH_CONFIGURATION, which does
exactly the above, will be introduced instead. It provides a universal
interface for querying any system config with our idiomatic
/etc/:/run/:/usr/local/lib/:/usr/lib/ hierarchy.
TPM2 support is not too useful if the firmware doesn't actually use it
for the boot chain, hence we require the full PC client profile support.
Let's make that clear in the docs.
Fixes: #38939
We do this in a separate service (rather than inside of
systemd-tpm2-setup), since we want failures of this measurement to
result in an instant reboot, like for most of our measurements.
Failures to initialize nvpcrs, or allocate an SRK are somewhat OK (and
more likely), as long as this separator communicates clearly where they
have to have taken place, if they worked.
Sometimes it's hard to assign responsibility to a specific event source
for exiting when there's no more work to be done. So let's add exit-on-idle
support where we exit when there are no more event sources.
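To illustrate the idea, here is a toy event loop with an exit-on-idle switch. This is a sketch of the concept only and does not use the sd-event API; `TinyEventLoop` and `make_countdown` are hypothetical names invented for this example.

```python
from collections import deque
from typing import Callable


class TinyEventLoop:
    """Toy loop: each source is a callback returning True to stay registered."""

    def __init__(self, exit_on_idle: bool = False) -> None:
        self.exit_on_idle = exit_on_idle
        self.sources: deque = deque()

    def add_source(self, callback: Callable[[], bool]) -> None:
        self.sources.append(callback)

    def run(self) -> int:
        """Dispatch sources round-robin; return the number of dispatches."""
        dispatched = 0
        while True:
            if not self.sources:
                if self.exit_on_idle:
                    # No source has to take responsibility for exiting.
                    return dispatched
                raise RuntimeError("no event sources left; a real loop would block here")
            callback = self.sources.popleft()
            dispatched += 1
            if callback():
                self.sources.append(callback)


def make_countdown(n: int) -> Callable[[], bool]:
    """Return a source that unregisters itself after n dispatches."""
    remaining = [n]

    def callback() -> bool:
        remaining[0] -= 1
        return remaining[0] > 0

    return callback
```

With `exit_on_idle=True` the loop returns as soon as the last source has unregistered itself; without it, some source would have to request the exit explicitly.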
This enables running something like
"mkosi box -- run0 --empower --same-root-dir -E PATH" to get an
empowered session as the current user within the "mkosi box" environment.
Currently the only supported integrity algorithm using HMAC is
`hmac-sha256`. Add `hmac-sha512` to the list of supported algorithms as
well.
Also add the `PHMAC` integrity algorithm to the list of supported
algorithms. The `PHMAC` algorithm is like the regular HMAC algorithm,
but it takes a wrapped key as input. A key for the `PHMAC` algorithm is
an opaque key blob, whose physical size has nothing to do with the
cryptographic size. Such a wrapped key can for example be an HSM
protected key. Currently PHMAC is only available for the s390x
architecture (Linux on IBM Z).
Support for PHMAC has just been added to the cryptsetup project via MR
https://gitlab.com/cryptsetup/cryptsetup/-/merge_requests/693 by commit
296eb39c60
To allow automatic opening of integrity protected volumes that use PHMAC
via `/etc/integritytab`, this change in systemd's integritysetup tool is
needed as well.
This adds a new `Hostname=` option to the [DHCPServerStaticLease]
section in .network files, allowing an administrator to assign a
specific hostname to a client receiving a static lease.
We automatically select the correct DHCP option to use based on the
format of the provided string:
- Single DNS labels are sent as Option 12.
- Names with multiple DNS labels are sent as Option 81 in wire format.
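The selection rule above can be sketched as follows. This is an illustration only, not the networkd code: `encode_dns_name` and `pick_hostname_option` are hypothetical helpers, and the flags and RCODE bytes that precede the name inside Option 81 are deliberately left out.

```python
def encode_dns_name(name: str) -> bytes:
    """Encode a name in DNS wire format: length-prefixed labels plus a root terminator."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"invalid DNS label: {label!r}")
        out.append(len(raw))
        out += raw
    out.append(0)  # the empty root label terminates the name
    return bytes(out)


def pick_hostname_option(name: str) -> tuple[int, bytes]:
    """Return (DHCP option code, name payload) for a static lease hostname."""
    labels = name.rstrip(".").split(".")
    if not all(labels):
        raise ValueError(f"invalid hostname: {name!r}")
    if len(labels) == 1:
        # Single label: Option 12 carries the plain string.
        return 12, labels[0].encode("ascii")
    # Multiple labels: Option 81 carries the name in wire format.
    return 81, encode_dns_name(name)
```

For example, `pick_hostname_option("vm1")` selects Option 12, while `pick_hostname_option("vm1.example.com")` selects Option 81.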
Fixes: #39634
Add the PHMAC integrity algorithm to the list of supported algorithms.
The PHMAC algorithm is like the regular HMAC algorithm, but it takes a wrapped key
as input. A key for the PHMAC algorithm is an opaque key blob, whose physical size
has nothing to do with the cryptographic size. Currently PHMAC is only available
for the s390x architecture.
When we use stdio-bridge via sd-bus to connect to a bus of a different
user, container or host, stdio-bridge should not log at error level but
at debug level as it's invoked by the sd-bus library and sd-bus should
generally not log above debug level.
We can't actually use the --quiet option yet as that would break connecting
to hosts running older versions of systemd but let's already add the option
now in preparation for a brighter future.
This provides functionality to replace what was provided by the preceding
revert:
$ build/systemd-path system-search-configuration --suffix=sysctl.d
/etc/sysctl.d:/run/sysctl.d:/usr/local/lib/sysctl.d:/usr/lib/sysctl.d
The result is identical, but more generic, since by changing suffix we can also
get the answer for sysusers.d, tmpfiles.d, and any other of the directories
which follow the same general rule.
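The suffixing rule can be mimicked in a few lines. This sketch only reproduces the output shown above from the /etc/:/run/:/usr/local/lib/:/usr/lib/ hierarchy; it is not how sd-path is implemented, and `system_search_configuration` is a hypothetical helper name.

```python
# The four prefixes of the configuration hierarchy, in search order.
SYSTEM_CONFIG_PREFIXES = ("/etc", "/run", "/usr/local/lib", "/usr/lib")


def system_search_configuration(suffix: str) -> str:
    """Join every prefix with the given suffix into a colon-separated search path."""
    return ":".join(f"{prefix}/{suffix}" for prefix in SYSTEM_CONFIG_PREFIXES)


print(system_search_configuration("sysctl.d"))
# /etc/sysctl.d:/run/sysctl.d:/usr/local/lib/sysctl.d:/usr/lib/sysctl.d
```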
So far the idea was that the default is 'auto', and if appropriate, the
distribution will create /var/log/journal/ to tell journald to use persistent
mode. This doesn't work well with factory resets, because after a factory reset
obviously /var/log is gone. That old default was useful when journald was new
and people were reluctant to enable persistent mode and instead relied on
rsyslog and such for the persistent storage. But nowadays that is rarer, and
anyway various features like user journals only work with persistent storage,
so we want people to enable this by default. Add an option to flip the default
and distributions can opt in. The default default value remains unchanged.
(I also tested using tmpfiles to instead change this, since we already set
access mode for /var/log/journal through tmpfiles. Unfortunately, tmpfiles runs
too late, after journald has already started, so if tmpfiles creates the
directory, it'll only be used after a reboot. This probably could be made to
work by adding a new service to flush the journal, but that becomes complicated
and we lose the main advantage of simplicity.)
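The difference between the two defaults after a factory reset can be sketched like this. It models only the documented meaning of the 'auto' mode (persistent if the directory exists, volatile otherwise); `effective_storage` is a hypothetical helper, not journald code.

```python
import os


def effective_storage(default_storage: str, journal_dir: str = "/var/log/journal") -> str:
    """Return the storage mode that results from the given default."""
    if default_storage == "auto":
        # 'auto' depends on the directory, which is gone after a factory reset.
        return "persistent" if os.path.isdir(journal_dir) else "volatile"
    # Any explicit mode does not depend on the directory being there.
    return default_storage
```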
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1387796.
For some reason, the entity names configured in custom-entities.ent
used abbreviated names. This just creates unnecessary confusion, so update
to use the same name as the config dict.
Reword some surrounding sentences while at it.
Closes #3829
Alternative to #35417
I don't think the individual "WasOnDependencyCycle" attrs on units
are particularly helpful and comprehensible, as it's really about
the dep relationship between them. And as discussed, the dependency
cycle is not something persistent, rather local to the currently
loaded set of units and shall be reset with daemon-reload (see also
https://github.com/systemd/systemd/issues/35642#issuecomment-2591296586).
Hence, let's report system state as degraded and point users to
the involved transactions when ordering cycles are encountered instead.
Combined with log messages added in 6912eb315f
it should achieve the goal of making ordering cycles more observable,
while avoiding all sorts of subtle bookkeeping in the service manager.
The degraded state can be reset via the existing ResetFailed() manager-wide
method.
A --empower session is effectively root without being UID 0, so it
doesn't make sense to enforce polkit authentication in those. Let's
add the empower group, add --empower sessions to that group and ship
a polkit rule to skip authentication for all users in the empower
group.
(As a side-effect this will also allow users to add themselves to this
group outside of 'run0 --empower' to mimic NOPASSWD from sudo)
The aim of this patch set is to add a new type SD_PATH_SEARCH_SYSCTL for
sd_path_lookup() and sd_path_lookup_strv(). This new type is used to get the
directories list used by systemd-sysctl:
- /etc/sysctl.d/
- /run/sysctl.d/
- /usr/local/lib/sysctl.d/
- /usr/lib/sysctl.d/
This implements the change in libsystemd, systemd-path, and systemd-sysctl.
Add the new type SD_PATH_SEARCH_SYSCTL to libsystemd.
With this new type sd_path_lookup() and sd_path_lookup_strv() will
return the paths used by systemd-sysctl(1) to search the .conf files:
/etc/sysctl.d/
/run/sysctl.d/
/usr/local/lib/sysctl.d/
/usr/lib/sysctl.d/
Refer to sysctl.d(5) man page.
Note: the old type SD_PATH_SYSCTL is still available, and returns the
last path (/usr/lib/sysctl.d/).
systemd.service(5)’s documentation of `ExecCondition=` uses “failed” with
respect to the unit active state.
In particular the unit won’t be considered failed when `ExecCondition=`’s
command exits with a status of 1 through 254 (inclusive). It will however, when
it exits with 255 or abnormally (e.g. timeout, killed by a signal, etc.).
The table “Defined $SERVICE_RESULT values” in systemd.exec(5) uses “failed”
however rather with respect to the condition.
Tests seem to have shown that, if the exit status of the `ExecCondition=`
command is one of 1 through 254 (inclusive), `$SERVICE_RESULT` will be
`exec-condition`, if it is 255, `$SERVICE_RESULT` will be `exit-code` (but
`$EXIT_CODE` and `$EXIT_STATUS` will be empty or unset), if it’s killed because
of `SIGKILL`, `$SERVICE_RESULT` will be `signal` and if it times out,
`$SERVICE_RESULT` will be `timeout`.
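The observations above can be summarized as a small lookup. This only restates the test results and is not taken from the service manager code; `exec_condition_service_result` is a hypothetical helper, and treating every signal like the tested `SIGKILL` case is an assumption.

```python
from typing import Optional


def exec_condition_service_result(
    exit_status: Optional[int] = None,
    *,
    killed_by_signal: bool = False,
    timed_out: bool = False,
) -> Optional[str]:
    """Map the outcome of an ExecCondition= command to the observed $SERVICE_RESULT."""
    if timed_out:
        return "timeout"
    if killed_by_signal:
        # Only SIGKILL was tested; other signals are assumed to behave alike.
        return "signal"
    if exit_status is None:
        raise ValueError("exit_status is required for a normal exit")
    if exit_status == 0:
        # Condition passed; the result is decided by the rest of the service run.
        return None
    if 1 <= exit_status <= 254:
        # Condition failed, but the unit is not considered failed.
        return "exec-condition"
    if exit_status == 255:
        return "exit-code"
    raise ValueError(f"exit status out of range: {exit_status}")
```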
This commit clarifies the table at least for the case of an exit status of 1
through 254 (inclusive).
The others (signal, timeout and 255) are probably also still ambiguous (e.g.
`signal` uses “A service process”, which could be considered as the actual
service process only).
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
When Type=notify-reload got introduced, it wasn't intended to be
mutually exclusive with ExecReload=. However, currently ExecReload=
is immediately forked off after the service main process is signaled,
leaving states in between essentially undefined. Given how broken
it is, I doubt any sane user is using this setup, hence I took a stab
at reworking everything:
1. Extensions are refreshed (unchanged)
2. ExecReload= is forked off without signaling the process
3a. If RELOADING=1 is sent during the ExecReload= invocation,
we'd refrain from signaling the process again, instead
just transition to SERVICE_RELOAD_NOTIFY directly and
wait for READY=1
3b. If not, signal the process after ExecReload= finishes
(from now on the same as Type=notify-reload w/o ExecReload=)
4. To accommodate the use case of performing post-reload tasks,
ExecReloadPost= is introduced which executes after READY=1
The new model greatly simplifies things, as no control processes
will be around in SERVICE_RELOAD_SIGNAL and SERVICE_RELOAD_NOTIFY
states.
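The reworked sequence can be written down as a small planning function. This is a sketch of the ordering described in the numbered list only, not the service manager code; `reload_steps` and the step strings are hypothetical.

```python
def reload_steps(
    has_exec_reload: bool,
    reloading_sent_during_exec_reload: bool,
    has_exec_reload_post: bool,
) -> list[str]:
    """List the reload steps for a Type=notify-reload service, in order."""
    steps = ["refresh extensions"]                              # step 1
    if has_exec_reload:
        steps.append("run ExecReload= without signaling")       # step 2
    if not (has_exec_reload and reloading_sent_during_exec_reload):
        # Step 3b, and also the plain case without ExecReload=.
        steps.append("signal main process")
    # Step 3a skips the signal and goes straight to waiting.
    steps.append("wait for READY=1")
    if has_exec_reload_post:
        steps.append("run ExecReloadPost=")                     # step 4
    return steps
```

Note that in every branch no control process is running while the manager waits for READY=1, which is the simplification the new model is after.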
See also: https://github.com/systemd/systemd/issues/37515#issuecomment-2891229652
This allows a service to reuse the user namespace created for an
existing service, similarly to NetworkNamespacePath=. The configuration
of the initial user namespace (e.g. ID mapping) is preserved.
This reworks TPM2 based creds a bit. Instead of mapping the key type
"tpm2" directly to a TPM2 key without PK, let's map it to an "automatic"
key type that either picks PK or doesn't, depending on what's available.
That should make things easier to grok for people, as the nitty gritty
details of PK or not PK are made automatic. Moreover it gives us more
leverage to change the TPM2 enrollment types later (for example, we
definitely want to start pinning SRK, and hook up pcrlock too, for
creds, which we currently don't).
This hence adds a new _CRED_AUTO_TPM2 pseudo-type that we automatically
map to CRED_AES256_GCM_BY_TPM2_HMAC_WITH_PK or
CRED_AES256_GCM_BY_TPM2_HMAC depending on whether PK is available.
Similarly, _CRED_AUTO_HOST_AND_TPM2 is added, which does the same for
the host/nonhost cred type.
This does not introduce any new type on the wire, it just changes how we
select the right key type.
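The selection of the concrete key type can be sketched as below. The type names are the ones mentioned in this message, represented as plain strings; `resolve_cred_key_type` is a hypothetical helper and not the actual code, and the host-and-TPM2 pseudo-type is left out since it would follow the same pattern.

```python
CRED_AUTO_TPM2 = "_CRED_AUTO_TPM2"
CRED_TPM2_HMAC = "CRED_AES256_GCM_BY_TPM2_HMAC"
CRED_TPM2_HMAC_WITH_PK = "CRED_AES256_GCM_BY_TPM2_HMAC_WITH_PK"


def resolve_cred_key_type(requested: str, pk_available: bool) -> str:
    """Turn the automatic pseudo-type into a concrete key type."""
    if requested == CRED_AUTO_TPM2:
        # The pseudo-type never reaches the wire; only a concrete type does.
        return CRED_TPM2_HMAC_WITH_PK if pk_available else CRED_TPM2_HMAC
    # Explicitly requested concrete types are used as-is.
    return requested
```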
To make the code more readable this also adds some categorization macros
for the keys, instead of repeating the list of key types at multiple
places.
Over time, the functionality in ukify has grown. This should all be briefly
mentioned in the first section so the user doesn't have to read the whole page
to figure out what types of functionality are implemented.
Also add an example of direct kernel boot. It's a nifty technology (and frankly
underutilized, considering how cool it is).
Unfortunately qemu still defaults to BIOS boot, so for the direct kernel
boot with an efi file to be of any use, the complex param used to switch
to UEFI mode needs to be provided.
Also add some links to qemu and OVMF.
In btrfs-progs 6.15 it is planned to add a new parameter in mkfs.btrfs
--inode-flags, that can set attributes for subvolumes, directories, and
files.
The current supported attributes are "nodatacow", to disable CoW, and
"nodatasum", to disable the checksum.
This commit extends the "Subvolumes=" option to understand the
"nodatacow" flag for subvolumes only.
If RepartOffline is enabled it will build the image without loopback
devices, using the correct --inode-flags parameters.
If RepartOffline is disabled it will use loopback devices and set the
btrfs attributes accordingly.
Signed-off-by: Alberto Planas <aplanas@suse.com>
Type `simple` explicitly mentions that invocation failures like a missing binary
or `User=` name won’t get detected – whereas type `exec` mentions that it does.
Type `oneshot` refers to being similar to `simple`, which could lead one to
assume it doesn’t detect such invocation failures either – it seems however it
does.
Indicate this by changing its wording to be similar to `exec`.
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>