There might be a delay between an umount and a refcounted device
disappearing, so the test can be flaky:
[ 36.107128] TEST-50-DISSECT.sh[1662]: ++ dmsetup ls
[ 36.108314] TEST-50-DISSECT.sh[1663]: ++ grep loop
[ 36.109283] TEST-50-DISSECT.sh[1664]: ++ grep -c verity
[ 36.110284] TEST-50-DISSECT.sh[1360]: + test 1 -eq 1
[ 36.111555] TEST-50-DISSECT.sh[1360]: + umount -R /tmp/TEST-50-IMAGES.hxm/mount
[ 36.112237] TEST-50-DISSECT.sh[1668]: ++ dmsetup ls
[ 36.113039] TEST-50-DISSECT.sh[1669]: ++ grep loop
[ 36.113833] TEST-50-DISSECT.sh[1670]: ++ grep -c verity
[ 36.114517] TEST-50-DISSECT.sh[1360]: + test 0 -eq 1
[ 36.116734] TEST-50-DISSECT.sh[1000]: + echo 'Subtest /usr/lib/systemd/tests/testdata/units/TEST-50-DISSECT.dissect.sh failed'
https://github.com/systemd/systemd/actions/runs/19062162467/job/54444112653?pr=39540#logs
Switch to searching for the dm entry and check for it specifically,
and wait for it to disappear before checking that it is no longer
in the dm table.
Follow-up for 10fc43e504
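The wait-then-check approach for the dm entry can be sketched roughly as
follows. This is only an illustration: wait_until_gone is a hypothetical
helper, not a function from the test suite, and a temporary file stands
in for the device-mapper entry so the snippet runs anywhere. In the real
test the polled command would query the specific dm entry instead.

```sh
# Hypothetical helper: poll until "$@" stops succeeding, or give up
# after roughly $1 seconds. Returns 0 if the condition cleared in time.
wait_until_gone() {
    timeout_s=$1
    shift
    tries=$((timeout_s * 10))
    while "$@"; do
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] || return 1
        sleep 0.1
    done
    return 0
}

# Stand-in for the dm entry: a file that is removed shortly after the
# point where the test would call umount.
entry=$(mktemp)
( sleep 0.3; rm -f "$entry" ) &

# Assumption: in the real test the polled command would be something
# like `dmsetup info "$name"` for the specific entry name.
if wait_until_gone 5 test -e "$entry"; then
    echo "entry gone"
else
    echo "entry still present after timeout"
fi
```

Polling for one specific entry avoids counting unrelated verity devices
that other tests may have left in the table.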
TEST-07-PID1.user-namespace-path.sh is flaky as Type=simple is used
(implicitly). Explicitly use Type=exec instead to ensure the namespaces
are created before starting another service reusing the same namespaces.
Fixes #39546.
Both sysext and confext used the host's /etc/initrd-release file even
when --root=/somewhere was specified. A workaround was the
SYSTEMD_IN_INITRD= env var but without knowing this it was quite
confusing. Aside from users validating their extensions, the primary
use case for this to matter is when the extensions are set up from the
initrd where the initrd-release file is present when running but we want
to prepare the extensions for the final system and thus should match
for the right scope.
Make systemd-sysext check for /etc/initrd-release inside the given
--root= tree. An alternative would be to always ignore the
initrd-release check when --root= is passed but this way it is more
consistent. The image policy logic for EFI-loader-passed extensions
won't take effect when --root= is used, though.
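The lookup relative to --root= can be illustrated with a small shell
sketch. It is not the actual implementation (which lives in C), and
tree_is_initrd is a hypothetical name; it only shows the idea of
checking etc/initrd-release inside the given tree rather than on the
host.

```sh
# Illustrative only: report whether the tree rooted at "$1" looks like
# an initrd, by checking for etc/initrd-release inside that tree
# rather than on the host. Defaults to the host root.
tree_is_initrd() {
    root=${1:-/}
    [ -e "${root%/}/etc/initrd-release" ]
}

# Demo with two throwaway trees.
initrd_root=$(mktemp -d)
system_root=$(mktemp -d)
mkdir -p "$initrd_root/etc" "$system_root/etc"
: > "$initrd_root/etc/initrd-release"

tree_is_initrd "$initrd_root" && echo "initrd tree: initrd scope"
tree_is_initrd "$system_root" || echo "system tree: system scope"

rm -rf "$initrd_root" "$system_root"
```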
The last sysext test leaked state into tests added after it, which the
leftover check of any new test uncovers.
Remove the mutable folder state through a trap as done in other tests.
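A minimal sketch of trap-based cleanup, assuming a throwaway directory
in place of the real mutable folder. The names state_dir and run_subtest
are made up for this example.

```sh
# Stand-in for the mutable directory the test creates (assumption:
# the real test uses a fixed path for the extension's mutable state).
state_dir=$(mktemp -d)

run_subtest() (
    # The function body is a subshell, so the EXIT trap fires when the
    # subtest ends, whether it passes or fails.
    trap 'rm -rf "$state_dir"' EXIT
    mkdir -p "$state_dir/mutable"
    : > "$state_dir/mutable/leftover"
    false   # simulate a failing assertion
)

run_subtest || echo "subtest failed"
[ -e "$state_dir" ] || echo "state cleaned up"
```

Because the cleanup runs on every exit path, a failing subtest no longer
leaves state behind for the tests that follow.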
This allows a service to reuse the user namespace created for an
existing service, similarly to NetworkNamespacePath=. The configuration
of the initial user namespace (e.g. ID mapping) is preserved.
I recently found out (the hard way) that on an older version
there was a bug when the verity sharing is disabled: the
deferred close flag was not set correctly, so verity devices
were leaked.
This is not an issue in main currently, but add a test case
to cover it just in case, to avoid future regressions.
RootDirectory= but via an open_tree() file descriptor. This allows
setting up the execution environment for a service by the client in a
mount namespace and then starting a transient unit in that execution
environment using the new property.
We also add --root-directory= and --same-root-dir= to systemd-run to
have it run services within the given root directory. As systemd-run
might be invoked from a different mount namespace than what systemd is
running in, systemd-run opens the given path with open_tree() and then
sends it to systemd using the new RootDirectoryFileDescriptor= property.
--empower gives full privileges to a non-root user. Currently this
includes all capabilities but we leave the option open to add more
privileges via this option in the future.
Why is this useful? When running privileged development or debugging
commands from your home directory (think bpftrace, strace and such),
you want any files written by these tools to be owned by your current
user, and not by the root user. run0 --empower will allow you to run
all privileged operations (assuming the tools check for capabilities
and not UIDs), while any files written by the tools will still be owned
by the current user.
As 'systemctl stop' is called with --no-block, systemd-resolved might
still be running when 'resolvectl' is called, and the D-Bus connection
might be closed during the call:
```
TEST-07-PID1.sh[5643]: + systemctl stop --no-block systemd-resolved.service
TEST-07-PID1.sh[5643]: + resolvectl
TEST-07-PID1.sh[5732]: Failed to get global data: Remote peer disconnected
```
Follow-up for 8eefd0f4de.
Fixes https://github.com/systemd/systemd/pull/39388#issuecomment-3439277442.
As the modified service requires about 10 seconds to stop, the
service never hit the start limit even if we tried to restart it
more than 5 times.
This also checks that the service is actually triggered by a D-Bus
method call.
Follow-up for 8eefd0f4de.
Otherwise, e.g. requesting to start a unit that is being stopped may
make it enter the failed state.
This makes the following changes:
- rename .can_start() to .test_startable(), allow it to return a
boolean, and refuse to start units when it returns false,
- refuse earlier to start units that are in the deactivating state, so
several redundant conditions in .start() can be dropped,
- move checks for unit states mapped to UNIT_ACTIVATING from .start() to
.test_startable().
Fixes #39247.
The process forked off by `systemd-notify --fork` is not a child of the
current shell, so using `wait` doesn't work. This then later causes a
race, when the test occasionally fails because it attempts to start a
new systemd-socket-activate instance before the old one is completely
gone:
[ 1488.947744] TEST-74-AUX-UTILS.sh[1938]: Child 1947 died with code 0
[ 1488.947952] TEST-74-AUX-UTILS.sh[1933]: + assert_eq hello hello
[ 1488.949716] TEST-74-AUX-UTILS.sh[1948]: + set +ex
[ 1488.950112] TEST-74-AUX-UTILS.sh[1950]: ++ cat /proc/1938/comm
[ 1488.945555] systemd[1]: Started systemd-networkd.service - Network Management.
[ 1488.950365] TEST-74-AUX-UTILS.sh[1933]: + assert_in systemd-socket systemd-socket-
[ 1488.950563] TEST-74-AUX-UTILS.sh[1951]: + set +ex
[ 1488.950766] TEST-74-AUX-UTILS.sh[1933]: + kill 1938
[ 1488.950766] TEST-74-AUX-UTILS.sh[1933]: + wait 1938
[ 1488.950766] TEST-74-AUX-UTILS.sh[1933]: .//usr/lib/systemd/tests/testdata/units/TEST-74-AUX-UTILS.socket-activate.sh: line 14: wait: pid 1938 is not a child of this shell
[ 1488.950766] TEST-74-AUX-UTILS.sh[1933]: + :
[ 1488.951486] TEST-74-AUX-UTILS.sh[1952]: ++ systemd-notify --fork -- systemd-socket-activate -l 1234 --now socat ACCEPT-FD:3 PIPE
[ 1488.952222] TEST-74-AUX-UTILS.sh[1953]: Failed to listen on [::]:1234: Address already in use
[ 1488.952222] TEST-74-AUX-UTILS.sh[1953]: Failed to open '1234': Address already in use
[ 1488.956831] TEST-74-AUX-UTILS.sh[1933]: + PID=1953
[ 1488.957078] TEST-74-AUX-UTILS.sh[102]: + echo 'Subtest /usr/lib/systemd/tests/testdata/units/TEST-74-AUX-UTILS.socket-activate.sh failed'
[ 1488.957078] TEST-74-AUX-UTILS.sh[102]: Subtest /usr/lib/systemd/tests/testdata/units/TEST-74-AUX-UTILS.socket-activate.sh failed
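One way to wait for a process that is not a child of the shell is to
poll it with `kill -0`, as in the sketch below. This is an assumption
about a possible approach, not necessarily what the test ends up doing;
wait_for_pid_exit is a hypothetical helper.

```sh
# Hypothetical helper: wait for an arbitrary PID to go away.
# `kill -0` sends no signal; it only checks that the process exists
# and that we may signal it, so it also works for non-children.
wait_for_pid_exit() {
    pid=$1
    tries=${2:-50}          # 50 * 0.1s = about 5 seconds by default
    while kill -0 "$pid" 2>/dev/null; do
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] || return 1
        sleep 0.1
    done
    return 0
}

# Demo: the process here is a child of this shell only for brevity.
# In the test, the PID would come from `systemd-notify --fork`.
sleep 0.3 &
PID=$!
if wait_for_pid_exit "$PID"; then
    echo "process is gone, safe to reuse the port"
else
    echo "process still running after timeout"
fi
```

Unlike `wait`, this does not return the exit status of the process, so
it is only suitable when the test just needs the process to be gone.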
This is useful when we start to call mountfsd from root, for example
from the tests where we just use a simple squashfs/erofs.
Note that this requires the caller to be root, and it will be rejected
otherwise, as such images are classified as 'unprotected' and the
enforced policy does not accept them for unprivileged users.
Fixes the following warning:
TEST-75-RESOLVED.sh[2251]: ++ restart_resolved
TEST-75-RESOLVED.sh[2251]: ++ systemctl stop systemd-resolved.service
TEST-75-RESOLVED.sh[2271]: Stopping 'systemd-resolved.service', but its triggering units are still active:
TEST-75-RESOLVED.sh[2271]: systemd-resolved-monitor.socket, systemd-resolved-varlink.socket
This test occasionally fails due to a race where systemd processes the
kernel's SIGKILL before the OOM notification, so the test service dies
with Result=signal instead of the expected Result=oom-kill:
[ 51.008765] TEST-55-OOMD.sh[906]: + systemd-run --wait --unit oom-kill -p OOMPolicy=kill -p Delegate=yes -p DelegateSubgroup=init.scope /tmp/script.sh
[ 51.048747] TEST-55-OOMD.sh[907]: Running as unit: oom-kill.service; invocation ID: 456645347d554ea2878463404b181bd8
[ 51.066296] sysrq: Manual OOM execution
[ 51.066596] kworker/1:0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=-1, oom_score_adj=0
[ 51.066915] CPU: 1 UID: 0 PID: 27 Comm: kworker/1:0 Not tainted 6.17.1-arch1-1 #1 PREEMPT(full) d2b229857b2eb4001337041f41d3c4f131433540
[ 51.066919] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-2-2 04/01/2014
[ 51.066921] Workqueue: events moom_callback
[ 51.066928] Call Trace:
[ 51.066931] <TASK>
[ 51.066936] dump_stack_lvl+0x5d/0x80
[ 51.066942] dump_header+0x43/0x1aa
<...snip...>
[ 51.087814] 47583 pages reserved
[ 51.087969] 0 pages cma reserved
[ 51.088208] 0 pages hwpoisoned
[ 51.088519] Out of memory: Killed process 908 (sleep) total-vm:3264kB, anon-rss:256kB, file-rss:1916kB, shmem-rss:0kB, UID:0 pgtables:44kB oom_score_adj:1000
[ 51.090263] TEST-55-OOMD.sh[907]: Finished with result: signal
[ 51.094416] TEST-55-OOMD.sh[907]: Main processes terminated with: code=killed, status=9/KILL
[ 51.094898] TEST-55-OOMD.sh[907]: Service runtime: 58ms
[ 51.095436] TEST-55-OOMD.sh[907]: CPU time consumed: 22ms
[ 51.095854] TEST-55-OOMD.sh[907]: Memory peak: 1.6M (swap: 0B)
[ 51.096722] TEST-55-OOMD.sh[912]: ++ systemctl show oom-kill -P Result
[ 51.106549] TEST-55-OOMD.sh[879]: + assert_eq signal oom-kill
[ 51.107394] TEST-55-OOMD.sh[913]: + set +ex
[ 51.108256] TEST-55-OOMD.sh[913]: FAIL: expected: 'oom-kill' actual: 'signal'
[FAILED] Failed to start TEST-55-OOMD.service.
To mitigate this, let's spawn a child process and move it to the
subcgroup to get killed instead of the main process, so systemd has more
time to react to the OOM notification and terminate the service with the
expected oom-kill result.
Needed to implement support for RootHashSignature=/RootVerity=/RootHash=
and friends when going through mountfsd, for example with user units,
so that system and user units provide the same features at the same
level.
RootDirectory= and other options already implicitly enable PrivateUsers=
since 6ef721cbc7 if they are set in user units, so that they can work
out of the box.
Now with mountfsd support we can do the same for the image settings,
so enable them and document them.
It looks like the 4 second sleep might not be enough on some slower
machines (like the ARM GH Actions nodes), which can cause the DS RR
propagation to clash with the manual test zone edit, so that the
signed.test zone might end up not properly signed:
TEST-75-RESOLVED.sh[749]: + : '--- ZONE: signed.test (static DNSSEC) ---'
TEST-75-RESOLVED.sh[749]: + run_delv @ns1.unsigned.test signed.test
TEST-75-RESOLVED.sh[749]: + run delv -a /etc/bind.keys @ns1.unsigned.test signed.test
TEST-75-RESOLVED.sh[778]: + delv -a /etc/bind.keys @ns1.unsigned.test signed.test
TEST-75-RESOLVED.sh[779]: + tee /tmp/tmp.2KOIiyrgth
TEST-75-RESOLVED.sh[779]: ;; /etc/bind.keys:1: option 'managed-keys' is deprecated
TEST-75-RESOLVED.sh[779]: ;; validating signed.test/DS: no valid signature found
TEST-75-RESOLVED.sh[779]: ;; validating signed.test/A: no valid signature found
TEST-75-RESOLVED.sh[779]: ; unsigned answer
TEST-75-RESOLVED.sh[779]: signed.test. 86400 IN A 10.0.0.10
TEST-75-RESOLVED.sh[779]: signed.test. 86400 IN RRSIG A 13 2 86400 20251028114356 20251014101356 39330 signed.test. oo3ca8WPusbBPRhzsEKw3bsBBqFtI8i4bckoMVNzt7lY+udGW6PlaSYj OjpQGgY9oglowVM9bteNtwJKHUbvtw==
TEST-75-RESOLVED.sh[749]: + grep -qF '; fully validated' /tmp/tmp.2KOIiyrgth
[FAILED] Failed to start TEST-75-RESOLVED.service - TEST-75-RESOLVED.
Let's explicitly wait for the DS records propagation to finish before we
start editing the test zone to avoid this.
I'm still not completely sure if this is the root cause, but it's the
best shot I currently have, so I'll let the CIs decide.
Let's reduce our attack surface by insisting that XBOOTLDR is vfat when
auto-probing, just like we do for the ESP. Given neither can
realistically be integrity protected (because firmware needs to access
them) let's insist on a vfat which has a much smaller attack surface,
and one we have to accept (for now) anyway, given that the ESP must be
VFAT.
This only applies to auto-probing of course. If people mount things
explicitly via fstab none of this matters. But we really shouldn't
automount a btrfs/xfs/ext4 partition as XBOOTLDR just because it looks
like one, as that would really defeat our otherwise possibly very strict
image policies.
This also introduces a new $SYSTEMD_DISSECT_FSTYPE_<DESIGNATOR>
environment variable that may override this hardcoding. This is
particularly useful in our test cases, since several of them actually do
use ext4 for XBOOTLDR. The tests are updated to make use of the new env
var, both as a mechanism to test this and to keep the tests working.
Previously, we'd take the image policy only into consideration when
dissecting the image, but for the unlock/verity step we'd go via best
effort. Change that. This means we can now enforce policies such as
activating by root hash only even if a signature exists and similar.
Also, introduce a separate error code if we try to unlock a Verity
volume but have no root hash. Previously we'd return ENOKEY for that,
exactly like we do for encrypted volumes where we have no passphrase. The
interactive unlock loop dissected_image_decrypt_interactively() is
otherwise very confused and will ask for a root hash, which makes no
sense. Hence use two distinct errors for this.