assert_signal_internal() returns 0 in two distinct cases:
1. In the child process (immediately after fork returns 0).
2. In the parent process, if the child exited normally (no signal).
ASSERT_SIGNAL fails to distinguish these cases. When a child exited
normally (case 2), the parent process receives 0, incorrectly interprets
it as meaning it is the child, and re-executes the test expression
inside the parent process. Goodness gracious!
This causes two severe test integrity issues:
1. False positives. The parent can run the expression, succeed, and call
_exit(EXIT_SUCCESS), causing the test to pass even though no signal
was raised.
2. Silent truncation. The _exit() call in the parent terminates the test
runner prematurely, preventing subsequent tests in the same file from
running.
Example of the bug in action, from #39674:
ASSERT_SIGNAL(fd_is_writable(closed_fd), SIGABRT)
This test should fail (fd_is_writable does not SIGABRT here), but with
the bug, the parent hallucinated being the child, re-ran the expression
successfully, and exited with success.
Fix this by refactoring assert_signal_internal() to be much more strict
about separating control flow from data.
The signal status is now returned via a strictly typed output parameter,
guaranteeing that determining whether we are the child is never
conflated with whether the child exited cleanly.
ASAN installs signal handlers to catch crashes like SIGSEGV or SIGILL.
When these signals are raised, ASAN traps them, prints an error report,
and then typically terminates the process with a different signal (often
SIGABRT) or a non-zero exit code.
This interferes with ASSERT_SIGNAL when checking for specific crash
signals (for example, checking that a function raises SIGSEGV). In such
a case, the test harness sees the ASAN termination signal rather than
the expected signal, causing the test to fail.
Fix this by resetting the signal handler to SIG_DFL in the child process
immediately before executing the test expression. This ensures the
kernel kills the process directly with the expected signal, bypassing
ASAN's interceptors.
When a child process exits normally (si_code == CLD_EXITED),
siginfo.si_status contains the exit code. When it is killed by a signal
(si_code == CLD_KILLED or CLD_DUMPED), si_status contains the signal
number. However, assert_signal_internal() returns si_status blindly.
This causes exit codes to be misinterpreted as signal numbers.
This allows failing tests to silently pass if their exit code
numerically coincides with the expected signal. For example, a test
expecting SIGABRT (6) would incorrectly pass if the child simply exited
with status 6 instead of being killed by a signal.
Fix this by checking si_code. Only return si_status as a signal number
if the child was actually killed by a signal (CLD_KILLED or CLD_DUMPED).
If the child exited normally (CLD_EXITED), return 0 to indicate that no
signal occurred.
The ASSERT_SIGNAL macro uses a fixed variable name, `_r`. This prevents
nesting the macro (like ASSERT_SIGNAL(ASSERT_SIGNAL(...))), as the inner
instance would shadow the outer instance's variable.
Switch to using the UNIQ_T helper to generate unique variable names at
each expansion level. This allows the macro to be used recursively,
which is required for upcoming regression tests regarding signal
handling logic.
Symbols exported by libshared can't get pruned by the linker, so
every unused exported symbol is effectively dead code we ship to users
for no good reason. Let's add a script to analyze how many such symbols
we have.
We also add a meson test to run the script on all of our binaries.
Since it detects unused symbols and still has a few false positives,
don't enable the test by default similar to the clang-tidy tests.
The script was 100% vibe coded by Github Copilot with Claude Sonnet 4.5
as the model.
Current results are (without the unused symbols list):
Analysis of libsystemd-shared-259.so
======================================================================
Total exported symbols: 4830
(excluding public API symbols starting with 'sd_')
Used symbols: 4672
Unused symbols: 158
Usage rate: 96.7%
`systemd.network(5)` recommends “that each filename is prefixed with a number
smaller than "70" (e.g. 10-eth0.network)”.
Reduce that used by the example accordingly, but stay above the number (`50`)
used in the earlier example for static configuration, so that would take
precedence over the dynamic one if both match for the same network.
This updates example output in systemd-analyze's man page after the
tool's output was changed in a previous commit.
Additionally bash completion is added for `systemd-analyze filesystems`
and improved for `systemd-analyze calendar`.
Before:
$ build/systemd-creds --uid=asdf
Failed to resolve user 'asdf': No such process
Now:
$ build/systemd-creds --uid=asdf
Failed to resolve user 'asdf': Unknown user
$ sudo build/systemd-run --uid=asdf whoami
$ journalctl -e
(whoami)[1007784]: run-p1007782-i5200512.service: Failed to determine user credentials: No such process
(whoami)[1007784]: run-p1007782-i5200512.service: Failed at step USER spawning /usr/sbin/whoami: No such process
systemd[1]: run-p1007782-i5200512.service: Main process exited, code=exited, status=217/USER
systemd[1]: run-p1007782-i5200512.service: Failed with result 'exit-code'.
Now:
(whoami)[1013204]: run-p1013202-i5205932.service: Failed to determine credentials for user 'asdf': Unknown user
(whoami)[1013204]: run-p1013202-i5205932.service: Failed at step USER spawning /usr/sbin/whoami: Invalid argument
systemd[1]: run-p1013202-i5205932.service: Main process exited, code=exited, status=217/USER
systemd[1]: run-p1013202-i5205932.service: Failed with result 'exit-code'.
Before:
$ sudo build/systemd-run --scope --uid=asdf whoami
Failed to resolve user asdf: No such process
Now:
$ sudo build/systemd-run --scope --uid=asdf whoami
Failed to resolve user 'asdf': Unknown user
From a boot with a dracut initrd:
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:2: Failed to resolve user 'tss': No such process
systemd-tmpfiles[242]: Failed to parse ACL "default:group:tss:rwx", ignoring: Invalid argument
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:4: Failed to resolve user 'tss': No such process
systemd-tmpfiles[242]: Failed to parse ACL "default:group:tss:rwx", ignoring: Invalid argument
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:6: Failed to resolve group 'tss': No such process
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:7: Failed to resolve group 'tss': No such process
We cannot just use %m, because strerror returns a confusing error message
for ESRCH or ENOEXEC. udev code was doing a good job, but the error handling
was very verbose. Let's encapsulate the customized error messages in a
helper.
No functional change, except that the error messages have a slightly different
form now. The old messages were a bit better, but we don't have as much
flexibility in the new scheme. "Failed to resolve user 'foo': Unknown user"
should be good enough.
systemd-networkd cannot create the directory /run/systemd/resolve.hook/. Even
if the directory exists, it is not owned by systemd-network user/group, so
systemd-networkd cannot create socket file in the directory. Hence, if the
systemd-networkd-resolve-hook.socket unit is disabled, networkd fails to open
the varlink socket, and fail to start:
systemd-networkd[1304645]: Failed to bind to systemd-resolved hook Varlink socket: Permission denied
systemd-networkd[1304645]: Could not set up manager: Permission denied
systemd[1]: systemd-networkd.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
systemd[1]: Failed to start systemd-networkd.service - Network Management.
If the socket unit is disabled, that should mean the system administrator wants
to disable the feature. Let's not try to setup the varlink socket in that case.
Now the resolve hook feature can be toggled by enabling/disabling the socket
unit, let's drop the $SYSTEMD_NETWORK_RESOLVE_HOOK environment variable.
Follow-up for a7fa29b1b5.
Co-authored-by: Yu Watanabe <watanabe.yu+github@gmail.com>
The output of `systemd-analyze exit-status` changed in commit
e04ed6db6b, so that the exit-status class
for EXIT_SUCCESS and EXIT_FAILURE is "libc" instead of "glibc".
This commit makes the example output in the man-page match the actual
output again.
The semantics of reload is that the service updates its extrinsic state
and continues execution. If it actually deactivated we shouldn't
spuriously notify the caller that reload succeeded.
Installed jobs are always collapsed, i.e. can only be of types
accepted by job_run_and_invalidate() modulo JOB_NOP which is
stored in Unit.nop_job (if any). Let's trim the unreachable
branches.
This reverts commit 3ae7d8fd87.
Now utmp support is always disabled when building with musl,
and all definitions are unused in that case. Let's remove it.
This reverts commit bf9bc5beb0.
libutmps does not support utmpxname(), the function always fails
with ENOSYS, and always uses their own file.
However, our code relies on the funtion needs to succeed.
Let's revert the change now, and revisit later when musl users
request to support libutmps.
See https://gitlab.gnome.org/GNOME/glib/-/issues/2931 for the changes in
GLib upstream. Using `GMemoryMonitor` is now more compliant with the
systemd recommended approach, but it needs further work to read the
recommended environment variables rather than unconditionally accessing
the per-cgroup PSI kernel file directly.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>