For per-connection socket instances we currently maintain three fields
related to the socket: a reference to the Socket unit, the connection fd,
and a reference to the SocketPeer object that counts socket peers.
Let's synchronize their lifetime, i.e. always set all three
together or unset them together, so that their reference counters stay
synchronous.
This will in particular ensure that we'll drop the SocketPeer reference
whenever we leave an active state of the service unit, i.e. at the same
time we close the fd for it.
Fixes: #20685
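A minimal sketch of the "set all three together, unset all three together"
rule described above. The types and function names (Listener, Peer,
Connection, connection_set(), connection_unset()) are hypothetical stand-ins
for illustration, not the actual service code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-ins for the real objects; only the counters matter here. */
typedef struct Listener { unsigned n_ref; } Listener;
typedef struct Peer { unsigned n_ref; } Peer;

typedef struct Connection {
        Listener *listener;  /* reference to the accepting socket unit */
        Peer *peer;          /* reference to the peer accounting object */
        int fd;              /* connection fd, -1 when unset */
} Connection;

/* Take all three together. Ownership of fd passes to the connection. */
static void connection_set(Connection *c, Listener *l, Peer *p, int fd) {
        assert(c && l && p && fd >= 0);
        assert(!c->listener && !c->peer && c->fd < 0);

        l->n_ref++;
        p->n_ref++;
        c->listener = l;
        c->peer = p;
        c->fd = fd;
}

/* Drop all three together. Safe to call on an already unset connection. */
static void connection_unset(Connection *c) {
        assert(c);

        if (c->listener) {
                c->listener->n_ref--;
                c->listener = NULL;
        }
        if (c->peer) {
                c->peer->n_ref--;
                c->peer = NULL;
        }
        if (c->fd >= 0) {
                close(c->fd);
                c->fd = -1;
        }
}

static bool connection_is_set(const Connection *c) {
        return c->listener && c->peer && c->fd >= 0;
}
```

Calling connection_unset() from every path that leaves an active state then
guarantees the peer reference goes away at the same moment the fd is closed.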
Previously the mkdir_label() family of calls was implemented in
src/shared/mkdir-label.c but its functions were partly declared in
src/shared/label.h and partly in src/basic/mkdir.h (!!). That's weird
(and wrong).
Let's clean this up, and add a proper mkdir-label.h matching the .c
file.
Up until now the main reason why we didn't proceed with starting the
unit was that the start limit burst was exceeded. However, for unit
types like mounts another reason could be an active rate limit on the
/proc/self/mountinfo event source. That means our mount unit state may
not reflect the current kernel state. Hence, we need to re-run the
start job after the rate limit on the event source expires.
As we are introducing a reason other than the start limit, let's rename
the virtual function that implements the check.
Alternative to https://github.com/systemd/systemd/pull/20531.
Whenever a service triggered by another unit fails condition checks,
stop the triggering unit to prevent systemd busy looping trying to
start the triggered unit.
Fixes #17433. Currently, if any of the validations we do before we
check start rate limiting fail, we can still enter a busy loop as
no rate limiting gets applied. A common occurrence of this scenario
is a path unit triggering a service that fails a condition check.
To fix the issue, we simply move up start rate limiting checks to
be the first thing we do when starting a unit. To achieve this,
we add a new method to the unit vtable and implement it for the
relevant unit types so that we can do the start rate limit checks
earlier on.
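The ordering change can be sketched roughly as follows. Everything here is
an illustrative assumption rather than the real code: the RateLimit layout,
the hook name test_start_limit, and the use of -ECANCELED and -ECOMM for
"limit hit" and "condition failed".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct RateLimit {
        uint64_t interval_usec;  /* length of the window */
        unsigned burst;          /* starts permitted per window */
        uint64_t begin_usec;     /* start of the current window */
        unsigned num;            /* starts counted in the current window */
} RateLimit;

/* Returns true if another event still fits into the current window. */
static bool ratelimit_below(RateLimit *rl, uint64_t now_usec) {
        if (rl->num == 0 || now_usec - rl->begin_usec >= rl->interval_usec) {
                rl->begin_usec = now_usec;
                rl->num = 0;
        }
        if (rl->num >= rl->burst)
                return false;
        rl->num++;
        return true;
}

typedef struct Unit Unit;

typedef struct UnitVTable {
        /* Hypothetical hook name; optional per unit type. */
        int (*test_start_limit)(Unit *u, uint64_t now_usec);
} UnitVTable;

struct Unit {
        const UnitVTable *vtable;
        RateLimit start_limit;
        bool condition_result;
};

static int service_test_start_limit(Unit *u, uint64_t now_usec) {
        return ratelimit_below(&u->start_limit, now_usec) ? 0 : -ECANCELED;
}

static const UnitVTable service_vtable = {
        .test_start_limit = service_test_start_limit,
};

static int unit_start(Unit *u, uint64_t now_usec) {
        int r;

        assert(u && u->vtable);

        /* Rate limiting comes first, so failed condition checks count too. */
        if (u->vtable->test_start_limit) {
                r = u->vtable->test_start_limit(u, now_usec);
                if (r < 0)
                        return r;
        }

        if (!u->condition_result)
                return -ECOMM;  /* "condition failed" in this sketch */

        return 0;
}
```

With a burst of 2, a unit whose condition keeps failing is refused on the
third attempt within the window instead of being retried forever.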
In general we almost never hit those asserts in production code, so users see
them very rarely, if ever. But either way, we just need something that users
can pass to the developers.
We have quite a few of those asserts, and some have fairly nice messages, but
many are like "WTF?" or "???" or "unexpected something". The error that is
printed includes the file location and function name. In almost all functions
there's at most one assert, so the function name alone is enough to identify
the failure for a developer. So we don't get much extra from the message, and
we might just as well drop them.
Dropping them makes our code a tiny bit smaller, and most importantly, improves
development experience by making it easy to insert such an assert in the code
without thinking how to phrase the argument.
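As an illustration of why the message argument adds little, here is a sketch
of an argument-less macro whose output already pins down the call site. The
names sketch_assert_not_reached() and format_not_reached() and the exact
message text are assumptions made for this example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats the diagnostic; split out so it can be exercised without aborting. */
static int format_not_reached(char *buf, size_t size,
                              const char *file, int line, const char *func) {
        return snprintf(buf, size,
                        "Code should not be reached at %s:%d, function %s(). Aborting.",
                        file, line, func);
}

/* Hypothetical argument-less variant: location and function name are enough. */
#define sketch_assert_not_reached()                                        \
        do {                                                               \
                char _buf[256];                                            \
                format_not_reached(_buf, sizeof _buf,                      \
                                   __FILE__, __LINE__, __func__);          \
                fprintf(stderr, "%s\n", _buf);                             \
                abort();                                                   \
        } while (0)

static const char *direction_to_string(int d) {
        switch (d) {
        case 0:
                return "in";
        case 1:
                return "out";
        default:
                sketch_assert_not_reached();
        }
}
```

Inserting such an assert in a new switch statement needs no thought about
wording, which is the development convenience argued for above.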
The code to print unit status formats had a long history, and became a
hard-to-manage mess of duplicate code parts. We would use sprintf() to
format a string, and then call sprintf() again… The code is reworked
to avoid repeated formattings and to streamline printing to the log
and the console.
The approach used in this patch is a bit more complex than in patches by Colin
Walters and Paweł Marciniak, because an allocation is only done if "combined"
format is used. In other cases we return the existing ->id or ->description
strings. The caller can also control whether a shorter or longer status string
should be used. This way the caller can use a shorter format where it makes
sense, for example in the cylon eye output, where we don't have enough
horizontal space.
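A rough sketch of the "allocate only for the combined format" idea. The enum,
the struct layout and the function name unit_status_string() are hypothetical
and chosen for this example only; the " - " separator follows the output
shown in this message:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum StatusFormat {
        STATUS_FORMAT_NAME,
        STATUS_FORMAT_DESCRIPTION,
        STATUS_FORMAT_COMBINED,
} StatusFormat;

typedef struct Unit {
        const char *id;
        const char *description;  /* may be NULL */
} Unit;

/* Returns the string to print, or NULL on allocation failure. *ret_buf is
 * set to an allocated buffer only when one was needed (the caller frees it),
 * and to NULL otherwise. */
static const char *unit_status_string(const Unit *u, StatusFormat f, char **ret_buf) {
        size_t n;
        char *s;

        assert(u && u->id && ret_buf);
        *ret_buf = NULL;

        /* No description, or nothing to add to the name: reuse the id. */
        if (f == STATUS_FORMAT_NAME || !u->description ||
            strcmp(u->id, u->description) == 0)
                return u->id;

        if (f == STATUS_FORMAT_DESCRIPTION)
                return u->description;

        n = strlen(u->id) + strlen(" - ") + strlen(u->description) + 1;
        s = malloc(n);
        if (!s)
                return NULL;

        snprintf(s, n, "%s - %s", u->id, u->description);
        *ret_buf = s;
        return s;
}
```

A caller with little horizontal space would pass the shorter format and
never pay for an allocation.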
Patch is based on Colin Walters' https://github.com/systemd/systemd/pull/15957,
and Paweł Marciniak's patch posted on fedora-devel.
Note: for some reason, the functions for printing of start and stop messages
were separated by some unrelated functions. They are moved to be consecutive,
but this makes the diff much more verbose than it would be otherwise. I found it
useful to view in gitk's "new" mode.
Co-authored-by: Colin Walters <walters@verbum.org>
Co-authored-by: Paweł Marciniak <sunwire+git@gmail.com>
Output from a Fedora Rawhide container boot (w/ some follow-up patches to
tweak Descriptions):
Welcome to Fedora 35 (Rawhide Prerelease)!
Queued start job for default target graphical.target.
[ OK ] Created slice system-getty.slice - Slice /system/getty.
[ OK ] Created slice system-modprobe.slice - Slice /system/modprobe.
[ OK ] Created slice system-sshd\x2dkeygen.slice - Slice /system/sshd-keygen.
[ OK ] Created slice user.slice - User and Session Slice.
[ OK ] Started systemd-ask-password-console.path - Dispatch Password Requests to Console Directory Watch.
[ OK ] Started systemd-ask-password-wall.path - Forward Password Requests to Wall Directory Watch.
[ OK ] Reached target cryptsetup.target - Local Encrypted Volumes.
[ OK ] Reached target paths.target - Path Units.
[ OK ] Reached target remote-cryptsetup.target - Remote Encrypted Volumes.
[ OK ] Reached target remote-fs.target - Remote File Systems.
[ OK ] Reached target slices.target - Slice Units.
[ OK ] Reached target swap.target - Swaps.
[ OK ] Reached target veritysetup.target - Local Verity Integrity Protected Volumes.
[ OK ] Listening on systemd-coredump.socket - Process Core Dump Socket.
[ OK ] Listening on systemd-initctl.socket - initctl Compatibility Named Pipe.
[ OK ] Listening on systemd-journald-dev-log.socket - Journal Socket (/dev/log).
[ OK ] Listening on systemd-journald.socket - Journal Socket.
[ OK ] Listening on systemd-networkd.socket - Network Service Netlink Socket.
[ OK ] Listening on systemd-userdbd.socket - User Database Manager Socket.
Mounting dev-hugepages.mount - Huge Pages File System...
Starting systemd-journald.service - Journal Service...
Starting systemd-remount-fs.service - Remount Root and Kernel File Systems...
Starting systemd-sysctl.service - Apply Kernel Variables...
[ OK ] Mounted dev-hugepages.mount - Huge Pages File System.
[ OK ] Finished systemd-remount-fs.service - Remount Root and Kernel File Systems.
Starting systemd-hwdb-update.service - Rebuild Hardware Database...
Starting systemd-sysusers.service - Create System Users...
[ OK ] Finished systemd-sysctl.service - Apply Kernel Variables.
[ OK ] Started systemd-journald.service - Journal Service.
Starting systemd-journal-flush.service - Flush Journal to Persistent Storage...
[ OK ] Finished systemd-sysusers.service - Create System Users.
Starting systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev...
[ OK ] Finished systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev.
[ OK ] Reached target local-fs-pre.target - Preparation for Local File Systems.
[ OK ] Reached target local-fs.target - Local File Systems.
[ OK ] Reached target machines.target - Containers.
Starting dracut-shutdown.service - Restore /run/initramfs on shutdown...
Starting ldconfig.service - Rebuild Dynamic Linker Cache...
[ OK ] Finished dracut-shutdown.service - Restore /run/initramfs on shutdown.
[ OK ] Finished ldconfig.service - Rebuild Dynamic Linker Cache.
[ OK ] Finished systemd-journal-flush.service - Flush Journal to Persistent Storage.
Starting systemd-tmpfiles-setup.service - Create Volatile Files and Directories...
[ OK ] Finished systemd-tmpfiles-setup.service - Create Volatile Files and Directories.
Starting systemd-journal-catalog-update.service - Rebuild Journal Catalog...
Starting systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer...
Starting systemd-update-utmp.service - Update UTMP about System Boot/Shutdown...
Starting systemd-userdbd.service - User Database Manager...
[ OK ] Finished systemd-update-utmp.service - Update UTMP about System Boot/Shutdown.
[ OK ] Finished systemd-journal-catalog-update.service - Rebuild Journal Catalog.
[ OK ] Started systemd-userdbd.service - User Database Manager.
[ OK ] Started systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer.
[ OK ] Finished systemd-hwdb-update.service - Rebuild Hardware Database.
Starting systemd-networkd.service - Network Configuration...
Starting systemd-update-done.service - Update is Completed...
[ OK ] Finished systemd-update-done.service - Update is Completed.
[ OK ] Reached target sysinit.target - System Initialization.
[ OK ] Started dnf-makecache.timer - dnf makecache --timer.
[ OK ] Started logrotate.timer - Daily rotation of log files.
[ OK ] Started systemd-tmpfiles-clean.timer - Daily Cleanup of Temporary Directories.
[ OK ] Reached target timers.target - Timer Units.
[ OK ] Listening on dbus.socket - D-Bus System Message Bus Socket.
[ OK ] Reached target sockets.target - Socket Units.
[ OK ] Reached target basic.target - Basic System.
[ OK ] Reached target sshd-keygen.target.
Starting sysstat.service - Resets System Activity Logs...
Starting systemd-homed.service - Home Area Manager...
Starting systemd-logind.service - User Login Management...
Starting dbus-broker.service - D-Bus System Message Bus...
[FAILED] Failed to start sysstat.service - Resets System Activity Logs.
See 'systemctl status sysstat.service' for details.
[ OK ] Started dbus-broker.service - D-Bus System Message Bus.
[ OK ] Started systemd-homed.service - Home Area Manager.
[ OK ] Finished systemd-homed-activate.service - Home Area Activation.
[ OK ] Started systemd-logind.service - User Login Management.
[ OK ] Started systemd-networkd.service - Network Configuration.
Starting systemd-networkd-wait-online.service - Wait for Network to be Configured...
Starting systemd-resolved.service - Network Name Resolution...
[ OK ] Started systemd-resolved.service - Network Name Resolution.
[ OK ] Reached target network.target - Network.
[ OK ] Reached target nss-lookup.target - Host and Network Name Lookups.
Starting sshd.service - OpenSSH server daemon...
Starting systemd-user-sessions.service - Permit User Sessions...
[ OK ] Finished systemd-user-sessions.service - Permit User Sessions.
[ OK ] Started console-getty.service - Console Getty.
[ OK ] Reached target getty.target - Login Prompts.
[ OK ] Started sshd.service - OpenSSH server daemon.
[ OK ] Reached target multi-user.target - Multi-User System.
[ OK ] Reached target graphical.target - Graphical Interface.
Starting systemd-update-utmp-runlevel.service - Update UTMP about System Runlevel Changes...
[ OK ] Finished systemd-update-utmp-runlevel.service - Update UTMP about System Runlevel Changes.
Fedora 35 (Rawhide Prerelease)
Kernel 5.12.12-300.fc34.x86_64 on an x86_64 (console)
rawhide login: [ OK ] Stopped session-24.scope - Session 24 of User zbyszek.
[ OK ] Removed slice system-getty.slice - Slice /system/getty.
[ OK ] Removed slice system-modprobe.slice - Slice /system/modprobe.
[ OK ] Removed slice system-sshd\x2dkeygen.slice - Slice /system/sshd-keygen.
[ OK ] Stopped target graphical.target - Graphical Interface.
[ OK ] Stopped target multi-user.target - Multi-User System.
[ OK ] Stopped target getty.target - Login Prompts.
[ OK ] Stopped target machines.target - Containers.
[ OK ] Stopped target nss-lookup.target - Host and Network Name Lookups.
[ OK ] Stopped target remote-cryptsetup.target - Remote Encrypted Volumes.
[ OK ] Stopped target timers.target - Timer Units.
[ OK ] Stopped dnf-makecache.timer - dnf makecache --timer.
[ OK ] Stopped logrotate.timer - Daily rotation of log files.
[ OK ] Stopped systemd-tmpfiles-clean.timer - Daily Cleanup of Temporary Directories.
[ OK ] Closed systemd-coredump.socket - Process Core Dump Socket.
Stopping console-getty.service - Console Getty...
Stopping dracut-shutdown.service - Restore /run/initramfs on shutdown...
Stopping sshd.service - OpenSSH server daemon...
Stopping systemd-logind.service - User Login Management...
Stopping systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer...
Stopping user@1000.service - User Manager for UID 1000...
[ OK ] Stopped systemd-oomd.service - Userspace Out-Of-Memory (OOM) Killer.
[ OK ] Stopped systemd-networkd-wait-online.service - Wait for Network to be Configured.
[ OK ] Stopped sshd.service - OpenSSH server daemon.
[ OK ] Stopped console-getty.service - Console Getty.
[ OK ] Stopped dracut-shutdown.service - Restore /run/initramfs on shutdown.
[ OK ] Stopped target sshd-keygen.target.
[ OK ] Stopped systemd-logind.service - User Login Management.
[ OK ] Stopped user@1000.service - User Manager for UID 1000.
Stopping user-runtime-dir@1000.service - User Runtime Directory /run/user/1000...
[ OK ] Unmounted run-user-1000.mount - /run/user/1000.
[ OK ] Stopped user-runtime-dir@1000.service - User Runtime Directory /run/user/1000.
[ OK ] Removed slice user-1000.slice - User Slice of UID 1000.
Stopping systemd-user-sessions.service - Permit User Sessions...
[ OK ] Stopped systemd-user-sessions.service - Permit User Sessions.
[ OK ] Stopped target network.target - Network.
[ OK ] Stopped target remote-fs.target - Remote File Systems.
Stopping systemd-homed-activate.service - Home Area Activation...
Stopping systemd-resolved.service - Network Name Resolution...
[ OK ] Stopped systemd-resolved.service - Network Name Resolution.
Stopping systemd-networkd.service - Network Configuration...
[ OK ] Stopped systemd-homed-activate.service - Home Area Activation.
Stopping systemd-homed.service - Home Area Manager...
[ OK ] Stopped systemd-homed.service - Home Area Manager.
[ OK ] Stopped target basic.target - Basic System.
[ OK ] Stopped target paths.target - Path Units.
[ OK ] Stopped target slices.target - Slice Units.
[ OK ] Removed slice user.slice - User and Session Slice.
[ OK ] Stopped target sockets.target - Socket Units.
Stopping dbus-broker.service - D-Bus System Message Bus...
[ OK ] Stopped dbus-broker.service - D-Bus System Message Bus.
[ OK ] Closed dbus.socket - D-Bus System Message Bus Socket.
[ OK ] Stopped target sysinit.target - System Initialization.
[ OK ] Stopped target cryptsetup.target - Local Encrypted Volumes.
[ OK ] Stopped systemd-ask-password-console.path - Dispatch Password Requests to Console Directory Watch.
[ OK ] Stopped systemd-ask-password-wall.path - Forward Password Requests to Wall Directory Watch.
[ OK ] Stopped target veritysetup.target - Local Verity Integrity Protected Volumes.
[ OK ] Stopped systemd-update-done.service - Update is Completed.
[ OK ] Stopped ldconfig.service - Rebuild Dynamic Linker Cache.
[ OK ] Stopped systemd-hwdb-update.service - Rebuild Hardware Database.
[ OK ] Stopped systemd-journal-catalog-update.service - Rebuild Journal Catalog.
Stopping systemd-update-utmp.service - Update UTMP about System Boot/Shutdown...
[ OK ] Stopped systemd-networkd.service - Network Configuration.
[ OK ] Closed systemd-networkd.socket - Network Service Netlink Socket.
[ OK ] Stopped systemd-sysctl.service - Apply Kernel Variables.
[ OK ] Stopped systemd-update-utmp.service - Update UTMP about System Boot/Shutdown.
[ OK ] Stopped systemd-tmpfiles-setup.service - Create Volatile Files and Directories.
[ OK ] Stopped target local-fs.target - Local File Systems.
Unmounting home.mount - /home...
Unmounting run-credentials-systemd\x2dsysusers.se…e.mount - /run/credentials/systemd-sysusers.service...
Unmounting tmp.mount - Temporary Directory /tmp...
[ OK ] Unmounted home.mount - /home.
[ OK ] Unmounted tmp.mount - Temporary Directory /tmp.
[ OK ] Unmounted run-credentials-systemd\x2dsysusers.service.mount - /run/credentials/systemd-sysusers.service.
[ OK ] Stopped target local-fs-pre.target - Preparation for Local File Systems.
[ OK ] Stopped target swap.target - Swaps.
[ OK ] Reached target umount.target - Unmount All Filesystems.
[ OK ] Stopped systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev.
[ OK ] Stopped systemd-sysusers.service - Create System Users.
[ OK ] Stopped systemd-remount-fs.service - Remount Root and Kernel File Systems.
[ OK ] Reached target shutdown.target - System Shutdown.
[ OK ] Reached target final.target - Late Boot Services.
[ OK ] Finished systemd-poweroff.service - System Power Off.
[ OK ] Reached target poweroff.target - System Power Off.
Sending SIGTERM to remaining processes...
Sending SIGKILL to remaining processes...
All filesystems, swaps, loop devices, MD devices and DM devices detached.
Powering off.
This mirrors the change done for systemd-resolved in
9793530228. Quoting that patch:
> We generally operate on the assumption that a source is "gone" as soon as we
> unref it. This is generally true because we have the only reference. But if
> something else holds the reference, our unref doesn't really stop the source
> and it could fire again.
In particular, we take temporary references from sd-event code, and when called
from an sd-event callback, we could temporarily see this elevated reference
count. This patch doesn't seem to change anything, but I think it's nicer to do
the same change as in other places and not rely on _unref() immediately
disabling the source.
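The difference between a plain unref and "disable, then unref" can be shown
with a toy reference-counted source. The Source type and the source_*()
helpers below are stand-ins written for this example and do not use the
sd-event API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted event source. */
typedef struct Source {
        unsigned n_ref;
        bool enabled;
} Source;

static Source *source_new(void) {
        Source *s = calloc(1, sizeof *s);
        if (!s)
                return NULL;
        s->n_ref = 1;
        s->enabled = true;
        return s;
}

static Source *source_ref(Source *s) {
        assert(s);
        s->n_ref++;
        return s;
}

/* Drops one reference and frees on the last one. Always returns NULL. */
static Source *source_unref(Source *s) {
        if (!s)
                return NULL;
        assert(s->n_ref > 0);
        if (--s->n_ref == 0)
                free(s);
        return NULL;
}

/* Disable first, then drop our reference: the source cannot fire again
 * even if somebody else still holds a reference to it. */
static Source *source_disable_unref(Source *s) {
        if (s)
                s->enabled = false;
        return source_unref(s);
}
```

With a second reference outstanding, source_unref() leaves the source
enabled, while source_disable_unref() does not.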
If the cleanup function returns the appropriate type, use that to reset the
variable. For other functions (usually the foreign ones which return void), add
an explicit value to reset to.
This causes a bit of code churn, but I think it might be worth it. In a
following patch static destructors will be called from a fuzzer, and this
change allows them to be called multiple times. But I think such a change might
help with detecting uninitialized code reuse too. We hit various bugs like this,
and things are more obvious when a pointer has been set to NULL.
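A sketch of the two flavours described above. The macro names
SKETCH_DEFINE_CLEANUP_FUNC and SKETCH_DEFINE_CLEANUP_FUNC_VOID, the Context
type and the n_freed counter are made up for this example and are not the
definitions used in the tree:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Context { int dummy; } Context;

static unsigned n_freed = 0;

/* A freeing function in the style described above: it returns NULL. */
static Context *context_free(Context *c) {
        if (c) {
                free(c);
                n_freed++;
        }
        return NULL;
}

/* func returns the appropriate type: use its return value to reset. */
#define SKETCH_DEFINE_CLEANUP_FUNC(type, func)                  \
        static inline void func##p(type *p) {                   \
                if (*p)                                         \
                        *p = func(*p);                          \
        }

/* func returns void: the caller names the value to reset to. */
#define SKETCH_DEFINE_CLEANUP_FUNC_VOID(type, func, empty)      \
        static inline void func##p(type *p) {                   \
                if (*p != (empty)) {                            \
                        func(*p);                               \
                        *p = (empty);                           \
                }                                               \
        }

SKETCH_DEFINE_CLEANUP_FUNC(Context *, context_free)
SKETCH_DEFINE_CLEANUP_FUNC_VOID(void *, free, NULL)
```

Because the variable is reset, calling the helper a second time is a no-op
rather than a double free. With GCC or Clang the helper can be attached to a
local via __attribute__((cleanup(context_freep))).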
I was worried whether this change increases text size, but it doesn't seem to:
-Dbuildtype=debug:
before "tree-wide: return NULL from freeing functions":
-rwxrwxr-x 1 zbyszek zbyszek 4117672 Feb 16 14:36 build/libsystemd.so.0.30.0*
-rwxrwxr-x 1 zbyszek zbyszek 4494520 Feb 16 15:06 build/systemd*
after "tree-wide: return NULL from freeing functions":
-rwxrwxr-x 1 zbyszek zbyszek 4117672 Feb 16 14:36 build/libsystemd.so.0.30.0*
-rwxrwxr-x 1 zbyszek zbyszek 4494576 Feb 16 15:10 build/systemd*
now:
-rwxrwxr-x 1 zbyszek zbyszek 4117672 Feb 16 14:36 build/libsystemd.so.0.30.0*
-rwxrwxr-x 1 zbyszek zbyszek 4494640 Feb 16 15:15 build/systemd*
-Dbuildtype=release:
before "tree-wide: return NULL from freeing functions":
-rwxrwxr-x 1 zbyszek zbyszek 5252256 Feb 14 14:47 build-rawhide/libsystemd.so.0.30.0*
-rwxrwxr-x 1 zbyszek zbyszek 1834184 Feb 16 15:09 build-rawhide/systemd*
after "tree-wide: return NULL from freeing functions":
-rwxrwxr-x 1 zbyszek zbyszek 5252256 Feb 14 14:47 build-rawhide/libsystemd.so.0.30.0*
-rwxrwxr-x 1 zbyszek zbyszek 1834184 Feb 16 15:10 build-rawhide/systemd*
now:
-rwxrwxr-x 1 zbyszek zbyszek 5252256 Feb 14 14:47 build-rawhide/libsystemd.so.0.30.0*
-rwxrwxr-x 1 zbyszek zbyszek 1834184 Feb 16 15:16 build-rawhide/systemd*
I would expect that the compiler would be able to elide the setting of a
variable if the variable is never used again. And this seems to be the case:
in optimized builds there is no change in size whatsoever. And the change in
size in unoptimized build is negligible.
Something strange is happening with size of libsystemd: it's bigger in
optimized builds. Something to figure out, but unrelated to this patch.
I started working on this because I wanted to change how
DEFINE_TRIVIAL_CLEANUP_FUNC is defined. Even independently of that change, it's
nice to make things more consistent and predictable.
This adds a way to control SO_TIMESTAMP/SO_TIMESTAMPNS socket options
for sockets PID 1 binds to.
This is useful in journald so that we get proper timestamps even for
ingress log messages that are submitted before journald is running.
We recently turned on packet info metadata from PID 1 for these sockets,
but the timestamping info was still missing. Let's correct that.
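At the socket level this amounts to something like the sketch below. The
enum and the helper name sketch_socket_set_timestamping() are assumptions
for illustration; only setsockopt() with SO_TIMESTAMP and SO_TIMESTAMPNS is
taken from the description above.

```c
#include <errno.h>
#include <sys/socket.h>

typedef enum SketchTimestamping {
        SKETCH_TIMESTAMPING_OFF,
        SKETCH_TIMESTAMPING_US,  /* SO_TIMESTAMP, delivers a struct timeval */
        SKETCH_TIMESTAMPING_NS,  /* SO_TIMESTAMPNS, delivers a struct timespec */
} SketchTimestamping;

/* Returns 0 on success, negative errno on failure. */
static int sketch_socket_set_timestamping(int fd, SketchTimestamping t) {
        int on = 1, off = 0;

        switch (t) {

        case SKETCH_TIMESTAMPING_OFF:
                /* Clear both explicitly rather than rely on kernel side effects. */
                if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &off, sizeof off) < 0 ||
                    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &off, sizeof off) < 0)
                        return -errno;
                return 0;

        case SKETCH_TIMESTAMPING_US:
                if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof on) < 0)
                        return -errno;
                return 0;

        case SKETCH_TIMESTAMPING_NS:
                if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof on) < 0)
                        return -errno;
                return 0;

        default:
                return -EINVAL;
        }
}
```

Once the option is set on the bound socket, datagrams queued before the
receiver starts reading already carry their receive timestamp as ancillary
data.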
If the whole call is simple and we don't need to look at the return value
apart from the conditional, let's use a form without assignment of the return
value. When the function call is more complicated, it still makes sense to
use a temporary variable.
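The two forms side by side, as a small sketch. All function names here are
invented for the example:

```c
#include <errno.h>
#include <stddef.h>

static int name_verify(const char *name) {
        return name && name[0] != '\0' ? 0 : -EINVAL;
}

static int parse_digit(const char *s, int *ret) {
        if (!s || s[0] < '0' || s[0] > '9' || s[1] != '\0')
                return -EINVAL;
        *ret = s[0] - '0';
        return 0;
}

/* Simple call whose result only feeds the branch: no temporary needed. */
static int setup_name(const char *name) {
        if (name_verify(name) < 0)
                return -EINVAL;

        return 0;
}

/* The result is used beyond the branch (it is propagated to the caller),
 * so a temporary variable still makes sense. */
static int setup_value(const char *s, int *ret) {
        int r;

        r = parse_digit(s, ret);
        if (r < 0)
                return r;

        return 0;
}
```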
In 4c2ef32767 we enabled propagating
triggered unit state to the triggering unit for service units in more
load states, so that we don't accidentally stop tracking state
correctly.
Do the same for our other triggering unit types: automounts, paths, and
timers.
Also, make this an assertion rather than a simple test. After all it
should never happen that we get called for half-loaded units or units of
the wrong type. The load routines should already have made this
impossible.
In containers we might lack the privileges to increase the socket buffers. Let's
not complain so loudly about that. Let's hence downgrade this to debug
logging if it's a permission problem.
(This wasn't an issue before b92f350789
because back then the failures wouldn't be detected at all.)
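The level selection boils down to something like this sketch. The helper
names are hypothetical, and treating exactly EPERM and EACCES as "permission
problems" is an assumption made for the example:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <syslog.h>

/* Accepts both positive and negative errno-style values. */
static bool sketch_errno_is_privilege(int error) {
        error = abs(error);
        return error == EPERM || error == EACCES;
}

/* Quiet for missing privileges, loud for everything else. */
static int sketch_buffer_failure_log_level(int error) {
        return sketch_errno_is_privilege(error) ? LOG_DEBUG : LOG_WARNING;
}
```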
A variety of sockopts exist both for IPv4 and IPv6 but require a
different pair of sockopt level/option number. Let's add helpers for
these that internally determine the right sockopt to call.
This should shorten code that generically wants to support both IPv4 and
IPv6, and for the first time adds correct support for some cases where we
only called the IPv4 versions, and not the IPv6 ones.
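One such helper might look like the sketch below, using the TTL/hop limit as
the example option. The name sketch_socket_set_ttl() is an assumption; the
IP_TTL and IPV6_UNICAST_HOPS pair is the standard one for this setting:

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Sets the unicast TTL (IPv4) or hop limit (IPv6), picking the sockopt
 * level and option from the address family. Returns 0 or negative errno. */
static int sketch_socket_set_ttl(int fd, int af, int ttl) {
        int r;

        switch (af) {

        case AF_INET:
                r = setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof ttl);
                break;

        case AF_INET6:
                r = setsockopt(fd, IPPROTO_IPV6, IPV6_UNICAST_HOPS, &ttl, sizeof ttl);
                break;

        default:
                return -EAFNOSUPPORT;
        }

        return r < 0 ? -errno : 0;
}
```

Callers then pass the address family they already have and no longer need
their own AF_INET/AF_INET6 branch, which is where the IPv6 side tended to be
forgotten.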
socket_instantiate_service() was doing unit_ref_set(), and the caller was
immediately doing unit_ref_unset(). After we get rid of this, it doesn't seem
worth it to have two functions.
This means that the connection was aborted before we even got to figure out
what the service name will be. Let's treat this as a non-event and close the
connection fd without any further messages.
Code last changed in 934ef6a5.
Reported-by: Thiago Macieira <thiago.macieira@intel.com>
With the patch:
systemd[1]: foobar.socket: Incoming traffic
systemd[1]: foobar.socket: Got ENOTCONN on incoming socket, assuming aborted connection attempt, ignoring.
...
Also, when we get ENOMEM, don't give the hint about missing unit.
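The ENOTCONN case can be reproduced outside of PID 1 with a few lines. This
is only a sketch of the classification step; the helper name
sketch_connection_aborted() and the use of getpeername() as the probe are
assumptions for the example:

```c
#include <errno.h>
#include <sys/socket.h>

/* Returns 1 if the peer is already gone (ENOTCONN), 0 if a peer is
 * present, and a negative errno on any other failure. */
static int sketch_connection_aborted(int fd) {
        struct sockaddr_storage ss;
        socklen_t len = sizeof ss;

        if (getpeername(fd, (struct sockaddr *) &ss, &len) >= 0)
                return 0;

        if (errno == ENOTCONN)
                return 1;

        return -errno;
}
```

A caller seeing 1 here would close the fd quietly and move on, reserving
warnings for the negative return values.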
Upon an incoming connection for an accepting socket, we'd create a unit like
foo@0.service, then figure out that the instance name should be e.g. "0-41-0",
and then add the name foo@0-41-0.service to the unit. This obviously violates
the rule that any service needs to have a constant instance part.
So let's reverse the order: we first determine the instance name and then
create the unit with the correct name from the start.
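The reversed order, as a sketch. Both helpers are hypothetical, and the
instance layout (connection counter, peer PID, peer UID joined by dashes) is
an assumption chosen to reproduce the "0-41-0" example above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Step 1: determine the instance string. Returns 0, or -1 if it doesn't fit. */
static int sketch_instance_name_build(char *buf, size_t size,
                                      unsigned nr, unsigned pid, unsigned uid) {
        int n = snprintf(buf, size, "%u-%u-%u", nr, pid, uid);
        return n < 0 || (size_t) n >= size ? -1 : 0;
}

/* Step 2: only now build the final unit name, exactly once. */
static int sketch_service_name_build(char *buf, size_t size,
                                     const char *prefix, const char *instance) {
        int n;

        assert(prefix && instance);
        assert(strlen(instance) > 0);  /* the instance must be known by now */

        n = snprintf(buf, size, "%s@%s.service", prefix, instance);
        return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

The unit is then created under the name from step 2 and never renamed, so
its instance part stays constant for its whole lifetime.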
There are two cases where we don't know the instance name:
- analyze-verify: we just do a quick check that the instance unit can be
created. So let's use a bogus instance string.
- selinux: the code wants to load the service unit to extract the ExecStart path
and query it for the selinux label. Do the same as above.
Note that in both cases it is possible that the real unit that is loaded could
be different than the one with the bogus instance value, for example if there
is a dropin for a specific instance name. We can't do much about this, since we
can't figure out the instance name in advance. The old code had the same
shortcoming.