All messages logged from exec_spawn() are attributed to the unit
and as such we should set the log level to the unit's max log level
for the duration of the function.
Currently, IS_SYNTHETIC_ERRNO() evaluates to true for all negative errnos,
because of the two's-complement negative value representation.
Subsequently, ERRNO= is not logged for most of our own code.
Let's fix this, by formatting all synthetic errnos as positive.
Then, treat all negative values as non-synthetic.
While at it, mark the evaluation order explicitly, and remove
unneeded comment.
Fixes#33800
If we're already running in a unit with delegation turned on, let's
skip allocation of a scope unit and cgroup subroot. This allows journald
to correctly attribute the logs of all subprocesses spawned by tests such
as test-execute to the test-execute service when the test is running in a service.
For CI in mkosi, I want to configure systemd to log at debug level
to the journal, but not to the console. While we already have max
level settings for journald's forwarding settings, not every log line
goes to the journal, specifically during early boot and when units
are connected directly to the console (think systemd-firstboot), so
let's extend the log level options we already have to allow specifying
a comma separated list of values and lets allow prefixing values with
the log target they apply to to make this possible.
To make the order matches with log_internal().
No functional change. Hopefully silence coverity issues like
CID#1534478, CID#1534479, CID#1534480, CID#1534482.
LOG_SET_PREFIX() sets a logging prefix for the current block. The
prefix is prepended to every logging message in the block, followed
by ": ". If a logging prefix is already configured, it is overridden
for the duration of the block, after which it is restored.
A use case for this macro is when we're operating on an image or
directory (using --root or --image). We can use LOG_SET_PREFIX() to
prefix all logging messages with the directory or image that we're
operating on.
Whenever we're going to close all file descriptors, we tend to close
the log and set it into open when needed mode. When this is done with
the logging target set to LOG_TARGET_AUTO, we run into issues because
for every logging call, we'll check if stderr is connected to the
journal to determine where to send the logging message. This check
obviously stops working when we close stderr, so we settle the log
target before we do that so that we keep using the same logging
target even after stderr is closed.
Now that we have reference counting, it's useful to be able to push
single key values onto the log context separately, so that we don't
have to allocate new storage to join the separate string together into
a single field which means we won't be able to reuse a context containing
the same field.
Let's try to optimize against pushing the same fields multiple times
onto the log context. To achieve this we make the log context reference
counted and return an existing context object if it's using the same
fields.
A consequence of this is that we have to make sure attaching/detaching
is coupled to the lifetime of the context object, so we make the attach
and detach functions private for now. If we need independent attach/detach
in the future, we can make that work with some extra complexity but since
we don't need it yet, let's not support it for now.
This commit adds support for attaching extra metadata to log
messages written to the journal via log.h. We keep track of a
thread local log context in log.c onto which we can push extra
metadata fields that should be logged. Once a field is no longer
relevant, it can be popped again from the log context.
On top of this, we then add macros to allow pushing extra fields
onto the log context.
LOG_CONTEXT_PUSH() will push the provided field onto the log context
and pop the last field from the log context when the current block
ends. LOG_CONTEXT_PUSH_STRV() will do the same but for all fields in
the given strv.
Using the macros is as simple as putting them anywhere inside a block
to add a field to all following log messages logged from inside that
block.
void myfunction(...) {
...
LOG_CONTEXT_PUSH("MYMETADATA=abc");
// Every journal message logged will now have the MYMETADATA=abc
// field included.
}
For convenience, there's also LOG_CONTEXT_PUSHF() to allow constructing
the field to be logged using printf() syntax.
log_context_new()/log_context_free() can be used to attach a log context
to an async operation by storing it in the associated userdata struct.
With an intentional mistake:
../src/login/logind-dbus.c: In function ‘bus_manager_log_shutdown’:
../src/login/logind-dbus.c:1542:39: error: format ‘%s’ expects a matching ‘char *’ argument [-Werror=format=]
1542 | LOG_MESSAGE("%s %s", message),
| ^~~~~~~
Also break some long lines for more uniform formatting. No functional change.
I went over all log_struct, log_struct_errno, log_unit_struct,
log_unit_struct_errno calls, and they seem fine.
log_parse_environment() uses should_parse_proc_cmdline() to determine whether
it should parse settings from the kernel command line. But the checks that
should_parse_proc_cmdline() apply to the whole process, and we could get a positive
answer also when log_parse_environment() was called from one of the nss modules.
In case of nss-modules, we don't want to look at the kernel command line.
log_parse_environment_variables() that only looks at the environment variables
is split out and used in the nss modules.
Fixes#22020.
When the support for "synthetic errno" was added, we started truncating
the errno value to just the least significant byte. This is generally OK,
because errno values are defined up to ~130.
The docs don't really say what the maximum value is. But at least in principle
higher values could be added in the future. So let's stop truncating
the values needlessly.
The kernel (or libbpf?) have an error where they return 524 as an errno
value (https://bugzilla.redhat.com/show_bug.cgi?id=2036145). We would
confusingly truncate this to 12 (ENOMEM). It seems much nicer to let
strerror() give us "Unknown error 524" rather than to print the bogus
message about ENOMEM.
Previously, we discarded any kmsg messages coming from journald
itself to avoid infinite loops where potentially the processing
of a kmsg message causes journald to log one or more messages to
kmsg which then get read again by the kmsg handler, ...
However, if we completely disable logging whenever we're processing
a kmsg message coming from journald itself, we also prevent any
infinite loops as we can be sure that journald won't accidentally
generate logging messages while processing a kmsg log message.
This change allows us to store all journald logs generated during
the processing of log messages from other services in the system
journal. Previously these could only be found in kmsg which has
low retention, can't be queried using journalctl and whose logs
don't survive reboots.
The commit introduces a callback invoked from log_syntax_internal.
Use it from systemd-analyze to gather a list of units that contain
syntax warnings. A new command line option is added to make use of this.
The new option --recursive-errors takes in three possible modes:
1. yes - which is the default. systemd-analyze exits with an error when syntax warnings arise during verification of the
specified units or any of their dependencies.
3. no - systemd-analyze exits with an error when syntax warnings arise during verification of only the selected unit.
Analyzing and loading any dependencies will be skipped.
4. one - systemd-analyze exits with an error when syntax warnings arise during verification
of only the selected units and their direct dependencies.
Below are two service unit files that I created for the purposes of testing:
1. First, we run the commands on a unit that does not have dependencies but has a non-existing key-value setting (i.e. foo = bar).
> cat <<EOF>testcase.service
[Unit]
foo = bar
[Service]
ExecStart = echo hello
EOF
OUTPUT:
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify testcase.service
/home/maanya-goenka/systemd/testcase.service:2: Unknown key name 'foo' in section 'Unit', ignoring.
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify --recursive-errors=yes testcase.service
/home/maanya-goenka/systemd/testcase.service:2: Unknown key name 'foo' in section 'Unit', ignoring.
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify --recursive-errors=no testcase.service
/home/maanya-goenka/systemd/testcase.service:2: Unknown key name 'foo' in section 'Unit', ignoring.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify --recursive-errors=one testcase.service
/home/maanya-goenka/systemd/testcase.service:2: Unknown key name 'foo' in section 'Unit', ignoring.
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
2. Next, we run the commands on a unit that is syntactically valid but has a non-existing dependency (i.e. foo2.service)
> cat <<EOF>foobar.service
[Unit]
Requires = foo2.service
[Service]
ExecStart = echo hello
EOF
OUTPUT:
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify foobar.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
foobar.service: Failed to create foobar.service/start: Unit foo2.service not found.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify --recursive-errors=yes foobar.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
foobar.service: Failed to create foobar.service/start: Unit foo2.service not found.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify --recursive-errors=no foobar.service
maanya-goenka@debian:~/systemd (log-error)$ echo $?
0
maanya-goenka@debian:~/systemd (log-error)$ sudo build/systemd-analyze verify --recursive-errors=one foobar.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
foobar.service: Failed to create foobar.service/start: Unit foo2.service not found.
maanya-goenka@debian:~/systemd (log-error)$ echo $?
1
In general we almost never hit those asserts in production code, so users see
them very rarely, if ever. But either way, we just need something that users
can pass to the developers.
We have quite a few of those asserts, and some have fairly nice messages, but
many are like "WTF?" or "???" or "unexpected something". The error that is
printed includes the file location, and function name. In almost all functions
there's at most one assert, so the function name alone is enough to identify
the failure for a developer. So we don't get much extra from the message, and
we might just as well drop them.
Dropping them makes our code a tiny bit smaller, and most importantly, improves
development experience by making it easy to insert such an assert in the code
without thinking how to phrase the argument.
These macros will log a message at the specified level only the first time
they are called. On all later calls, if the specified level is debug, the
logs will be suppressed; otherwise the message will be logged at debug.
As @yuwata correctly points out, this became broken when log_debug()
started returning -EIO. I wanted to preserve this pattern, but it turns
out it is not very widely used, and preserving it would make the whole
thing, already quite complicated, even more complex.
log_debug() is made like log_info() and friends, and returns void.
Let's assert if we ever happen to pass 0 to one of the log functions.
With the preceding commit to return -EIO from log_*(), passing 0 wouldn't
affect the return value any more, but it is still most likely an error.
The unit test code is an exception: we fairly often pass the return value
to print it, before checking what it is. So let's assert that we're not
passing 0 in non-test code. As with the previous check for %m, this is only
done in developer mode. We are depending on external code setting
errno correctly for us, which might not always be true, and which we can't
test, so we shouldn't assert, but just handle this gracefully.
I did a bunch of greps to try to figure out if there are any places where
we're passing 0 on purpose, and couldn't find any.
The one place that failed in tests is adjusted.
About "zerook" in the name: I wanted the suffix to be unambiguous. It's a
single "word" because each of the words in log_full_errno is also meaningful,
and having one term use two words would be confusing.
This is only done in developer mode. It is a pretty rare occurence that we
make this kind of mistake. And even if it happens, the result is just a misleading
error message. So let's only do the check in non-release builds.
This silences some warnigns where gcc thinks that some variables are
unitialized. One particular case:
../src/journal/journald-server.c: In function 'ache_space_refresh':
../src/journal/journald-server.c:136:28: error: 'vfs_avail' may be used uninitialized in this function [-Werror=maybe-uninitialized]
136 | uint64_t vfs_used, vfs_avail, avail;
| ^~~~~~~~~
../src/journal/journald-server.c:136:18: error: 'vfs_used' may be used uninitialized in this function [-Werror=maybe-uninitialized]
136 | uint64_t vfs_used, vfs_avail, avail;
| ^~~~~~~~
cc1: all warnings being treated as errors
which is caused by
d = opendir(path);
if (!d)
return log_full_errno(errno == ENOENT ? LOG_DEBUG : LOG_ERR,
errno, "Failed to open %s: %m", path);
if (fstatvfs(dirfd(d), &ss) < 0)
return log_error_errno(errno, "Failed to fstatvfs(%s): %m", path);
For some reason on aarch64 gcc thinks we might return non-negative here. In
principle errno must be set in both cases, but it's hard to say for certain.
So let's make sure that our code flow is correct, even if somebody forgot to
set the global variable somewhere.
As suggested in https://github.com/systemd/systemd/pull/11484#issuecomment-775288617.
This does not touch anything exposed in src/systemd. Changing the defines there
would be a compatibility break.
Note that tests are broken after this commit. They will be fixed in the next one.
This effectively reverts 41d1f469cf.
Before this, e.g., `networkctl reload` invoked by `systemctl reload systemd-networkd.service`
does not produce debugging logs even if systemd.log-level=debug is set. This fixes
the issue.