glibc (and also musl, though we do not officially support it yet)
silently ignores colon prefix in $TZ. Let's always not prefix the
timezone.
tzset(3) states:
> A nonempty value of TZ can be one of two formats, either of which can
> be preceded by a colon which is ignored.
Addresses https://github.com/systemd/systemd/pull/38876#discussion_r2384347594.
Previously, an input string ends with short timezone spec e.g. WET,
was parsed by setting $TZ environment variable to the timezone.
But the timezone might be different from the original local timezone,
thus the result might not follow the timezone change in the original
local timezone.
This makes the check of the short timezone spec with tzname[] earlier,
then it is not necessary to load another timezone file for e.g. WET,
and provides expected time.
This also make it use SAVE_TIMEZONE macro and drop use of forking
process. This makes greatly improve performance when parsing string
that contains timezone different from the current local timezone.
Unfortunately, there is still one corner case that our test fails.
When tzdata is built with rearguard enabled, then at least
Africa/Windhoek timezone does not provide correct time, but time shifted
1 hour from the original.
Let's do a "soft" reset of the TTY when a ptyfwd session ends. This is a
good idea, in order to reset changes to the scrolling window that code
inside the session might have made. A "soft" reset will undo this.
While we are at it, make sure to output the ansi sequences for this
*after* terminating any half-written line, as that is still somewhat
contents of the session, even if it's augmented.
So far, when outputing information about copy progress we'd suppress the
digit after the dot if it is zero. That makes the progress bar a bit
"jumpy", because sometimes there are two more character cells used than
other times. Let's just always output one digit after the dot here
hence, to avoid this.
This partially reverts d6267b9b18
So, arch-chroot currently uses a rather cursed setup:
it sets up a PID namespace, but mounts /proc/ from the outside
into the chroot tree, and then call chroot(2), essentially
making it somewhere between chroot(8) and a full-blown
container. Hence, the PID dirs in /proc/ reveal the outer world.
The offending commit switched chroot detection to compare
/proc/1/root and /proc/OUR_PID/root, exhibiting the faulty behavior
where the mentioned environment now gets deemed to be non-chroot.
Now, this is very much an issue in arch-chroot. However,
if /proc/ is to be properly associated with the pidns,
then we'd treat it as a container and no longer a chroot.
Also, the previous logic feels more readable and more
honestly reported errors in proc_mounted(). Hence I opted
for reverting the change here. Still note that the culprit
(once again :/) lies in the arch-chroot's pidns impl, not
systemd.
Fixes https://gitlab.archlinux.org/archlinux/packaging/packages/systemd/-/issues/54
This removes iptables/libiptc backend support in firewall-util, as
already announced by 5c68c51045.
Then, this drops meaningless `FirewallContext` wrapper.
Depending on the packaging of tzdata, /usr/share/zoneinfo/tzdata.zi may
reference zones or links that are not actually present on the system.
E.g. on Debian and Ubuntu, there is a tzdata-legacy package that
contains "legacy" zones and links, but they are still referenced in
/usr/share/zoneinfo/tzdata.zi shipped by the main tzdata package.
Right now, get_timezoes() does not validate timezones when building the
list, which makes the following possible:
$ timedatectl list-timezones | grep "US/Alaska"
US/Alaska
$ timedatectl set-timezone US/Alaska
Failed to set time zone: Invalid or not installed time zone 'US/Alaska'
which feels buggy. Hence, simply validate timezones in get_timezones()
to avoid listing timezones that are not installed.
This is what the symlinkat.2 man page uses.
The old naming with 'to' and 'from', where 'to' is the symlink name
and 'from' is the symlink target is very confusing.
Follow-up for 892838911b.
We pin the parent directory of the specified directory via CHASE_PARENT,
but if we do that we really should mask off CHASE_MUST_BE_REGULAR,
because a parent dir of course is a dir, nothing else. The
CHASE_MUST_BE_REGULAR after all should apply to the file created in that
dir, not to the parent.
In https://github.com/systemd/systemd/issues/38842 it is reported that
we're again having trouble accessing EFI variables:
```
[ 292.212415] H (udev-worker)[253]: Reading EFI variable /sys/firmware/efi/efivars/LoaderDevicePartUUID-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f.
...
[ 344.397961] H (udev-worker)[253]: Detected slow EFI variable read access on LoaderDevicePartUUID-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f: 52.185510s
```
We don't know what causes the slowdown, but it seems reasonable to avoid
unnecessary read() calls. We would read the 4-byte attr first, and then
the actual value later. But our code always reads the value (and
discards the attr in all cases except one, when _writing_ the variable),
so let's optimize for the case where we read the value and read the
whole contents in one read().
This should allow us to get rid of a bunch of "fail:" labels, because we
can clean up tmpfiles relative to some atfd this way.
This only ports over a small number of potential users, but there's more
work to be done.
path_is_root_at() is supposed to detect if the inode referenced by the
specified fd is the "root inode". For that it checks if the inode and
its parent are the same inode and the same mount. Traditionally this
check was correct. But these days we actually have detached mounts (i.e.
those returned by fsmount() and related calls), whose root inode also
behaves like that.
Our uses for path_is_root_at() use the function to detect if an absolute
path would be identical to a relative path based on the specified fd
(sepifically: chaseat()), which goes really wrong if used on a detached
mount.
hence, let's adjust the function a bit, and let's go by path to "/" to
check if the referenced inode is the actual root inode in our chroot.
This was added in c242a08279, and AFAICT, the
code was never exercised, not even in the tests. With this chunk gone, if
anyone ever calls the function without any output params, we'll do open + fstat
instead of access, which will work just fine too.
I also thought about converting efi_set_variable() to use writev, but we don't
have loop_writev. I'm not sure if the loop around write here is important.
Coinceivably, it could make a difference it we were writing a long value.
The loop was introduced in b7749eb517, without
much comment unfortunately. So it doesn't seem worth the risk of changing this
to not use a loop, and writing loop_writev just for this also seems overkill.
In https://github.com/systemd/systemd/issues/38842 it is reported that we're again
having trouble accessing EFI variables:
[ 292.212415] H (udev-worker)[253]: Reading EFI variable /sys/firmware/efi/efivars/LoaderDevicePartUUID-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f.
...
[ 344.397961] H (udev-worker)[253]: Detected slow EFI variable read access on LoaderDevicePartUUID-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f: 52.185510s
We don't know what causes the slowdown, but it seems reasonable to avoid
unnecessary read() calls. We would read the 4-byte attr first, and then the
actual value later. But our code always reads the value (and discards the attr
in all cases except one, when _writing_ the variable), so let's optimize for
the case where we read the value and read the whole contents in one readv().
When SYSTEMD_COLORS is invalid, parse_systemd_colors() logs about it.
Logging helpers then call into parse_systemd_colors() to pretty-print
the log message, which then fails, so it logs about the failure,
rinse and repeat until segfault.
Follow-up for c8210d98a4
This reverts commit b177095bfa.
The original issue (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=375275,
https://github.com/systemd/systemd/issues/22168) was about having a block
cursor instead of a box cursor after VM reset, which doesn't seem particularly
urgent. OTOH, the patch causes a minor regression, where the splash screen is
cleared immediately and replaced by a blinking cursor. With the patch, we are
trading one visual issue for another visual issue. The second is probably more
noticeable, since some poeple put in quite a lot of work to have pretty boots
where the firmware splash screen is displayed until the login prompt pops up.
Avoiding a regression is more important than fixing a minor long-standing
issue, so let's revert this.
Fixes https://github.com/systemd/systemd/issues/38752.
Since c5de7b14ae
file searching implies a new mount api syscall by default,
to trigger automounts.
But, this is not necessary in most cases, e.g. when chasing
syspath in sd-device (actually this causes regression in umockdev,
see https://github.com/martinpitt/umockdev/issues/271).
Another example is reading unit files, especially .network files,
as automount may trigger mounting network filesystems...
Also, when this is used in NSS plugins, programs that load the
plugins may fail because of spuriously configured seccomp. See #38565.
Let's not trigger automount by default, and do only when explicitly
requested.
This introduces CHASE_TRIGGER_AUTOFS, and use it in
- service manager,
- bootctl and finding ESP/xbootldr,
- sysupdate,
- mountfsd,
- systemd-mount.
There may be several more places we should trigger automount, but let's
do that later.
Follow-up for c5de7b14ae.
Fixes#38565.
Replaces #38569.
Co-authored-by: Luca Boccassi <luca.boccassi@gmail.com>
This reverts commit 490aa05ca1.
As commented https://github.com/systemd/systemd/pull/38569#discussion_r2284978273,
the commit makes autofs check bypassed. Before the commit, when
CHASE_NO_AUTOFS is set, we did not shortcut chasing paths, and refused
any autofs mount points in the path. However, with the commit, the flag
was swapped but even when CHASE_AUTOFS is unset, the autofs check may be
skipped.
To fix the issue, rather than swapping the flag, we should introduce
another flag, say CHASE_TRIGGER_AUTOFS. This revert the commit, and in a
later commit, the new flag will be introduced.
Since c5de7b14ae
file searching implies a new mount api syscall by default,
to trigger automounts.
This is problematic in NSS plugins, as they are dlopen'ed inside
processes by glibc, for two reasons.
First of all, potentially searching on a networked filesystem
automount could lead to nasty surprises, such as the process
responsible for setting up the network filesystem trying to
search on that same filesystem.
More importantly, the new mount api syscall was never part of
the filesystem seccomp filter that we provide by default, and
given mounting/remounting/bind mounting is one of the possible
ways to bypass sandboxing it is very likely not allowed when
custom filters are used in sandboxed processes, if they don't
need to do these operations otherwise.
The filesystem seccomp mask we provide has been updated, however
this only takes effect on the next restart of a service. When
systemd is upgraded via a package upgrade, the new nss plugin is
installed and will be immediately dlopen'ed by glibc when needed,
without waiting for the process to restart, which means the existing
seccomp filter applies, causing the filter to trigger.
Given it's not really possible for any arbitrary program to
predict which NSS modules glibc will load, given programs do not
configure that and instead nsswitch is set up by the sysadmin,
it's impossible to handle at each process level. It's also not
possible to know when it will be triggered, given the plugin
is not linked in each binary tools like need-restart cannot
even pre-emptively restart services that may be affected.
This means in practice, upgrading from systemd << v258 to >= v258
requires a reboot to avoid either subtle or catastrophic system
failures.
By avoiding to trigger automounts in nss-systemd we can avoid
both issues.
userdb drop-ins are searched for in:
/etc/userdb/
/run/userdb/
/run/host/userdb/
/usr/local/lib/userdb/
/usr/lib/userdb/
none of which are supported as automounts anyway.
Note that this happens only when the userdbd service is not running,
as otherwise nss-systemd will go through the varlink IPC, rather than
doing the searches in-process.
So invert CHASE_NO_AUTOFS to CHASE_AUTOFS and set it in the places where
we do want to trigger automounts, like looking for the ESP.
Follow-up for c5de7b14ae
Fixes https://github.com/systemd/systemd/issues/38565
path_is_root_at() is supposed to detect if the inode referenced by the
specified fd is the "root inode". For that it checks if the inode and
its parent are the same inode and the same mount. Traditionally this
check was correct. But these days we actually have detached mounts (i.e.
those returned by fsmount() and related calls), whose root inode also
behaves like that.
Our uses for path_is_root_at() use the function to detect if an absolute
path would be identical to a relative path based on the specified fd
(specifically: chaseat()), which goes really wrong if used on a detached
mount.
hence, let's adjust the function a bit, and let's go by path to "/" to
check if the referenced inode is the actual root inode in our chroot.
The payload of a file_handle structure is not 64bit aligned. So far used
_alignas_() to align it to 64bit as a whole, which by accident has the
side-effect that the payload ends up being aligned to 64bit too, but
this is ugly, because it's really just an accident...
Let's do this properly, and just use proper unaligned 64bit reads to
access the field, and do not assume aligning the structure as a whole
also aligns the payload part of it.
Follow-up for: fd51a7d8b5
BLOCK_SIGNALS() is also used in nss modules. If an application is
running with a too strict seccomp loads our nss modules, then the
assertion may be triggered.
Fixes#38582.
Then, also make nss modules parse $SYSTEMD_ASSERT_RETURN_IS_CRITICAL
environment variable.
This also moves nss-util.c and nss-util.h from src/basic/ to src/shared/,
as they are not used by libsystemd.