Since v256 we completely fail to boot if v1 is configured. Fedora 41 was just
released with v256.7 and this is probably the first major exposure of users to
this code. It turns out not work very well. Fedora switched to v2 as default in
F31 (2019) and at that time some people added configuration to use v1 either
because of Docker or for other reasons. But it's been long enough ago that
people don't remember this and are now very unhappy when the system refuses to
boot after an upgrade.
Refusing to boot is also unnecessarilly punishing to users. For machines that
are used remotely, this could mean somebody needs to physically access the
machine. For other users, the machine might be the only way to access the net
and help, and people might not know how to set kernel parameters without some
docs. And because this is in systemd, after an upgrade all boot choices are
affected, and it's not possible to e.g. select an older kernel for boot. And
crashing the machine doesn't really serve our goal either: we were giving a
hint how to continue using v1 and nothing else.
If the new override is configured, warn and immediately boot to v1.
If v1 is configured w/o the override, warn and wait 30 s and boot to v2.
Also give a hint how to switch to v2.
https://bugzilla.redhat.com/show_bug.cgi?id=2323323https://bugzilla.redhat.com/show_bug.cgi?id=2323345https://bugzilla.redhat.com/show_bug.cgi?id=2322467https://www.reddit.com/r/Fedora/comments/1gfcyw9/refusing_to_run_under_cgroup_01_sy_specified_on/
The advice is to set systemd.unified_cgroup_hierarchy=1 (instead of removing
systemd.unified_cgroup_hierarchy=0). I think this is easier to convey. Users
who are understand what is going on can just remove the option instead.
The caching is dropped in cg_is_legacy_wanted(). It turns out that the
order in which those functions are called during early setup is very fragile.
If cg_is_legacy_wanted() is called before we have set up the v2 hierarchy,
we incorrectly cache a true answer. The function is called just a handful
of times at most, so we don't really need to cache the response.
Use FOREACH_ELEMENT where possible. Generated with this command,
and checked manually:
git grep -l 'FOREACH_ARRAY.*ELEMENTSOF' | \
xargs sed -ri 's/FOREACH_ARRAY\((.*), (.*), (ELEMENTSOF.*)\)/FOREACH_ELEMENT(\1, \2)/'
This creates a new dir /run/systemd/mount-rootfs/ early in PID 1 that
thus always exists. It's supposed to be used by any code that creates
its own mount namespace and then sets up a new root dir to switch into.
So far in many cases we used a temporary dir (which needed explicit
clean-up) or a purpose-specific fixed dir.
Let's create a common dir instead, that always exists (as it is created
in PID 1 early on, always).
Besides making things more robust, as manual clean-up of the inode is
not necessary anymore this also opens the door for unprivileged programs
to use the same dir, since it now always exists.
Set the access mode to 555 (instead of the otherwise previously used
0755, 0700 or similar), so that unprivileged programs can access it, but
we make clear it's not supposed to be written directly to, by anyone,
not even root.
This is an octal number. We used the 0 prefix in some places inconsistently.
The kernel always interprets in base-8, so this has no effect, but I think
it's nicer to use the 0 to remind the reader that this is not a decimal number.
Same idea as 03677889f0.
No functional change intended. The type of the iterator is generally changed to
be 'const char*' instead of 'char*'. Despite the type commonly used, modifying
the string was not allowed.
I adjusted the naming of some short variables for clarity and reduced the scope
of some variable declarations in code that was being touched anyway.
Previously the mkdir_label() family of calls was implemented in
src/shared/mkdir-label.c but its functions partly declared ins
src/shared/label.h and partly in src/basic/mkdir.h (!!). That's weird
(and wrong).
Let's clean this up, and add a proper mkdir-label.h matching the .c
file.
Compilation would fail because we could have HAVE_SMACK_RUN_LABEL without
HAVE_SMACK. This doesn't make much sense, so let's just make -Dsmack=false
completely disable smack.
Also, the logic in smack-setup.c seems dubious: '#ifdef SMACK_RUN_LABEL'
would evaluate to true even if -Dsmack-run-label='' is used. I think
this was introduced in the conversion to meson:
8b197c3a8a added
AC_ARG_WITH(smack-run-label,
AS_HELP_STRING([--with-smack-run-label=STRING],
[run systemd --system with a specific SMACK label]),
[AC_DEFINE_UNQUOTED(SMACK_RUN_LABEL, ["$withval"], [Run with a smack label])],
[])
i.e. it really was undefined if not specified. And it was same
still in 72cdb3e783 when configure.ac
was dropped.
So let's use the single conditional HAVE_SMACK_RUN_LABEL everywhere.