This permits configuration of additional "delegates" which ensure that
lookups for certain DNS zones are routed to specific sets of DNS
servers, in addition to the routes we create for each network interface.
For now, this allows only static configuration, but eventually we should
open this up to IPC.
Fixes: #5573#14159#20485#21260#24532#32022
(Fixes#32022, because now redundant)
If --generate-fstab=PATH is used, there is the possibility that the
fstab file already exists, making systemd-repart fail.
This commit will add a new --append-fstab= parameter, that will read
the file and merge it with the new generated content. Using the
comments, the command can separate the automatic-generated section from
the user-provided section, allowing for the next append the replacement
only of the automatic-generated section, keeping the user one.
`ExtensionImages=` and `ExtensionDirectories=` now let you specify
vpick-named extensions; however, since they just get set up once when
the service is started, you can't see newer versions without restarting
the service entirely. Here, also reload confext extensions when you
reload a service. This allows you to deploy a new version of some
configuration and have it picked up at reload time without interruption
to your workload.
Right now, we would only reload confext extensions and leave the sysext
ones behind, since it didn't seem prudent to swap out what is likely
program code at reload. This is made possible by only going for the
`SYSTEMD_CONFEXT_HIERARCHIES` overlays (which only contains `/etc`).
This PR:
- Adjusts `service.c` to also refresh extensions when needed.
- Adds integration tests to check that a confext reload actually
occurred.
- Adds to the `systemd.exec` man pages to document this behavior.
This is a follow up to #24864 and #31364. Thank you to @bluca and
@goenkam for help in getting this up.
If --generate-fstab=PATH is used, there is the possibility that the
fstab file already exists, making systemd-repart fail.
This commit will add a new --append-fstab= parameter, that will read
the file and merge it with the new generated content. Using the
comments, the command can separate the automatic-generated section from
the user-provided section, allowing for the next append the replacement
only of the automatic-generated section, keeping the user one.
Signed-off-by: Alberto Planas <aplanas@suse.com>
Accept=yes has very valid usecases (i.e. for sporadically invoked
services) and strong benefits (i.e. better security because connections
can be sandboxed nicely, isolating them). Let's hence reword things and
stop claiming that Accept=yes was a legacy thing, because it really
isn't.
Some other man fixes, too
Our baseline is v5.4 and cgroup v2 is enforced now,
which means CPU accounting is cheap everywhere without
requiring any controller, hence just remove the directive.
This returns to the original approach proposed in
https://github.com/systemd/systemd/pull/17270. After review, the approach was
changed to use sd_pid_get_owner_uid() instead. Back then, when running in a
typical graphical session, sd_pid_get_owner_uid() would usually return the user
UID, and when running under sudo, geteuid() would return 0, so we'd trigger the
secure path.
sudo may allocate a new session if is invoked outside of a session (depending
on the PAM config). Since nowadays desktop environments usually start the user
shell through user units, the typical shell in a terminal emulator is not part
of a session, and when sudo is invoked, a new session is allocated, and
sd_pid_get_owner_uid() returns 0 too. Technically, the code still works as
documented in the man page, but in the common case, it doesn't do the expected
thing.
$ build/test-sd-login |& rg 'get_(owner_uid|cgroup|session)'
sd_pid_get_session(0) → No data available
sd_pid_get_owner_uid(0) → 1000
sd_pid_get_cgroup(0) → /user.slice/user-1000.slice/user@1000.service/app.slice/app-ghostty-transient-5088.scope/surfaces/556FAF50BA40.scope
$ sudo build/test-sd-login |& rg 'get_(owner_uid|cgroup|session)'
sd_pid_get_session(0) → c289
sd_pid_get_owner_uid(0) → 0
sd_pid_get_cgroup(0) → /user.slice/user-0.slice/session-c289.scope
I think it's worth checking for sudo because it is a common case used by users.
There obviously are other mechanims, so the man page is extended to say that
only some common mechanisms are supported, and to (again) recommend setting
SYSTEMD_LESSSECURE explicitly. The other option would be to set "secure mode"
by default. But this would create an inconvenience for users doing the right
thing, running systemctl and other tools directly, because then they can't run
privileged commands from the pager, e.g. to save the output to a file. (Or the
user would need to explicitly set SYSTEMD_LESSSECURE. One option would be to
set it always in the environment and to rely on sudo and other tools stripping
it from the environment before running privileged code. But that is also fairly
fragile and it obviously relies on the user doing a complicated setup to
support a fairly common use case. I think this decreases usability of the
system quite a bit. I don't think we should build solutions that work in
priniciple, but are painfully inconvenient in common cases.)
Fixes https://yeswehack.com/vulnerability-center/reports/346802.
Also see https://github.com/polkit-org/polkit/pull/562, which adds support for
$SUDO_UID/$SUDO_GID to pkexec.
Let's optionally issue a Varlink Synchronize() call in --follow mode
when asked to terminate. This is useful so that the tool can be called
and it is guaranteed it processed all messages generated before the
request to exit before it exits.
We want this in "systemd-run -v" in particular, so that we can be sure
we are not missing any log output from the invoked service before it
exits
Allow callers to synchronize on the point in time where the journal file
watches are fully established, in --follow mode.
Tools can invoke journalctl using this, knowing that any log message
happening after the READY=1 is definitely going to be processed by the
journalctl invocation.
Add support for a sysfail boot entry. Sysfail boot entries can be used
for optional tweaking the automatic selection order in case a failure
state of the system in some form is detected (boot firmware failure
etc).
The EFI variable `LoaderEntrySysFail` contains the sysfail boot loader
entry to use. It can be set using bootctl:
```
$ bootctl set-sysfail sysfail.conf
```
The `LoaderEntrySysFail` EFI variable would be unset automatically
during next boot by `systemd-boot-clear-sysfail.service` if no system
failure occured, otherwise it would be kept as it is and a system
failure reason will be saved to `LoaderSysFailReason` EFI variable.
`sysfail_check()` expected to be extented to support possibleconditions
when we should boot sysfail("recovery") boot entry.
Also add support for using a sysfail boot entry in case of UEFI firmware
capsule update failure [1]. The status of a firmware update is obtained
from the EFI System Resource Table (ESRT), which provides an optional
mechanism for identifying device and system firmware resources for the
purposes of targeting firmware updates to those resources.
Current implementation uses the value of LastAttemptStatus field from
ESRT, which describes the result of the last firmware update attempt for
the firmware resource entry. The field is updated each time an
`UpdateCapsule()` is attempted for an ESRT entry and is preserved across
reboots (non-volatile).
This can be be used in setups with support for A/B OTA updates, where
the boot firmware and Linux/RootFS might be updated synchronously.
The check is activated by adding "sysfail-firmware-upd" to loader.conf
[1]
https://uefi.org/specs/UEFI/2.10/23_Firmware_Update_and_Reporting.html
Follow-up for f44e7a8c11
This breaks the rule stated at the beginning of help_sudo_mode():
> NB: Let's not go overboard with short options: we try to keep a modicum of compatibility with
> sudo's short switches, hence please do not introduce new short switches unless they have a roughly
> equivalent purpose on sudo. Use long options for everything private to run0.
I've always been reluctant to invoke the current user's shell in another
user's context, hence was fully grounded in `sudo -i`. With this bit in
place `run0` will finally be feature-complete on my side ;-)
You can configure the sysfail boot entry using the bootctl command:
$ bootctl set-sysfail sysfail.conf
The value will be stored in the `LoaderEntrySysFail` EFI variable.
The `LoaderEntrySysFail` EFI variable would be unset automatically
during next boot by `systemd-boot-clear-sysfail.service` if no
system failure occured, otherwise it would be kept as it is and a system
failure reason will be saved to `LoaderSysFailReason` EFI variable.
Signed-off-by: Igor Opaniuk <igor.opaniuk@foundries.io>
Add support for a sysfail boot entry. Sysfail boot entries can be
used for optional tweaking the automatic selection order in case a
failure state of the system in some form is detected (boot firmware
failure etc).
The EFI variable `LoaderEntrySysFail` holds the boot loader entry to
be used in the event of a system failure. If a failure occurs, the reason
will be stored in the `LoaderSysFailReason` EFI variable.
sysfail_check() expected to be extented to support possible
conditions when we should boot sysfail("recovery") boot entry.
Signed-off-by: Igor Opaniuk <igor.opaniuk@foundries.io>
The bless-boot logic currently assumes that if the name of the boot
entry reported via the EFI var matches the name on disk that the state
is "indeterminate", as we haven't counted down the counter (to mark it
bad) or drop the counter (to mark it good) yet. But there's one corner
case we so far didn't care about: what if the entry already reached 0
left tries in a previous boot, i.e. if the user invoked an entry already
known to be completely bad. In that case we'd still return
"indeterminate", but that's kinda misleading, because we *know* the
currently booted entry is bad, however we inherited that fact from a
previous boot, we didn't determine it on the current.
hence, let's introduce a new status we report in this case, that is both
distinct from "bad" (which indicates whether the *current* boot is bad)
and "indirect" (which indicates the current boot has not been decided on
yet): "dirty".
Why "dirty"? To mirror "clean" which we already have, which indicates a
boot already marked good in a previous boot, which is a relatively
symmetric state.
This is a really weak api break of sorts, because it introduces a new
state we never reported before, but I think it's fine, because the old
reporting was just wrong, and in a way this is bugfix, that we now
report correctly something where previously returned kind of rubbish
(though systematic rubbish).
Replaces: #37350
$PAGER wasn't documented, but actually we treat it same as $SYSTEMD_PAGER,
except for lower priority. And the two variables can be used to disable the
pager, even if $SYSTEMD_PAGERSECURE is not set.
Behaviour is (obviously) not changed by this patch, it intentionally just
updates the docs to match the code.
The existing description was not *wrong*, but it was a bit muddled. Let's
reorder the text to give a short intro and then describe what the options
actually do and the clear "true" and "false" cases first, and then describe
autodetection.
Related to https://yeswehack.com/vulnerability-center/reports/346802.
This PR provides a new option for systemd-boot
`secure-boot-enroll-action` which allows to configure the behavior after
SecureBoot keys are enrolled.
Provides the option to either reboot or power off.
The current behavior is not changed, it will by default reboot as it did
before.
It also provides a small message about the action its going to take with
a small delay so the user can read it.
When switching to another user it's oftentimes desirable to also spawn
the target user's shell. sudo supports this via -i flag, run0 currently
doesn't. We don't want to proactively query NSS ourselves, since
that would fall short when operating remotely. Let's instead teach
the service manager to spawn the command using the user's default shell.
I opted for "|" instead of "." in the end because the latter seems
a bit obscure. But happy to change it to something else if a better option
comes up.
The existing text grew organically as features were added and was
not very organized. Reorder it and break into paragraphs grouped
by topic. The description of the :errno syntax is replaced by a short
reference to the SystemCallErrorNumber= setting. This makes the
text shorter and makes it easier to explain how the two settings combine.