When Type=notify-reload got introduced, it wasn't intended to be
mutually exclusive with ExecReload=. However, currently ExecReload=
is immediately forked off after the service main process is signaled,
leaving states in between essentially undefined. Given so broken
it is I doubt any sane user is using this setup, hence I took a stab
to rework everything:
1. Extensions are refreshed (unchanged)
2. ExecReload= is forked off without signaling the process
3a. If RELOADING=1 is sent during the ExecReload= invocation,
we'd refrain from signaling the process again, instead
just transition to SERVICE_RELOAD_NOTIFY directly and
wait for READY=1
3b. If not, signal the process after ExecReload= finishes
(from now on the same as Type=notify-reload w/o ExecReload=)
4. To accomodate the use case of performing post-reload tasks,
ExecReloadPost= is introduced which executes after READY=1
The new model greatly simplifies things, as no control processes
will be around in SERVICE_RELOAD_SIGNAL and SERVICE_RELOAD_NOTIFY
states.
See also: https://github.com/systemd/systemd/issues/37515#issuecomment-2891229652
Type `simple` explicitly mentions that invocation failures like a missing binary
or `User=` name won’t get detected – whereas type `exec` mentions that it does.
Type `oneshot` refers to being similar to `simple`, which could lead one to
assume it doesn’t detect such invocation failures either – it seems however it
does.
Indicate this my changing its wording to be similar to `exec`.
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
When switching to another user it's oftentimes desirable to also spawn
the target user's shell. sudo supports this via -i flag, run0 currently
doesn't. We don't want to proactively query NSS ourselves, since
that would fall short when operating remotely. Let's instead teach
the service manager to spawn the command using the user's default shell.
I opted for "|" instead of "." in the end because the latter seems
a bit obscure. But happy to change it to something else if a better option
comes up.
Let's bump the kernel baseline a bit to 4.3 and thus require ambient
caps.
This allows us to remove support for a variety of special casing, most
importantly the ExecStart=!! hack.
Processes can easily survive the first kill operation we execute, hence
we shouldn't make strong claims about them having exited already. Let's
just say "likely" hence.
Fixes: #15032
The goal of RestartMode=direct is to make restarts invisible
to dependents, so auto restart jobs shouldn't bring them down
at all. So far we only skipped going through failed/dead states
in service_enter_dead(), i.e. the unit would never be considered
dead. But when constructing restart transaction, the stop job
would be propagated to dependents. Consider the following 2 units:
dependent.target:
[Unit]
BindsTo=a.service
After=a.service
a.service:
[Service]
ExecStart=bash -c 'sleep 100 && exit 1'
Restart=on-failure
RestartMode=direct
Before this commit, even though BindsTo= isn't triggered since
a.service never failed, when a.service auto-restarts, dependent.target
is also restarted. Let's suppress it by using JOB_REPLACE instead of
JOB_RESTART_DEPENDENCIES in service_enter_restart().
Fixes#34758
The example above is subtly different from the original report,
to illustrate that the new behavior makes sense for less exotic
use cases too.
The documentation claimed that ExecStartPre=/ExecStartPost= accepts
multiple command lines, in contrast to ExecStart=. This is half an
untruth, because ExecStart= allows that too – as long as Type=oneshot is
set.
Hence, reword this a bit, and do not emphasize the contrast.
Prompted by: #34570
So far we supported this syntax:
ExecStart=foo ; bar
as equivalent to:
ExecStart=foo
ExecStart=bar
With this change we'll "soft" deprecate the first syntax. i.e. it's
still supported in code, but not documented anymore.
The concept was originally added to make things easier for 3rd party
.ini readers, as it allowed writing unit files with a .ini framework
that doesn't allow multiple assignments for the same key. But frankly,
this is kinda pointless, as so many other of our knobs require the
double assignment.
Hence, let's just stop advertising the concept, let's simplify the docs,
by removing one entirely redundant feature from it.
Replaces: #34570
One of the major pait points of managing fleets of headless nodes is
that when something fails at startup, unless debug level was already
enabled (which usually isn't, as it's a firehose), one needs to manually
enable it and pray the issue can be reproduced, which often is really
hard and time consuming, just to get extra info. Usually the extra log
messages are enough to triage an issue.
This new option makes it so that when a service fails and is restarted
due to Restart=, log level for that unit is set to debug, so that all
setup code in pid1 and sd-executor logs at debug level, and also a new
DEBUG_INVOCATION=1 env var is passed to the service itself, so that it
knows it should start with a higher log level. Once the unit succeeds
or reaches the rate limit the original level is restored.
When credentials are used with Type=simple + ExecStartPost=,
i.e. when multiple sd-executor instances are running in parallel
for a single service, the state of final credential dir
might be unexpected wrt path_is_mount_point() and other
steps. So, let's imply Type=exec if not explicitly specified,
and emit a warning otherwise.
Make the warning for oneshot services (where RuntimeMaxSec= has no
effect) more actionable by pointing to the directive people can use
instead to effectively limit their runtime.
We are saying in public that the protocl is stable and can be easily
reimplemented, so provide an example doing so in the documentation,
license as MIT-0 so that it can be copied and pasted at will.
I think this was just overlooked in #13754, which removed
the restriction of Restart= on Type=oneshot services.
There's no reason to prevent RestartForceExitStatus=
now that Restart= has been allowed.
Closes#31148
The documentation for `RestartPreventExitStatus=` differs from that for `SuccessExitStatus=` in ways that are sometimes confusing (e.g. using `numeric exit codes` instead of `numeric termination statuses`), and other times plain incorrect (e.g. not mentioning `termination status names`, which I've just confirmed to work in systemd 255).
This patch modifies the documentation to be as similar as possible, so as to reduce the reader's cognitive load.
The only reason to recommend this would be if people had multiple commands
with the same name in the search path. This probably was never the best idea,
and it happens rarely anyway. Since the patch that dropped requirement for full
paths was introduced, we have dropped support for unmerged-usr and we're planning
to drop support for split-bin at some point too. Many people effectively have just
one directory in the search path, so there is even less reason to use an absolute
path. So let's recommend just using the command name, which makes the unit file
shorter and nicer to read.
This tries to add information about when each option was added. It goes
back to version 183.
The version info is included from a separate file to allow generating it,
which would allow more control on the formatting of the final output.
This is a follow-up for #28596.
I think the suggestion to use Type=exec uses too strong wording:
Type=exec has non-trivial drawbacks over Type=simple, and they deserve
to be mentioned.
Hence drop the <emphasis> and turn this around so that Type=exec is
*recommended*, but Type=simple is not expressly discouraged, because
there are plenty reasons to use it.
Add a brief discussion where Type=simple might be preferable.
Also, fix the outright unruth that Type=exec was the "simplest and
fastest", because it certainly is a lot, but not that.
The descriptions of various options are reworked: first say what protocol
actually is, i.e. describe what type of notification the manager waits
for. Only after that describe various steps and things the service should
do. Also, apply some paragraph breaks.
Instead of recommending Type=simple, recommend Type=exec. Say explicitly that
Type=simple, Type=forking are not recommended. Type=simple ignores failure in a
way that doesn't make any sense except as a historical accident. We introduced
'exec' instead of changing 'simple' to keep backwards-compatiblity, but
'simple' is not very useful. 'forking' works, but is inefficient: correctly
programming the interface requires a lot of work, and at runtime, the
additional one or two forks are just a waste of CPU resources. Furthermore, we
now understand that because of COW traps, they may also increase memory
requirements. There is really no reason to use 'forking', except if it's
already implemented and the code cannot be changed to use 'notify'.
Also, remove the recommendations to use Type=simple to avoid delaying boot. In
most cases, if the service can support notifications about startup, those
should be done.
Overall, for new services, "notify", "notify-reload", and "dbus" are the
types that make sense.
When this option is set to direct, the service restarts without entering a failed
state. Dependent units are not notified of transitory failure.
This is useful for the following use case:
We have a target with Requires=my-service, After=my-service.
my-service.service is a oneshot service and has Restart=on-failure in
its definition.
my-service.service can get stuck for various reasons and time out, in
which case it is restarted. Currently, when it fails the first time, the
target fails, even though my-service is restarted.
The behavior we're looking for is that until my-service is not restarted
anymore, the target stays pending waiting for my-service.service to
start successfully or fail without being restarted anymore.
Doesn't really matter since the two unicode symbols are supposedly
equivalent, but let's better follow the unicode recommendations to
prefer greek small letter mu, as per:
https://www.unicode.org/reports/tr25
Oftentimes it is useful to allow the per-service fd store to survive
longer than for a restart. This is useful in various scenarios:
1. An fd to some security relevant object needs to be stashed somewhere,
that should not be cleaned automatically, because the security
enforcement would be dropped then.
2. A user namespace fd should be allocated on first invocation and be
kept around until the user logs out (i.e. systemd --user ends), á la
#16328 (This does not implement what #16318 asks for, but should
solve the use-case discussed there.)
3. There's interest in allow a concept of "userspace reboots" where the
kernel stays running, and userspace is swapped out (i.e. all services
exit, and the rootfs transitioned into a new version of it) while
keeping some select resources pinned, very similar to how we
implement a switch root. Thus it is useful to allow services to exit,
while leaving their fds around till the very end.
This is exposed through a new FileDescriptorStorePreserve= setting that
is closely modelled after RuntimeDirectoryPreserve= (in fact it reused
the same internal type), since we want similar behaviour in the end, and
quite often they probably want to be used together.
The description was split — part was under ExecStart= and part in "Command lines".
Now the whole generic part is moved to the separate section, and under ExecStart=
only the stuff that is specific to that option is described.
This just moves the text and removes some repetitions.