This has been tripping up container manager people. Let's document this
explicitly.
(Note that the container interface could really use some updates, i.e.
it was written before cgroup namespacing was a thing. But I
am too lazy to fix that now, so let's just add this one facet.)
Type `simple` explicitly mentions that invocation failures like a missing binary
or `User=` name won’t get detected – whereas type `exec` mentions that it does.
Type `oneshot` refers to being similar to `simple`, which could lead one to
assume it doesn’t detect such invocation failures either – it seems however it
does.
Indicate this by changing its wording to be similar to `exec`.
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Virtual block devices are a bit weird: they have no parent device, and
thus cannot be related to the subsystem they belong to, except by
pattern matching their name. This is OK to do if one knows what to look
for. However, for tools that do not want to carry a list of known
subsystems with their appropriate matching patterns this sucks. Let's
introduce a new ID_BLOCK_SUBSYSTEM property we can set on block devices
that carries an explicit string for this. Do so for a small number of
key subsystems: DM, loopback and zram.
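To illustrate the kind of name matching that tools have to carry without
such a property, here is a minimal Python sketch. The function name and
the pattern table are hypothetical, and the returned labels are
illustrative only; they are not necessarily the strings that end up in
ID_BLOCK_SUBSYSTEM.

```python
import re

# Hypothetical table: kernel name pattern -> subsystem label.
# The labels are illustrative, not necessarily the values udev assigns.
_PATTERNS = [
    (re.compile(r"dm-\d+"), "dm"),
    (re.compile(r"loop\d+"), "loop"),
    (re.compile(r"zram\d+"), "zram"),
]


def guess_block_subsystem(kernel_name):
    """Guess the subsystem of a virtual block device from its kernel name."""
    for pattern, label in _PATTERNS:
        if pattern.fullmatch(kernel_name):
            return label
    # Unknown name: the tool would need yet another pattern here.
    return None
```

Every new virtual block subsystem requires another entry in such a table,
which is the maintenance burden an explicit property avoids.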
So far repart always required specification of a device node. And if
none was specified, then we'd find the node backing the root fs. Let's
optionally allow that the device node is explicitly not specified (i.e.
specified as "-" or ""), in which case we'll just print the size of the
minimal image given the definitions.
This is preparation for making this eventually available via Varlink,
where we'd like to create a Context object for each call that we can free
once it is done, but not inherit state from an earlier call.
Also fixes a couple of cases where we accessed arg_node, but where we
should have accessed the Context-specific copy in .node.
The functions crypt_set_metadata_size() and friends have been supported
since libcryptsetup-2.0.
This also merges checks for functions used for supporting libcryptsetup
plugins with others.
Moreover, check for the existence of one more function (crypt_logf) that
is used in libcryptsetup plugins.
Let's use the proper uint32_t parsers initially, so that the usual logic
of formatting integers as decimal strings works for uids/gids too. Not
because it made any sense to encode them like that, but just to be
systematic here.
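As a rough illustration of "integers formatted as decimal strings", the
following Python sketch accepts either form and range-checks the result.
The helper name is hypothetical and this is not the actual parser; any
further uid/gid validity checks are deliberately left out.

```python
UINT32_MAX = 2**32 - 1


def parse_uint32(value):
    """Accept a JSON integer or a decimal string, return an int in uint32 range."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, but is not an integer here.
        raise TypeError("expected integer or decimal string")
    if isinstance(value, str):
        # Only plain ASCII digits: no sign, no whitespace, no empty string.
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"not a decimal string: {value!r}")
        value = int(value)
    elif not isinstance(value, int):
        raise TypeError("expected integer or decimal string")
    if not 0 <= value <= UINT32_MAX:
        raise ValueError(f"out of uint32 range: {value}")
    return value
```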
Most of our dispatch helpers already do something useful in case they
are invoked on a null JSON value: they translate this to the appropriate
niche value for the type, if there is one.
Add the same for *all* dispatchers we have, to make this fully
systematic.
For various types it's not always clear which niche value to pick. I
opted for UINT{8,16,32,64}_MAX for the various unsigned integers, which
matches our own use in most cases. I opted for -1 for the various signed
integer types. For arrays/blobs of stuff I opted for the empty
array/blob, and for booleans I opted for false.
Of course, in various cases this is not going to be the right niche
value, but that's entirely fine: after all, before a JSON value reaches a
dispatcher function it must pass one of two type checks first:
1. Either the .type field of sd_json_dispatch_field must be
_SD_JSON_VARIANT_TYPE_INVALID to not do a type check at all,
2. Or the .type field is set, but then the SD_JSON_NULLABLE flag must be
set in .flags.
This means that accidentally generating the niche values on null is not
really likely.
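The null handling and the two type checks above can be sketched roughly
as follows in Python. All names are hypothetical stand-ins: None plays
the role of JSON null, "expected_type is None" stands in for
_SD_JSON_VARIANT_TYPE_INVALID, and the "nullable" argument stands in for
SD_JSON_NULLABLE. This is a model of the behaviour described here, not
the actual dispatcher code.

```python
# Niche values as listed above; factories so that each call gets a fresh object.
NICHE = {
    "uint8": lambda: 2**8 - 1,
    "uint16": lambda: 2**16 - 1,
    "uint32": lambda: 2**32 - 1,
    "uint64": lambda: 2**64 - 1,
    "int64": lambda: -1,      # stands for all signed integer types
    "array": lambda: [],
    "blob": lambda: b"",
    "boolean": lambda: False,
}


def reaches_dispatcher(expected_type, nullable, value):
    """Model of the checks a value must pass before a dispatcher sees it."""
    if expected_type is None:
        return True           # no type check requested at all
    if value is None:
        return nullable       # null only passes if explicitly allowed
    return isinstance(value, expected_type)


def dispatch(kind, value):
    """Return the value, or the niche value for `kind` on null."""
    return NICHE[kind]() if value is None else value
```

A null therefore only turns into a niche value if the field definition
opted in, either by skipping the type check or by marking the field
nullable.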
Let's extract common capability parsing code into a generic function
parse_capability_set() with a comprehensive set of unit tests.
We also replace usages of UINT64_MAX with CAP_MASK_UNSET where
applicable and replace the default value of CapabilityBoundingSet
with CAP_MASK_ALL which more clearly identifies that it is initialized
to all capabilities.
AI (copilot) was used to extract the generic function and write the
unit tests, with manual review and fixing afterwards to make sure
everything was correct.
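For readers unfamiliar with capability masks, here is a minimal Python
sketch of turning capability names into a bitmask. It is not
parse_capability_set() and does not claim to match its syntax or error
handling; the function name is hypothetical, and the table is only a
small excerpt of the Linux capability numbers. CAP_MASK_UNSET is shown
with the UINT64_MAX value it replaces according to the text above.

```python
CAP_MASK_UNSET = 2**64 - 1  # marker for "no mask configured"

# Small excerpt of Linux capability numbers (see <linux/capability.h>).
CAPABILITIES = {
    "CAP_CHOWN": 0,
    "CAP_DAC_OVERRIDE": 1,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_ADMIN": 21,
}


def parse_capability_names(text):
    """Turn a whitespace-separated list of capability names into a bitmask."""
    mask = 0
    for name in text.split():
        try:
            mask |= 1 << CAPABILITIES[name.upper()]
        except KeyError:
            raise ValueError(f"unknown capability: {name}") from None
    return mask
```

Using a named constant for the unset state makes it harder to confuse
"nothing configured" with "all bits set".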
I recently found out (the hard way) that on an older version
there was a bug when the verity sharing is disabled: the
deferred close flag was not set correctly, so verity devices
were leaked.
This is not an issue in main currently, but add a test case
to cover it just in case, to avoid future regressions.
When calling systemctl enable/disable/reenable --now, we'd always fail with
an error when operating offline. This seems overly restrictive. In particular,
if systemd is not running at all, the service is not running either, so
complaining that we can't stop it is completely unnecessary. But even when
operating in a chroot where systemd is not running, let's just emit a warning
and return success. It's fairly common to have installation or package scripts
which do such calls, and not starting/restarting the service in those scenarios
is the desired and expected operation. (If --now is called in combination
with --global or --root=, keep returning an error.)
Also make the messages nicer. I was adding some docs to tell the user to run
'systemctl enable --now', and checked how the command can fail, and the error
message that the user might see in some common scenarios was too complicated.
Split it up to be nicer.
RootDirectoryFileDescriptor= is like RootDirectory= but via an
open_tree() file descriptor. This allows setting up the execution
environment for a service by the client in a mount namespace and then
starting a transient unit in that execution environment using the new
property.
We also add --root-directory= and --same-root-dir= to systemd-run to
have it run services within the given root directory. As systemd-run
might be invoked from a different mount namespace than what systemd is
running in, systemd-run opens the given path with open_tree() and then
sends it to systemd using the new RootDirectoryFileDescriptor= property.
Before aa47d8ade1, we took an exclusive lock
for the whole block device, but with the commit, a shared lock is taken.
As a result, while we are requesting the kernel to reread the partition
table, udev workers can process the block device or its partitions.
Let's again make udev workers not process block devices while the
partition table is being reread.
Follow-up for aa47d8ade1.
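The difference between the two lock modes can be sketched in Python as
follows. This assumes BSD file locks taken with flock() on Linux, and it
uses a temporary file instead of a block device. The second open of the
file stands in for a udev worker; the helper names are hypothetical.

```python
import fcntl
import tempfile


def try_lock(fileobj, operation):
    """Try to take a BSD lock without blocking; return True on success."""
    try:
        fcntl.flock(fileobj, operation | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo():
    results = {}
    with tempfile.NamedTemporaryFile() as ours:
        # An independent open of the same file, standing in for a udev worker.
        with open(ours.name, "rb") as worker:
            # We hold a shared lock: the worker's shared lock is still granted.
            try_lock(ours, fcntl.LOCK_SH)
            results["shared_vs_shared"] = try_lock(worker, fcntl.LOCK_SH)
            fcntl.flock(worker, fcntl.LOCK_UN)
            # We hold an exclusive lock: the worker cannot lock at all.
            try_lock(ours, fcntl.LOCK_EX)
            results["exclusive_vs_shared"] = try_lock(worker, fcntl.LOCK_SH)
    return results
```

With only a shared lock held, a second shared lock is granted, which is
why another party can proceed while the partition table is being reread.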