If we have a DDI that contains only a /usr/ tree (and which is thus
combined with a tmpfs for root on boot) we previously would try to apply
idmapping to the tmpfs, but not the /usr/ mount. That's broken of
course.
Fix this by applying it to both trees.
We use it for more than just pipe() arrays. For example also for
socketpair(). Hence let's give it a generic name.
Also add EBADF_TRIPLET to mirror this for things like
stdin/stdout/stderr arrays, which we use a bunch of times.
If systemd-nspawn is newer than the running systemd, we might try to set
CoredumpReceive=yes when systemd doesn't know about it yet. Try and
check if the running systemd is aware of this setting, and if not, don't
try and use it.
Fixes 411d8c72ec
("nspawn: set CoredumpReceive=yes on container's scope when --boot is set").
When --boot is set, and --keep-unit is not, set CoredumpReceive=yes on
the scope allocated for the container. When --keep-unit is set, nspawn
does not allocate the container's unit, so the existing unit needs to
configure this setting itself.
Since systemd-nspawn@.service sets --boot and --keep-unit, add
CoredumpReceives=yes to that unit.
The naming was confused: suffix 'p' means that the function takes a pointer to
the type that the wrapped function takes. (E.g., a char**, for a wrapped function
taking a char*.) But DEFINE_TRIVIAL_DESTRUCTOR() just changes the return type.
Also add one more assert for consistency.
This adds support for the new fsmount() logic of the kernel: we'll first
create an unattached fsmount fd, and then in a second step attach this
to some real file system inode – as opposed to attaching file system
directly. The benefit of this is that we can pass the open fsmount fds
over some sockets if need be, to isolate the mounting code from the
attaching code.
No functional change. In config_parse_address_generation_type() we would set
the output parameter and then say it's ignored, so it _looked_ like an error in
the code, but the variable was always initialized to SD_ID128_NULL anyway, so
the code was actually fine.
seccomp-util.h doesn't need ifdeffing, hence don't. It has worked since
quite a while with HAVE_SECCOMP is off, hence use it everywhere.
Also drop explicit seccomp.h inclusion everywhere (which needs
HAVE_SECCOMP ifdeffery everywhere). seccomp-util.h includes it anyway,
automatically, which we can just rely on, and it deals with HAVE_SECCOMP
at one central place.
Given that ERRNO_IS_SECCOMP_FATAL() also matches positive values,
make sure this macro is not called with arguments that do not have
errno semantics.
In this case the arguments passed to ERRNO_IS_SECCOMP_FATAL() are the
values returned by external libseccomp function seccomp_load() which is
not expected to return any positive values, but let's be consistent
anyway and move ERRNO_IS_SECCOMP_FATAL() invocations to the branches
where the return values are known to be negative.
Given that ERRNO_IS_NOT_SUPPORTED() also matches positive values,
make sure this macro is not called with arguments that do not have
errno semantics.
In this case the argument passed to ERRNO_IS_NOT_SUPPORTED() is the
value returned by remount_idmap() which is not expected to return
any positive values, but let's be consistent anyway and move the
ERRNO_IS_NOT_SUPPORTED() invocation to the branch where
the return value is known to be negative.
The check added by 4c27749b8c breaks
booting an arm64 image on x86 using qemu-bin-fmt, so remove it.
Without it, the image built with mkosi --architecture=aarch64
boots fine in nspawn.
Interface=/MACVLAN=/IPVLAN= nspawn options take a _list_ of interface
names - this was recently enhanced by 2f091b1b49 to support interface
pairs. Unfortunately, this also introduced a regression where we don't
parse the list as a list, but just as a single value. For example,
having `Interface=sd-shared1 sd-shared2` in an nspawn config file would
throw:
systemd-nspawn[898]: Network interface, interface name not valid: sd-shared1 sd-shared2
systemd-nspawn[898]: /run/systemd/nspawn/testsuite-13.nspawn-settings.1po.nspawn:41: Failed to parse file: Invalid argument
Follow-up to 2f091b1b49.
Otherwise hilarity ensues:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==722==ERROR: AddressSanitizer: SEGV on unknown address 0xffffffff00000000 (pc 0x7f8d50ca9ffb bp 0x7fff11b0d4a0 sp 0x7fff11b0cc30 T0)
==722==The signal is caused by a READ memory access.
#0 0x7f8d50ca9ffb in __interceptor_strcmp.part.0 (/lib64/libasan.so.8+0xa9ffb)
#1 0x7f8d4f9cf5a1 in strcmp_ptr ../src/fundamental/string-util-fundamental.h:33
#2 0x7f8d4f9cf5f8 in streq_ptr ../src/fundamental/string-util-fundamental.h:46
#3 0x7f8d4f9d74d2 in free_and_strdup ../src/basic/string-util.c:948
#4 0x49139a in free_and_strdup_warn ../src/basic/string-util.h:197
#5 0x4923eb in oci_absolute_path ../src/nspawn/nspawn-oci.c:139
#6 0x7f8d4f6bd359 in json_dispatch ../src/shared/json.c:4395
#7 0x4a8831 in oci_hooks_array ../src/nspawn/nspawn-oci.c:2089
#8 0x7f8d4f6bd359 in json_dispatch ../src/shared/json.c:4395
#9 0x4a8b56 in oci_hooks ../src/nspawn/nspawn-oci.c:2112
#10 0x7f8d4f6bd359 in json_dispatch ../src/shared/json.c:4395
#11 0x4aa298 in oci_load ../src/nspawn/nspawn-oci.c:2197
#12 0x446cec in load_oci_bundle ../src/nspawn/nspawn.c:4744
#13 0x44ffa7 in run ../src/nspawn/nspawn.c:5477
#14 0x4552fb in main ../src/nspawn/nspawn.c:5920
#15 0x7f8d4e04a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
#16 0x7f8d4e04a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
#17 0x40d284 in _start (/usr/bin/systemd-nspawn+0x40d284)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib64/libasan.so.8+0xa9ffb) in __interceptor_strcmp.part.0
==722==ABORTING
Use the returned errno even though we are going to ignore it, otherwise
the log message is just confusing:
config.json:119:13: Failed to resolve device node 4:2, ignoring: Success
When merging the settings we take the pointer to the array of extra
devices, but don't reset the array counter to zero. This later leads to
a NULL pointer dereference, where device_node_array_free() attempts to
loop over a NULL pointer:
+ systemd-nspawn --oci-bundle=/var/lib/machines/testsuite-13.oci-bundle.Npo
../src/nspawn/nspawn-settings.c:118:29: runtime error: member access within null pointer of type 'struct DeviceNode'
#0 0x4b91ee in device_node_array_free ../src/nspawn/nspawn-settings.c:118
#1 0x4ba42a in settings_free ../src/nspawn/nspawn-settings.c:161
#2 0x410b79 in settings_freep ../src/nspawn/nspawn-settings.h:249
#3 0x446ce8 in load_oci_bundle ../src/nspawn/nspawn.c:4733
#4 0x44ff42 in run ../src/nspawn/nspawn.c:5476
#5 0x455296 in main ../src/nspawn/nspawn.c:5919
#6 0x7f0cb7a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
#7 0x7f0cb7a4a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
#8 0x40d284 in _start (/usr/bin/systemd-nspawn+0x40d284)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/nspawn/nspawn-settings.c:118:29 in
Also, add an appropriate assert to catch such issues in the future.
Let's make use of the new DelegateSubgroup= feature and delegate the
/supervisor/ subcgroup already to nspawn, so that moving the supervisor
process there is unnecessary.