Commit Graph

1331 Commits

Author SHA1 Message Date
Yu Watanabe
4788f635e3 Merge pull request #26203 from medhefgo/meson
meson: Use dicts for test/fuzzer definitions
2023-02-22 10:27:16 +09:00
Jan Janssen
2ed35b2f3e meson: Use dicts for fuzzer definitions 2023-02-21 15:10:26 +01:00
Jan Janssen
822cd3ff25 meson: Use dicts for test definitions
Although this slightly more verbose it makes it much easier to reason
about. The code that produces the tests heavily benefits from this.

Test lists are also now sorted by test name.
2023-02-21 15:10:26 +01:00
Yu Watanabe
0c2aedb451 tree-wide: use FORK_REARRANGE_STDIO and FORK_CLOSE_ALL_FDS 2023-02-21 07:39:18 +09:00
Dmitry V. Levin
30fd9a2dab treewide: fix a few typos in NEWS, docs and comments 2023-02-15 10:41:03 +00:00
Dmitry V. Levin
8d3473f01d src: fix several typos in log messages 2023-02-15 10:41:03 +00:00
ml
7b03b44ed9 nspawn: fix directory in logged error 2023-02-12 00:22:52 +01:00
Yu Watanabe
f3f2d02e97 tree-wide: set FORK_RLIMIT_NOFILE_SAFE flag
No functional changes, just refactoring.
2023-02-07 14:39:49 +09:00
Lennart Poettering
74e795ee55 id128: introduce ERRNO_IS_MACHINE_ID_UNSET() helper macro 2023-02-01 15:25:30 +01:00
Daan De Meyer
0a67965fa2 nspawn: Make sure we create bind mount points as the correct UID/GID
When using --private-users, we have to create bind mount points as
the user that will become root in the user namespace, so let's take
that into account.
2023-01-29 08:59:19 +01:00
Daan De Meyer
2642d22adc nspawn: Drop CAP_NET_BIND_SERVICE when in userns but not in netns
If we're in a user namespace but not unsharing the network namespace,
we won't be able to bind any privileged ports even with
CAP_NET_BIND_SERVICE, so let's drop it from the retained capabilities
so services can condition themselves on that.
2023-01-26 22:18:47 +01:00
Jan Janssen
4a7ee0a521 meson: Do not include headers in source lists
Meson+ninja+compiler do this for us and are better at it.

https://mesonbuild.com/FAQ.html#do-i-need-to-add-my-headers-to-the-sources-list-like-in-autotools
2023-01-24 22:04:03 +01:00
Lennart Poettering
162f6477c6 path-util: rework file_in_same_dir() on top of path_extract_directory()
Let's port one more over.

Note that this changes behaviour of file_in_same_dir() in some regards.
Specifically, a trailing slash of the input path will be treated
differently: previously we'd operate below that dir then, instead of the
parent. I think that makes little sense however, and I think the code
using this function doesn't expect that either.

Moroever, addresses some corner cases if the path is specified as "/" or
".", i.e. where e cannot extract a parent. These will now be treated as
error, which I think is much cleaner.
2023-01-24 18:13:27 +01:00
Lennart Poettering
22ee78a898 loop-util: always tell kernel explicitly about loopback sector size
Let's not leave the sector size unspecified: either set a user supplied
value, or auto-detect the right size by probing the disk image
accordingly.
2023-01-18 10:47:17 +01:00
Lennart Poettering
34680637e8 nspawn: guard acl_free() with a NULL check
Inspired by #25957 there's one other place where we don't guard
acl_free() calls with a NULL check.

Fix that.
2023-01-06 14:59:09 +01:00
Lennart Poettering
b36e39d2eb nspawn: port over basename() → path_extract_filename() 2022-12-23 15:04:19 +01:00
Yu Watanabe
5bb1d7fbab tree-wide: use -EBADF more 2022-12-21 01:50:33 +09:00
Yu Watanabe
19ee48a6c2 tree-wide: introduce PIPE_EBADF macro 2022-12-20 11:12:58 +09:00
Zbigniew Jędrzejewski-Szmek
254d1313ae tree-wide: use -EBADF for fd initialization
-1 was used everywhere, but -EBADF or -EBADFD started being used in various
places. Let's make things consistent in the new style.

Note that there are two candidates:
EBADF 9 Bad file descriptor
EBADFD 77 File descriptor in bad state

Since we're initializating the fd, we're just assigning a value that means
"no fd yet", so it's just a bad file descriptor, and the first errno fits
better. If instead we had a valid file descriptor that became invalid because
of some operation or state change, the other errno would fit better.

In some places, initialization is dropped if unnecessary.
2022-12-19 15:00:57 +01:00
Yu Watanabe
9d50f8508b mount-util: make mount_switch_root() take a mount propagation flag 2022-12-15 14:17:22 +09:00
Yu Watanabe
a6e16d949c Merge pull request #25723 from keszybz/generators-tmp
Run generators with / ro and /tmp mounted
2022-12-15 12:53:49 +09:00
Zbigniew Jędrzejewski-Szmek
9f563f2792 tree-wide: use mode=0nnn for mount option
This is an octal number. We used the 0 prefix in some places inconsistently.
The kernel always interprets in base-8, so this has no effect, but I think
it's nicer to use the 0 to remind the reader that this is not a decimal number.
2022-12-14 22:12:44 +01:00
Christian Brauner
fefb7a6def nspawn: remove cgroup socket
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 16:03:30 +01:00
Christian Brauner
bb1aa18569 nspawn: remove pty socket
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 16:03:30 +01:00
Christian Brauner
b07ee9035e nspawn: remove rtnl socket
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 16:03:30 +01:00
Christian Brauner
5d9d3fcb18 nspawn: s/kmsg_socket_pair/fd_inner_socket_pair/g
Also stop stashing the kmsg fifo fd in the socket. Just retrieve it in
the parent and have the parent hold on to it.

Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 15:25:46 +01:00
Christian Brauner
af06cd3024 nspawn: s/fd_socket_pair/fd_outer_socket_pair/g
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 15:25:46 +01:00
Christian Brauner
525f4e59db nspawn: remove uid socket
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 15:25:46 +01:00
Christian Brauner
1823d92d7b nspawn: remove uuid socket
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 15:25:45 +01:00
Christian Brauner
b1e1d1fa48 nspawn: remove pid socket
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 15:25:45 +01:00
Christian Brauner
cc44af4f59 nspawn: s/notify_socket/fd_socket/g
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-13 15:25:45 +01:00
Zbigniew Jędrzejewski-Szmek
d70eaf3067 nspawn: realign columns
Follow-up for b9e7f22c2d.
2022-12-13 11:36:11 +01:00
Yu Watanabe
b40c8ebdc8 sd-id128: fold do_sync flag into Id128FormatFlag 2022-12-12 22:07:48 +09:00
Yu Watanabe
057bf780e9 sd-id128: make id128_read() or friends return -ENOPKG when the file contents is "uninitialized"
Then, this drops ID128_PLAIN_OR_UNINIT. Also, this renames
Id128Format -> Id128FormatFlag, and make it bitfield.

Fixes #25634.
2022-12-12 21:57:31 +09:00
Luca Boccassi
9cd4881d47 Merge pull request #25513 from brauner/pivot_root.nspawn
nspawn: support pivot_root()
2022-12-06 01:51:51 +01:00
Christian Brauner
e79581ddfe nspawn: split mount tunnel setup
Before we supported pivot_root() nspawn used to make the rootfs shared
before setting up the mount tunnel. So it was safe for it to just turn
it into a dependent mount during setup.

However, we cannot do this anymore because of the requirements
pivot_root() has. After the pivot_root() we will make the rootfs shared
recursively. If we turned the mount tunnel into dependent mount before
mount_switch_root() this will have the consequence that it becomes a
shared mount within the same peer group as the rootfs. So no mounts will
propagate into the container from the host anymore.

To fix this we split setting up the mount tunnel and making it active
into two steps. Setting up the mount tunnel is performed before
mount_switch_root() and activating it afterwards. Note that this works
because turning a shared mount into a shared mount is a nop. IOW, no new
peer group will be allocated.

Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-05 18:35:02 +01:00
Christian Brauner
b71a0192c0 nspawn: mount temporary visible procfs and sysfs instance
In order to mount procfs and sysfs in an unprivileged container the
kernel requires that a fully visible instance is already present in the
target mount namespace. Mount one here so the inner child can mount its
own  instances. Later we umount the temporary  instances created here
before we actually exec the payload. Since the rootfs is shared the
umount will propagate into the container. Note, the inner child wouldn't
be able to unmount the  instances on its own since it doesn't own the
originating mount namespace. IOW, the outer child needs to do this.

So far nspawn didn't run into this issue because it used MS_MOVE which
meant that the shadow mount tree pinned a procfs and sysfs instance
which the kernel would find. The shadow mount tree is gone with proper
pivot_root() semantics.

Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-05 18:34:25 +01:00
Christian Brauner
57c10a5650 nspawn: support pivot_root()
In order to support pivot_root() we need to move mount propagation
changes after the pivot_root(). While MS_MOVE requires the source mount
to not be a shared mount pivot_root() also requires the target mount to
not be a shared mount. This guarantees that pivot_root() doesn't leak
any mounts.

Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-12-05 18:34:25 +01:00
Yu Watanabe
6c2d70ce9f tree-wide: fix typo 2022-12-02 13:27:08 +09:00
Phaedrus Leeds
c85c2f7930 nspawn: Use "Ctrl-" rather than "^" in info msg
Maybe most people know that "^]" means "Ctrl + ]" but for those that
don't, this should be more clear.
2022-12-02 08:28:04 +09:00
Lennart Poettering
73d88b806b dissect: rework DISSECT_IMAGE_ADD_PARTITION_DEVICES + DISSECT_IMAGE_OPEN_PARTITION_DEVICES
Curently, these two flags were implied by dissect_loop_device(), but
that's not right, because this means systemd-gpt-auto-generator will
dissect the root block device with these flags set and that's not
desirable: the generator should not cause the partition devices to be
created (we don't intend to use them right-away after all, but expect
udev to find/probe them first, and then mount them though .mount units).
And there's no point in opening the partition devices, since we do not
intend to mount them via fds either.

Hence, rework this: instead of implying the flags, specify them
explicitly.

While we are at it, let's also rename the flags to make them more
descriptive:

DISSECT_IMAGE_MANAGE_PARTITION_DEVICES becomes
DISSECT_IMAGE_ADD_PARTITION_DEVICES, since that's really all this does:
add the partition devices via BLKPG.

DISSECT_IMAGE_OPEN_PARTITION_DEVICES becomes
DISSECT_IMAGE_PIN_PARTITION_DEVICES, since we not only open the devices,
but keep the devices open continously (i.e. we "pin" them).

Also, drop the DISSECT_IMAGE_BLOCK_DEVICE combination flag, since it is
misleading, i.e. it suggests it was appropriate to specify on all
dissected blocking devices, but that's precisely not the case, see the
systemd-gpt-auto-generator case. My guess is that the confusion around
this was actually the cause for this bug we are addressing here.

Fixes: #25528
2022-12-01 11:32:30 +01:00
Luca Boccassi
a0c544ee09 Merge pull request #25379 from keszybz/update-doc-links
Update doc links
2022-11-22 01:07:13 +01:00
Zbigniew Jędrzejewski-Szmek
db81144428 tree-wide: BLS and DPS are now on uapi-group website 2022-11-21 12:26:35 +01:00
Sam James
b9e7f22c2d nspawn: allow sched_rr_get_interval_time64 through seccomp filter
We only allow a selected subset of syscalls from nspawn containers
and don't list any time64 variants (needed for 32-bit arches when
built using TIME_BITS=64, which is relatively new).

We allow sched_rr_get_interval which cpython's test suite makes
use of, but we don't allow sched_rr_get_interval_time64.

The test failures when run in an arm32 nspawn container on an arm64 host
were as follows:
```
======================================================================
ERROR: test_sched_rr_get_interval (test.test_posix.PosixTester.test_sched_rr_get_interval)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/dev-lang/python-3.11.0_p1/work/Python-3.11.0/Lib/test/test_posix.py", line 1180, in test_sched_rr_get_interval
    interval = posix.sched_rr_get_interval(0)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 1] Operation not permitted
```

Then strace showed:
```
sched_rr_get_interval_time64(0, 0xffbbd4a0) = -1 EPERM (Operation not permitted)
```

This appears to be the only time64 syscall that isn't already included one of
the sets listed in nspawn-seccomp.c that has a non-time64 variant. Checked
over each of the time64 syscalls known to systemd and verified that none
of the others had a non-time64-variant whitelisted in nspawn other than
sched_rr_get_interval.

Bug: https://bugs.gentoo.org/880131
2022-11-18 16:32:17 +01:00
Daan De Meyer
12e2b70f9b nulstr-util: Declare NULSTR_FOREACH() iterator inline 2022-11-11 16:31:32 +01:00
Zbigniew Jędrzejewski-Szmek
28db6fbff1 Rename def.h to constants.h
The name "def.h" originates from before the rule of "no needless abbreviations"
was established. Let's rename the file to clarify that it contains a collection
of various semi-related constants.
2022-11-08 18:21:10 +01:00
Zbigniew Jędrzejewski-Szmek
3ae6b3bf72 basic: rename util.h to logarithm.h
util.h is now about logarithms only, so we can rename it. Many files included
util.h for no apparent reason… Those includes are dropped.
2022-11-08 18:21:10 +01:00
Zbigniew Jędrzejewski-Szmek
ee617a4e5c basic: move a bunch of cmdline-related funcs to new argv-util.c+h
I wanted to move saved_arg[cv] to process-util.c+h, but this causes problems:
process-util.h includes format-util.h which includes net/if.h, which conflicts
with linux/if.h. So we can't include process-util.h in some files.

But process-util.c is very long anyway, so it seems nice to create a new file.
rename_process(), invoked_as(), invoked_by_systemd(), and argv_looks_like_help()
which lived in process-util.c refer to saved_argc and saved_argv, so it seems
reasonable to move them to the new file too.

util.c is now empty, so it is removed. util.h remains.
2022-11-08 18:21:10 +01:00
Zbigniew Jędrzejewski-Szmek
d6b4d1c7c4 basic: move version() to build.h+c 2022-11-08 13:41:14 +01:00
Christian Brauner
f7a2dc3dd5 nspawn: use in_same_namespace() helper 2022-10-04 18:51:30 +02:00