Commit Graph

1430 Commits

Author SHA1 Message Date
Matteo Croce
e22ad53d5c dbus-wait-for-jobs: change 'quiet' flag to enum
Change the 'quiet' flag to `bus_wait_for_jobs()` to an enum, so we can
select with more granularity the type of information logged.
2023-12-19 04:52:41 -08:00
Lennart Poettering
e22ca70008 ether-addr-util: split out logic to mark MAC addresses as random 2023-12-19 11:48:05 +09:00
Mike Yuan
bd546b9b48 machine-credential: introduce MachineCredentialContext
This allows more straightforward memory management and
the use of static destructor.

Requested (by me) in https://github.com/systemd/systemd/pull/30143#discussion_r1401980763
2023-12-14 08:50:44 +00:00
Lennart Poettering
21c43631d7 rlimit-util: add pid_getrlimit() helper
This is gets the resource limits off a specified process, and is very
similar to prlimit() with a NULL new_rlimit argument. In fact, it tries
that first. However, it then falls back to use /proc/$PID/limits. Why?
Simply because Linux prohibits access to prlimit() for processes with a
different UID, but /proc/$PID/limits still works.

This is preparation to allow nspawn to run unprivileged.
2023-12-14 08:31:29 +00:00
Lennart Poettering
2c70a81de6 cgroup: bring list of delegated cgroup attributes up-to-date with current kernels
THis brings the list of attributes to delegate to managers of subcgroups
to the state of kernel 6.6.

We probably should unify this list, and maybe generate it automatically
from /sys/kernel/cgroup/delegate, but let's do that another time.
2023-12-13 09:58:45 +01:00
Lennart Poettering
af255804b2 nspawn: drop redundant assignments 2023-12-12 12:06:34 +01:00
Lennart Poettering
1f87cc8cd9 nspawn: suffix some paths in log messages with /, as per coding style 2023-12-12 12:06:21 +01:00
Daan De Meyer
dd78141c53 nspawn: Check later whether to keep/drop CAP_NET_BIND_SERVICE
Currently the check doesn't take any settings from nspawn settings
files into account, so let's delay the check until after we've
loaded any settings file.
2023-12-06 23:07:26 +00:00
Lennart Poettering
6498a0c2cc user-util: add new helper fully_set_uid_gid()
Usually when we do setresuid() we also do setesgid() and setgroups().
Let's add a common helper that does all three, and use it everywhere.
2023-12-06 22:11:38 +01:00
Zbigniew Jędrzejewski-Szmek
e6c5386dee core: turn on higher optimization level in seccomp
This mirrors what d75615f398 did for nspawn.

It isn't really a fatal failure if we can't set that, so ignore it in libseccomp
cannot set the attribute.

 line  OP   JT   JF   K
=================================
 0000: 0x20 0x00 0x00 0x00000004   ld  $data[4]
 0001: 0x15 0x00 0xb7 0x40000003   jeq 1073741827 true:0002 false:0185
 0002: 0x20 0x00 0x00 0x00000000   ld  $data[0]
 0003: 0x15 0xb5 0x00 0x00000000   jeq 0    true:0185 false:0004
 0004: 0x15 0xb4 0x00 0x00000001   jeq 1    true:0185 false:0005
 0005: 0x15 0xb3 0x00 0x00000002   jeq 2    true:0185 false:0006
 0006: 0x15 0xb2 0x00 0x00000003   jeq 3    true:0185 false:0007
 0007: 0x15 0xb1 0x00 0x00000004   jeq 4    true:0185 false:0008
 0008: 0x15 0xb0 0x00 0x00000005   jeq 5    true:0185 false:0009
 0009: 0x15 0xaf 0x00 0x00000006   jeq 6    true:0185 false:0010
 ...
 0438: 0x15 0x03 0x00 0x000001be   jeq 446  true:0442 false:0439
 0439: 0x15 0x02 0x00 0x000001bf   jeq 447  true:0442 false:0440
 0440: 0x15 0x01 0x00 0x000001c0   jeq 448  true:0442 false:0441
 0441: 0x06 0x00 0x00 0x00050026   ret ERRNO(38)
 0442: 0x06 0x00 0x00 0x7fff0000   ret ALLOW

 line  OP   JT   JF   K
=================================
 0000: 0x20 0x00 0x00 0x00000004   ld  $data[4]
 0001: 0x15 0x00 0x27 0x40000003   jeq 1073741827 true:0002 false:0041
 0002: 0x20 0x00 0x00 0x00000000   ld  $data[0]
 0003: 0x25 0x01 0x00 0x000000b5   jgt 181  true:0005 false:0004
 0004: 0x05 0x00 0x00 0x00000143   jmp 0328
 0005: 0x25 0x00 0xa1 0x00000139   jgt 313  true:0006 false:0167
 0006: 0x25 0x00 0x51 0x00000179   jgt 377  true:0007 false:0088
 0007: 0x25 0x00 0x29 0x000001a0   jgt 416  true:0008 false:0049
 0008: 0x25 0x00 0x13 0x000001b0   jgt 432  true:0009 false:0028
 0009: 0x25 0x00 0x09 0x000001b8   jgt 440  true:0010 false:0019
 ...
 0551: 0x15 0x03 0x00 0x00000002   jeq 2    true:0555 false:0552
 0552: 0x15 0x02 0x01 0x00000001   jeq 1    true:0555 false:0554
 0553: 0x15 0x01 0x00 0x00000000   jeq 0    true:0555 false:0554
 0554: 0x06 0x00 0x00 0x00050026   ret ERRNO(38)
 0555: 0x06 0x00 0x00 0x7fff0000   ret ALLOW

The program is longer but hopefully faster because of the binary search.
2023-12-02 01:21:53 +01:00
Mike Yuan
0cdffada3d nspawn-patch-uid: clarify that changing mode of symlink is unsupported 2023-11-25 19:12:20 +08:00
Mike Yuan
677e644530 Revert "nspawn-patch-uid: try fchmodat2() to restore mode of symlink"
This reverts commit 30462563b1.

fchmodat2(), while accepting AT_SYMLINK_NOFOLLOW as a valid flag,
always returns EOPNOTSUPP when operating on a symlink. The Linux kernel
simply doesn't support changing the mode of a symlink.

Fixes #30157
2023-11-25 19:12:15 +08:00
Lennart Poettering
6045958bab machine-credential: fix error logging
Remove duplicate logging: let exclusively
machine_credential_load()/machine_credential_set() log, and not the
caller again.
2023-11-22 15:16:32 +01:00
Yu Watanabe
965040d811 test: always call test_setup_logging() 2023-11-18 03:04:27 +09:00
Frantisek Sumsal
d4317fe172 nspawn: allow disabling os-release check
Introduce a new env variable $SYSTEMD_NSPAWN_CHECK_OS_RELEASE, that can
be used to disable the os-release check for bootable OS trees. Useful
when trying to boot a container with empty /etc/ and bind-mounted /usr/.

Resolves: #29185
2023-11-03 16:05:14 +00:00
Luca Boccassi
1af46aecf5 Merge pull request #29508 from CodethinkLabs/systemd-vmspawn-pr
systemd-vmspawn implementation that only supports disk images
2023-11-03 16:04:38 +00:00
Lennart Poettering
41de458aed nspawn: fix two failure paths
We need to go to "finish" rather than just return.

All our exit paths got this right, except two.
2023-11-03 14:39:46 +01:00
Sam Leonard
e8ac916ec3 nspawn: moved nspawn-creds.[ch] to shared/machine-credential.[ch]
This was done in order to allow sharing of this code between
systemd-nspawn and systemd-vmspawn.
2023-11-02 15:37:40 +00:00
Lennart Poettering
e9ccae3135 process-util: add new FORK_DEATHSIG_SIGKILL flag, rename FORK_DEATHSIG → FORK_DEATHSIG_SIGTERM
Sometimes it makes sense to hard kill a client if we die. Let's hence
add a third FORK_DEATHSIG flag for this purpose: FORK_DEATHSIG_SIGKILL.

To make things less confusing this also renames FORK_DEATHSIG to
FORK_DEATHSIG_SIGTERM to make clear it sends SIGTERM. We already had
FORK_DEATHSIG_SIGINT, hence this makes things nicely symmetric.

A bunch of users are switched over for FORK_DEATHSIG_SIGKILL where we
know it's safe to abort things abruptly. This should make some kernel
cases more robust, since we cannot get confused by signal masks or such.

While we are at it, also fix a bunch of bugs where we didn't take
FORK_DEATHSIG_SIGINT into account in safe_fork()
2023-11-02 14:09:23 +01:00
Lennart Poettering
f1b622a00c varlink,json: introduce new varlink_dispatch() helper
varlink_dispatch() is a simple wrapper around json_dispatch() that
returns clean, standards-compliant InvalidParameter error back to
clients, if the specified JSON cannot be parsed properly.

For this json_dispatch() is extended to return the offending field's
name. Because it already has quite a few parameters, I then renamed
json_dispatch() to json_dispatch_full() and made json_dispatch() a
wrapper around it that passes the new argument as NULL. While doing so I
figured we should also get rid of the bad= argument in the short
wrapper, since it's only used in the OCI code.

To simplify the OCI code this adds a second wrapper oci_dispatch()
around json_dispatch_full(), that fills in bad= the way we want.

Net result: instead of one json_dispatch() call there are now:

1. json_dispatch_full() for the fully feature mother of all dispathers.
2. json_dispatch() for the simpler version that you want to use most of
   the time.
3. varlink_dispatch() that generates nice Varlink errors
4. oci_dispatch() that does the OCI specific error handling

And that's all there is.
2023-11-02 01:19:21 +00:00
Arseny Maslennikov
30462563b1 nspawn-patch-uid: try fchmodat2() to restore mode of symlink
Prior to this commit, if the target had been a symlink, we did nothing
with it. Let's try with fchmodat2() and skip gracefully if not supported.

Co-authored-by: Mike Yuan <me@yhndnzj.com>
2023-11-02 00:29:09 +08:00
Lennart Poettering
5c2597ab07 Merge pull request #29788 from poettering/nspawn-barrier-fix
nspawn: fix barriers when wiping fully visible procfs/sysfs
2023-11-01 15:20:15 +01:00
Lennart Poettering
dba4fa8910 nspawn: make sure idmapped logic works if DDI contains only /usr/ tree
If we have a DDI that contains only a /usr/ tree (and which is thus
combined with a tmpfs for root on boot) we previously would try to apply
idmapping to the tmpfs, but not the /usr/ mount. That's broken of
course.

Fix this by applying it to both trees.
2023-11-01 00:50:43 +00:00
Lennart Poettering
1a8d781495 nspawn: fix barriers when wiping fully visible procfs/sysfs
Let's wait until the child is fully done with mounting it's own
instances of procfs/sysfs before we destroy our fully visible copies of
it.

This borrows heavily from Christian Brauners fix #29521, but splits the
place + sync into two steps so that the child payload is not started
before the parent has destroyed the procfs instance.

Alternative to: #29521
Fixes: #28157
2023-10-31 15:33:49 +01:00
Lennart Poettering
7113640493 fd-uitl: rename PIPE_EBADF → EBADF_PAIR, and add EBADF_TRIPLET
We use it for more than just pipe() arrays. For example also for
socketpair(). Hence let's give it a generic name.

Also add EBADF_TRIPLET to mirror this for things like
stdin/stdout/stderr arrays, which we use a bunch of times.
2023-10-26 22:30:42 +02:00
Raul Cheleguini
5e21da878c nspawn: Make parameter provided_mac a const for setup_veth() 2023-10-26 21:17:29 +01:00
Raul Cheleguini
813dbff4d5 nspawn: allow user-specified MAC address on container side
Introduce the environment variable SYSTEMD_NSPAWN_NETWORK_MAC to allow
user-specified MAC address on container side.
2023-10-25 13:59:46 +01:00
Frantisek Sumsal
4820c9d417 fuzz: unify logging setup
Make sure we don't log anything when running in "fuzzing" mode. Also,
when at it, unify the setup logic into a helper, pretty similar to
the test_setup_logging() one.

Addresses:
  - https://github.com/systemd/systemd/pull/29558#pullrequestreview-1676060607
  - https://github.com/systemd/systemd/pull/29558#discussion_r1358940663
2023-10-19 10:05:20 +01:00
Nick Rosbrook
869c1cf88f nspawn: check if we can set CoredumpReceive= before doing so
If systemd-nspawn is newer than the running systemd, we might try to set
CoredumpReceive=yes when systemd doesn't know about it yet. Try and
check if the running systemd is aware of this setting, and if not, don't
try and use it.

Fixes 411d8c72ec
("nspawn: set CoredumpReceive=yes on container's scope when --boot is set").
2023-10-16 22:53:50 +02:00
Nick Rosbrook
411d8c72ec nspawn: set CoredumpReceive=yes on container's scope when --boot is set
When --boot is set, and --keep-unit is not, set CoredumpReceive=yes on
the scope allocated for the container. When --keep-unit is set, nspawn
does not allocate the container's unit, so the existing unit needs to
configure this setting itself.

Since systemd-nspawn@.service sets --boot and --keep-unit, add
CoredumpReceives=yes to that unit.
2023-10-13 15:28:50 -04:00
Zbigniew Jędrzejewski-Szmek
f95c9f46e2 nspawn: drop unnecessary wrapper functions
The naming was confused: suffix 'p' means that the function takes a pointer to
the type that the wrapped function takes. (E.g., a char**, for a wrapped function
taking a char*.)  But DEFINE_TRIVIAL_DESTRUCTOR() just changes the return type.

Also add one more assert for consistency.
2023-10-06 16:45:49 +02:00
Lennart Poettering
7eda208ffe tree-wide: prefer sending pifds over pids when creating scope units 2023-10-05 17:10:00 +02:00
Yu Watanabe
fcdd21ec6a tree-wide: fix typo 2023-10-04 08:58:10 +09:00
Lennart Poettering
8d9a1d5979 dissect-image: optionally allow mounting via new kernel mount API in two steps
This adds support for the new fsmount() logic of the kernel: we'll first
create an unattached fsmount fd, and then in a second step attach this
to some real file system inode – as opposed to attaching file system
directly. The benefit of this is that we can pass the open fsmount fds
over some sockets if need be, to isolate the mounting code from the
attaching code.
2023-10-02 14:02:32 +01:00
Mike Yuan
e22c60a9d5 io-util: introduce loop_write_full that takes a timeout
Also drop do_poll as the use case is covered
by timeout.
2023-09-07 20:30:44 +08:00
Zbigniew Jędrzejewski-Szmek
aea3f594db various: use id128_from_string_not_null()
No functional change. In config_parse_address_generation_type() we would set
the output parameter and then say it's ignored, so it _looked_ like an error in
the code, but the variable was always initialized to SD_ID128_NULL anyway, so
the code was actually fine.
2023-09-02 14:16:25 +03:00
Yu Watanabe
927e20fa49 nspawn: check validity of the internal interface name only explicitly specified
Follow-up for 2f091b1b49.

Fixes #28844.
2023-08-24 15:55:32 +02:00
Lennart Poettering
e2fc0a7222 tree-wide: don't ifdef seccomp-util.h, drop seccomp.h inclusion everywhere
seccomp-util.h doesn't need ifdeffing, hence don't. It has worked since
quite a while with HAVE_SECCOMP is off, hence use it everywhere.

Also drop explicit seccomp.h inclusion everywhere (which needs
HAVE_SECCOMP ifdeffery everywhere). seccomp-util.h includes it anyway,
automatically, which we can just rely on, and it deals with HAVE_SECCOMP
at one central place.
2023-08-21 18:50:29 +02:00
David Tardon
9aad490e53 tree-wide: use LIST_CLEAR() 2023-08-17 09:48:17 +02:00
Zbigniew Jędrzejewski-Szmek
3c098014f5 nspawn,shared: make ERRNO_IS_SECCOMP_FATAL an inline func with _NEG_ variant
Also rebreak comments and lines.

No functional change.
2023-08-16 12:52:56 +02:00
Zbigniew Jędrzejewski-Szmek
bb44fd0734 various: use _NEG_ macros to reduce indentation
No functional change intended.
2023-08-16 12:52:56 +02:00
Xiaotian Wu
f9d3fb6b5e seccomp: add LoongArch 64bit support 2023-08-09 08:50:07 +08:00
Yu Watanabe
cbc55c4cce meson: also merge declarations of fuzzers with other executables 2023-08-03 20:37:16 +09:00
Yu Watanabe
130c87b16a meson: merge declarations of normal and test executables 2023-08-03 20:37:16 +09:00
Yu Watanabe
eb51c09d13 meson: move declarations of modules-load, nspawn, update-done, and update-utmp 2023-08-01 21:37:31 +09:00
Dmitry V. Levin
5cfc190520 nspawn,shared: cleanup use of ERRNO_IS_SECCOMP_FATAL()
Given that ERRNO_IS_SECCOMP_FATAL() also matches positive values,
make sure this macro is not called with arguments that do not have
errno semantics.

In this case the arguments passed to ERRNO_IS_SECCOMP_FATAL() are the
values returned by external libseccomp function seccomp_load() which is
not expected to return any positive values, but let's be consistent
anyway and move ERRNO_IS_SECCOMP_FATAL() invocations to the branches
where the return values are known to be negative.
2023-07-28 12:28:35 +00:00
Dmitry V. Levin
92a702b114 nspawn: cleanup use of ERRNO_IS_NOT_SUPPORTED()
Given that ERRNO_IS_NOT_SUPPORTED() also matches positive values,
make sure this macro is not called with arguments that do not have
errno semantics.

In this case the argument passed to ERRNO_IS_NOT_SUPPORTED() is the
value returned by remount_idmap() which is not expected to return
any positive values, but let's be consistent anyway and move the
ERRNO_IS_NOT_SUPPORTED() invocation to the branch where
the return value is known to be negative.
2023-07-28 12:28:35 +00:00
Zbigniew Jędrzejewski-Szmek
da89046643 tree-wide: "<n>bit" → "<n>-bit"
In some places, "<n> bits" is used when more appropriate.
2023-07-02 11:10:12 +01:00
Lennart Poettering
8c3fe1b5b5 process-util: add simple wrapper around PR_SET_CHILD_SUBREAPER
Let's a simple helper that knows how to deal with PID == 1.
2023-06-23 10:05:16 +02:00
Lennart Poettering
19b761a097 tree-wide: getpid() → getpid_cached()
This doesn't really matter, but let's be systematic and prefer
getpid_cached() in our codebase.
2023-06-22 17:07:59 -06:00