Commit Graph

79083 Commits

Author SHA1 Message Date
Daan De Meyer
798b9fb7eb HACKING: Move OBS section further down
HACKING.md should first and foremost tell someone how to hack on
systemd, installing packages from OBS isn't the most likely section
a new contributor will be interested in, so let's move it further
down.
2025-01-24 17:28:15 +01:00
Lennart Poettering
b3f5881c61 homectl: minor man page improvements (#36148) 2025-01-24 15:32:27 +01:00
Lennart Poettering
b0c38476eb homectl: move --umask=/--access-mode= help/man sections
These don't really have much to do with resource mgmt, but are more
about security, hence let's move them away from the resource mgmt
section.
2025-01-24 14:54:37 +01:00
Lennart Poettering
cd3730524c man: add some sections to homectl man page
This adds the same sections we already have in the homectl --help blurb
also to the man page.

While we are at it, let's also add a new section for Authentication
related switches.
2025-01-24 14:54:27 +01:00
Luca Boccassi
921ead1749 mkosi: update debian commit reference
* 4447d2974d Update changelog for 257.2-3 release
* 4b1c65b905 libudev1: add udeb back to shlibs
* 1974e3d06e systemd-boot: always check that the boot entry is set, even with Shim is already installed
* 9a5eea9823 systemd-boot: use boot entry argument instead of installing as grub.efi on ESP
* df6efeed46 libsystemd-dev/libudev-dev: depend on libcap-dev
* 5673b771e1 signing template: add override for executable-not-elf-or-script
* 3f109637c4 Update changelog for 257.2-2 release
* 42f4afa605 Drop udeb packages
* c04f7f2b16 signing template: always set urgency to 'high'
* 9bd8b5228b Set SBAT info for upstream build
* 257ba8563b udev: link to libsystemd-shared when building with noudeb profile
* 8ca2b26678 Link systemctl against libsystemd-shared
* 1a4a8af0c2 Install jq for pkg.systemd.upstream too since the template packages are now built
* 6fd0d2698d signing template: fix Lintian warnings and errors
* c79d10bbaa Build template packages for pkg.systemd.upstream profile, for OBS builds
* 485a867438 d/t/upstream: take into account autopkgtest pinning
* c1b6e565e3 Update README.source in the signing-template
* 17d1b92d9f d/t/control: remove 'flaky' from tests-in-lxd
* 2a36f6f5e1 Do not install sd-resolved and drop breaks-testbed from fast tests
* a3cb52f8d0 Enable UEFI on loong64
* ad7a943023 Enable libseccomp on loong64 and hppa
* 9d24f84ed5 Update changelog for 257.2-1 release
* f47619c9f4 Drop all patches, merged upstream
* d4aa6545a6 Install new files for upstream CI
* 5775daa46e d/rules: support building in OBS from git
2025-01-24 22:14:41 +09:00
Yu Watanabe
6430761685 sd-device: fix typo
Follow-up for 8d89667aba.
2025-01-24 22:13:03 +09:00
Yu Watanabe
3fa12d2cab mntfsd: fix typo
Follow-up for d6f8e1ae87.
2025-01-24 22:12:04 +09:00
Yu Watanabe
1970bf7448 pam_systemd: fix typo
Follow-up for 30de569174.
2025-01-24 22:10:35 +09:00
Yu Watanabe
c1e0518064 strv: fix typo
Follow-up for 5072f4268b.
2025-01-24 22:08:56 +09:00
Lennart Poettering
d6b008b01e Enforce per-user quota on /tmp/ and /dev/shm/ as user logs in (#36010)
There's finally quota on tmpfs, hence let's use it to make it harder for
users to DoS the system by consuming all disk space in /tmp/ and
/dev/shm/.

This enforces a default limit of 80% quota of the backing fs for these
two dirs for users, but this can be overridden in the user record, if
desired.

This also adds two other interesting features:

1. mount units gain GracefulOptions= which takes optional mount options
that are added only if supported by the kernel. (this is used to enable
usrquota on /tmp/, if available.)
2. The PAM logic in service management now supports reading passwords
from service credentials and via the askpw logic. This is used to make
testing easy (so that we can run0 into a homed user which strictly
requires a password).
2025-01-24 12:52:27 +01:00
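
The GracefulOptions= behaviour described in commit d6b008b01e above ("added only if supported by the kernel") can be modelled with a small Python sketch. This only illustrates the stated semantics: the function name `effective_mount_options` and the `kernel_supports` callback are hypothetical, and the commit message does not say how systemd actually probes for kernel support.

```python
from typing import Callable, Iterable


def effective_mount_options(
    options: Iterable[str],
    graceful_options: Iterable[str],
    kernel_supports: Callable[[str], bool],
) -> str:
    """Build the final mount option string.

    Regular options are always passed through. Graceful options are
    appended only when the (hypothetical) kernel_supports() probe
    reports that the running kernel understands them.
    """
    result = list(options)
    result += [opt for opt in graceful_options if kernel_supports(opt)]
    return ",".join(result)


# Example: usrquota is only added when the kernel's tmpfs supports it.
with_quota = effective_mount_options(
    ["mode=1777", "nosuid"], ["usrquota"], lambda opt: opt == "usrquota"
)
without_quota = effective_mount_options(
    ["mode=1777", "nosuid"], ["usrquota"], lambda opt: False
)
```

With this model, an older kernel without tmpfs quota support still gets a working /tmp/ mount; it simply lacks the quota option.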
Daan De Meyer
8dab59e610 mkosi: Drop usage of _systemd_QUIET in arch build script
We dropped the variable in the packaging specs for Arch to keep the
integration points as minimal as possible so let's stop using it in
the build script as well.
2025-01-24 12:08:27 +01:00
Luca Boccassi
3f9539a97f test: split VM-only subtests from TEST-74-AUX-UTILS to new VM-only test
TEST-74-AUX-UTILS covers many subtests, as it's a catch-all job, and a few
need a VM to run. The job is thus marked VM-only. But that means in settings
where we can't run VM tests (no KVM available), the entire thing is skipped,
losing tons of coverage that doesn't need skipping.

Move the VM-only subtests to TEST-87-AUX-UTILS-VM that is configured to only
run in VMs under both runners. This way we keep the existing tests as-is, and
we can add new VM-only tests without worrying. This is how the rest of the
tests are organized.

Follow-up for f4faac2073
2025-01-24 08:37:51 +01:00
Lennart Poettering
2635b5dc4a nspawn: support unpriv directory-tree containers (#35685)
So far nspawn supported unpriv containers only if backed by a DDI. This
adds dir-based unpriv containers too.

To make this work this introduces a new UID concept to systemd: the
"foreign UID range". This is a high UID range of size 64K. The idea is
that disk images that are "foreign" to the local system can use that,
and when a container or similar is invoked from it, a transiently
allocated dynamic UID range is mapped from that foreign UID range via id
mapped mounts.

This means the fully dynamic, transient UID ranges never hit the disk,
which should vastly simplify management, and does not require that uid
"subranges" are persistently delegated to any users.

The mountfsd daemon gained a new method call for acquiring an idmapped
mount fd for a mount tree owned by the foreign UID range. Access is
permitted to unpriv clients – as long as the referenced inode is located
within a dir owned by the client's own uid range.
2025-01-23 23:34:37 +01:00
Lennart Poettering
66ea74017c Three minor refactorings for userdb code (#36141)
Nothing earth shattering, but some minor refactorings split out of and
preparation for #36133
2025-01-23 23:04:48 +01:00
Lennart Poettering
0d9ee9220b sd-varlink/sd-json: add two new API calls (#36137)
These are kinda no-brainers, should have always existed. 

Split out of #36133 which needs them.
2025-01-23 22:40:19 +01:00
Lennart Poettering
0054b7dce9 update TODO 2025-01-23 22:36:39 +01:00
Lennart Poettering
d58d449fc6 test: add test case for tmpfs quota logic + PAMName= ask-password logic 2025-01-23 22:36:39 +01:00
Lennart Poettering
2b2aebf4dd homectl: add support for configuring tmpfs limits 2025-01-23 22:36:39 +01:00
Lennart Poettering
b1c95fb2e9 user-runtime-dir: enforce /tmp/ and /dev/shm/ quota
Enforce the quota on these two tmpfs at the same place where we mount
the per-user $XDG_RUNTIME_DIR. Conceptually these are very similar
concepts, and it makes sense to enforce the limits at the same place with
the same lifecycle.
2025-01-23 22:36:39 +01:00
Lennart Poettering
9ef12bc1d7 user-runtime-dir: some smaller modernizations/refactorings 2025-01-23 22:36:28 +01:00
Lennart Poettering
72b932aac0 user-record: add fields for setting limits on /tmp/ and /dev/shm/ 2025-01-23 22:16:24 +01:00
Lennart Poettering
d15811d7e5 devnum-util: add macros to safely convert dev_t to pointers and back
Sometimes it's nice being able to store dev_t as pointer values in
hashmaps/tables, instead of having to allocate memory for them and using
devt_hash_ops. After all dev_t is weird on Linux/glibc: glibc defines it
as a 64bit entity (which hence appears as something we cannot encode in a
pointer value for compat with 32bit archs) but it actually is 32bit in
the kernel apis. Hence we can safely cut off the upper 32bit, and still
retain compat with all archs.

But let's hide this in new macros, and validate this is all correct via
a test.
2025-01-23 22:16:24 +01:00
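
The arithmetic behind commit d15811d7e5 above can be sketched in Python. This is not the macro pair added by the commit (its names are not given in the message); all function names here are illustrative. The sketch assumes glibc's 64bit dev_t bit layout and the kernel's limits of 12 major bits and 20 minor bits, and shows that any device number within those limits fits into the lower 32 bits, so the upper half can be dropped without loss.

```python
MAJOR_BITS = 12  # assumption: Linux kernel major number width
MINOR_BITS = 20  # assumption: Linux kernel minor number width


def glibc_makedev(major: int, minor: int) -> int:
    """Encode major/minor following glibc's 64bit dev_t bit layout."""
    dev = (major & 0x00000FFF) << 8
    dev |= (major & 0xFFFFF000) << 32
    dev |= minor & 0x000000FF
    dev |= (minor & 0xFFFFFF00) << 12
    return dev


def glibc_major(dev: int) -> int:
    return ((dev & 0x00000000000FFF00) >> 8) | ((dev & 0xFFFFF00000000000) >> 32)


def glibc_minor(dev: int) -> int:
    return (dev & 0x00000000000000FF) | ((dev & 0x00000FFFFFF00000) >> 12)


def devnum_to_word32(dev: int) -> int:
    """Truncate a dev_t to 32 bits, refusing values the kernel cannot produce."""
    if glibc_major(dev) >= (1 << MAJOR_BITS) or glibc_minor(dev) >= (1 << MINOR_BITS):
        raise ValueError("device number exceeds kernel major/minor range")
    return dev & 0xFFFFFFFF


def word32_to_devnum(word: int) -> int:
    """Inverse of devnum_to_word32(): the upper 32 bits were always zero."""
    return word & 0xFFFFFFFF


sda1 = glibc_makedev(8, 1)
largest = glibc_makedev((1 << MAJOR_BITS) - 1, (1 << MINOR_BITS) - 1)
```

The largest kernel-valid device number encodes to exactly 0xFFFFFFFF, which is why a 32bit pointer value is wide enough on every architecture.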
Lennart Poettering
ab659a685e update TODO 2025-01-23 21:48:02 +01:00
Lennart Poettering
db5c4a4503 test: test comprehensive tests for new (and old) nspawn userns modes 2025-01-23 21:48:02 +01:00
Lennart Poettering
65664bba40 man: document new nspawn functionality around unpriv support 2025-01-23 21:48:02 +01:00
Lennart Poettering
46b7e96783 nspawn: add support for 'managed' userns mode even when we run privileged
So far, we supported two modes:

1. when running unpriv we'd get the mounts from mountfsd, and the userns
   from nsresourced
2. when running priv we'd do the mounts/userns ourselves

This untangles this a bit, so that we can also use mountfsd/nsresourced
when running privileged.

I think this is generally a bit nicer, and probably something we should
switch to entirely one day, as it reduces the variety of codepaths.

With this patch the default behaviour remains unchanged, but by
selecting the new "managed" option for --private-users= the codepaths
via mountfsd/nsresourced can be explicitly requested even when running
with privs.

This mostly just reworks the code so that we check for arg_userns_mode !=
USER_NAMESPACE_MANAGED rather than arg_privileged for a number of
codepaths, but requires more fixes, too. The devil is in the details.
2025-01-23 21:48:02 +01:00
Lennart Poettering
ca23deae09 nspawn: support foreign mappings also when nspawn doing the mapping itself
This adds a new "foreign" value to --private-users-ownership= which is a
lot like "map", but maps from the host's foreign UID range rather than from the
host's 0.

(This has nothing much to do with making unprivileged directory-based
containers work, it's just very handy that we can run privileged
containers with such a mapping too, with an easy switch)
2025-01-23 21:48:02 +01:00
Lennart Poettering
88252ca889 nspawn: allow to run unpriv from dir
This simply calls into mountfsd to acquire the root mount and uses it as
root for the container.

Note that this also makes one more change: previously we ran containers
directly off their backing directory. Except when we didn't, and there
were a variety of exceptions: if we had no privs, if we ran off a disk
image, if the directory was the host's root dir, and some others.

This simplifies the logic a bit: we now simply always create a temporary
directory in /tmp/ and bind mount everything there, in all code paths.
This simplifies our code a bit. After all, in order to control
propagation we need to turn the root into a mount point anyway, hence we
might just do it at one place for all cases.
2025-01-23 21:48:02 +01:00
Lennart Poettering
e57f99305e dissect-image: add client side API wrapper for MountDirectory() varlink call
This is simply a Varlink API client that takes a directory path and
userns fd and returns a mount fd.
2025-01-23 21:48:02 +01:00
Lennart Poettering
d6f8e1ae87 mntfsd: add api to mount dirs for containers
systemd-mountfsd so far provided a MountImage() API call for mounting a
disk image and returning a set of mount fds. This complements the API
with a new MountDirectory() API call, that operates on a directory
instead of an image file. Now, what makes this interesting is that it
applies an idmapping from the foreign UID range to the provided target
userns – in which case unprivileged operation is allowed (well,
under some conditions: in particular the client must own a parent dir of
the provided path).

This allows container managers to run fully unprivileged from
directories – as long as those directories are owned by the foreign UID
range. Basic operation is like this:

1. acquire a transient userns from systemd-nsresourced with 64K users
2. ask systemd-mountfsd for an idmapped mount of the container dir
   matching that userns
3. join the userns and bind the mount fd as root.

Note that we have to drop various sandboxing knobs from the mountfsd
service file for this to work, since the kernel's security checks that
try to ensure that an obstructed /proc/ cannot be circumvented via
mounting a new procfs will otherwise prohibit mountfsd from duplicating
the mounts properly.
2025-01-23 21:48:02 +01:00
Lennart Poettering
83eabe102a user-record: make a NULL UserDBMatch be equivalent to no filtering 2025-01-23 21:32:12 +01:00
Lennart Poettering
6a43f0a73c userdb: move setting of 'service' varlink parameter into userdb_connect()
We currently set this at two distinct places right before calling
userdb_connect(). Let's do this inside of userdb_connect() instead, and
derive it directly from the socket path.

This doesn't change behaviour but simplifies things a bit.
2025-01-23 21:32:12 +01:00
Lennart Poettering
45e587d822 userdbd: separate parameter structure of GetMemberships() varlink call from the GetUserRecord() one
The GetUserRecord() and GetMemberships() have quite different arguments,
hence let's use separate structures for both.

This makes sense on its own, since it makes the structures a bit
smaller, but is also preparation for a later commit that adds a bunch of
new fields to one of the structs but not the other.
2025-01-23 21:32:12 +01:00
Lennart Poettering
25c24619db sd-varlink: add sd_varlink_get_description() call 2025-01-23 21:28:02 +01:00
Lennart Poettering
b6a2df6307 sd-json: add new sd_json_variant_unset_field() call 2025-01-23 21:27:39 +01:00
Lennart Poettering
16ea491528 docs: mention the two other userdb services we ship these days 2025-01-23 21:13:41 +01:00
Yu Watanabe
544a67c8f7 udev-rules: check OWNER/GROUP= setting more strictly (#36123)
- refuses lines with unknown or invalid user/group,
- refuses non-system user/group in the setting.
2025-01-24 05:09:39 +09:00
Mike Yuan
0dc1716854 creds: permit interactive polkit auth when encrypting/decrypting through IPC 2025-01-24 05:08:12 +09:00
Mike Yuan
f3ba767d6c core/job: fix typo 2025-01-24 05:08:12 +09:00
Yu Watanabe
7e6786b7fb NEWS: mention OWNER=/GROUP= in udev rules now refuses non-system user/group 2025-01-24 02:33:18 +09:00
Yu Watanabe
02ec3dd4ef test: add test cases for OWNER=/GROUP= with non-system user/group 2025-01-24 02:33:18 +09:00
Yu Watanabe
f5cdf9515a udev-rules: ignore non-system user/group in OWNER=/GROUP=
Recently, we introduced the 'clock' system group, and set it for rtc/ptp
devices. See af96ccfc24.

However, if a non-system group with the same name already exists,
previously the devices were owned by the non-system group. That may
happen when updating systemd.

Let's avoid devices accidentally being owned by a non-system user/group.
2025-01-24 02:33:18 +09:00
Yu Watanabe
a1ee55e3c9 udev-rules: ignore OWNER=/GROUP= with unknown user/group
Previously, when an unknown or invalid user/group is specified,
a token was installed with UID_INVALID/GID_INVALID. That's not only
meaningless in most cases, but also clears previous assignment,
if multiple OWNER=/GROUP= tokens exist for the same device, e.g.

KERNEL=="sda", GROUP="disk"
KERNEL=="sda", GROUP="nonexistentuser"

With this change, when an unknown user/group is specified, the line is
ignored. Hence, in the above example, the device will be owned by the
group "disk".
2025-01-24 02:33:18 +09:00
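
The before/after behaviour described in commit a1ee55e3c9 above can be modelled with a short Python sketch. This is a simplified illustration of the semantics stated in the commit message, not udev's rule engine: the function `resolve_group` and its parameters are hypothetical, and `None` stands in for GID_INVALID.

```python
from typing import Iterable, Optional, Set


def resolve_group(
    group_assignments: Iterable[str],
    known_groups: Set[str],
    ignore_unknown: bool,
) -> Optional[str]:
    """Apply GROUP= assignments in rule order and return the final group.

    ignore_unknown=False models the old behaviour: an unknown group
    installs an invalid token that clears any earlier assignment.
    ignore_unknown=True models the new behaviour: the line is skipped.
    """
    result: Optional[str] = None
    for name in group_assignments:
        if name in known_groups:
            result = name
        elif not ignore_unknown:
            result = None  # stand-in for GID_INVALID overwriting the value
    return result


# The two rules from the commit message, applied to the device "sda".
rules = ["disk", "nonexistentuser"]
old_result = resolve_group(rules, {"disk"}, ignore_unknown=False)
new_result = resolve_group(rules, {"disk"}, ignore_unknown=True)
```

Under the old model the second rule wipes out the "disk" assignment; under the new model the device keeps the group "disk".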
Yu Watanabe
e89eaeb027 udev-rules: get_user_creds()/get_group_creds() return -ESRCH when user/group does not exist
This drops -ENOENT error check for get_user_creds()/get_group_creds(),
as nowadays they always return -ESRCH when the specified user/group
cannot be found.

This also adds short comments for NULL arguments.
2025-01-24 02:33:18 +09:00
Lennart Poettering
3e7910829e units: modprobe@.service tweaks (#36132) 2025-01-23 18:18:10 +01:00
Yu Watanabe
b7622cbab6 sd-device: chase sysattr and refuse to read/write outside of sysfs (#36004) 2025-01-24 01:58:19 +09:00
Yu Watanabe
e7fdc7644f udevadm: introduce cat command to show udev rules (#35893)
Closes #35818.
2025-01-24 01:49:42 +09:00
Lennart Poettering
71b6f718e2 units: don't load squasfs/erofs kmods explicitly
File system modules should be something the kernel can autoload
automatically, and according to my testing that works fine, hence let's
drop the explicit deps, in particular as systems usually stick to one fs
for these things, not both.

I asked bluca about the reason for adding it; he didn't remember
anymore, and was fine with me removing this. So let's remove this for
now, should issues arise we can revert this.
2025-01-23 16:29:28 +01:00
Lennart Poettering
6f69568cff units: mountfsd needs to pull DM and loop kmods
mountfsd is supposed to be available during early boot already, before
systemd-tmpfiles-setup-dev-early.service completes, hence make sure
loopback devices and DM already work before that.

As suggested by yuwata here:

https://github.com/systemd/systemd/pull/35685#issuecomment-2608157569
2025-01-23 16:29:22 +01:00
Lennart Poettering
9fc2126386 units: add a longer comment to modprobe@.service explaining when to use it 2025-01-23 16:29:20 +01:00