systemd

mirror of https://github.com/morgan9e/systemd synced 2026-04-15 00:47:10 +09:00

Author	SHA1	Message	Date
gerblesh	bbec1c87d3	sysext: set SELinux context for hierarchies and workdir	2024-11-26 17:47:32 +00:00
Zbigniew Jędrzejewski-Szmek	d293fade24	Check inode number to see if we are in init namespace (#35306) This is a more comprehensive fix compared to #35273. Also adds only a minimal test. Based on Luca's #35273 but generalizes the code a bit. In v258 we really should get rid of the old heuristics around userns and cgroupns detection, but given we are late in the v257 cycle this keeps them in.	2024-11-25 14:13:36 +01:00
Yu Watanabe	56c761f8c6	namespace-util: handle -ENOSPC by userns_acquire() gracefully in is_idmapping_supported() (#35313) Follow-up for `edae62120f`. Fixes #35311.	2024-11-23 17:32:23 +09:00
Yu Watanabe	3dda236c5c	basic/linux: update kernel headers from v6.12	2024-11-23 17:31:12 +09:00
Lennart Poettering	a2429f507c	virt: make use of ns inode check in running_in_userns() and running_in_cgroupns() too	2024-11-23 00:14:20 +01:00
Luca Boccassi	193bf42ab0	detect-virt: check the inode number of the pid namespace The inode number of the root pid namespace is hardcoded in the kernel to 0xEFFFFFFC since 3.8, so check the inode number of our pid namespace if all else fails. If it's not 0xEFFFFFFC then we are in a pid namespace, hence a container environment. Fixes https://github.com/systemd/systemd/issues/35249 [Reworked by Lennart, to make use of namespace_is_init()]	2024-11-23 00:14:20 +01:00
Lennart Poettering	18ead2b03d	namespace-util: add generic namespace_is_init() call	2024-11-23 00:14:20 +01:00
Yu Watanabe	2994ca354b	namespace-util: update log messages	2024-11-23 06:52:48 +09:00
Yu Watanabe	eb14b993bb	namespace-util: handle -ENOSPC by userns_acquire() gracefully in is_idmapping_supported() Follow-up for `edae62120f`. Fixes #35311.	2024-11-23 06:52:38 +09:00
Luca Boccassi	b7eefa1996	cgroup-util: fix memory leak on error CID#1565824 Follow-up for `f6793bbcf0`	2024-11-21 14:02:34 +09:00
Lennart Poettering	f6793bbcf0	killall: gracefully handle processes inserted into containers via nsenter -a "nsenter -a" doesn't migrate the specified process into the target cgroup (it really should). Thus the process will remain in a cgroup that is (due to cgroup ns) outside our visibility. The kernel will report the cgroup path of such cgroups as starting with "/../". Detect that and print a reasonable error message instead of trying to resolve that.	2024-11-20 18:11:38 +00:00
Mike Yuan	f87863a8ff	process-util: refuse to operate on remote PidRef Follow-up for `7e3e540b88`	2024-11-20 18:10:26 +00:00
Mike Yuan	eea9d3eb10	basic/user-util: split out placeholder suppression from USER_CREDS_CLEAN into its own flag No functional change, preparation for later commits.	2024-11-19 00:38:18 +01:00
Mike Yuan	579ce77ead	basic/user-util: introduce shell_is_placeholder() helper	2024-11-19 00:38:18 +01:00
Mike Yuan	c8590ad60d	process-util: refuse FORK_DETACH + FORK_DEATHSIG_* There's no synchronization between the intermediate process and the double-forked child, and the semantics are not useful. Refuse such a combination.	2024-11-14 12:22:15 +00:00
Lennart Poettering	9466fe014f	namespace-util: pin pid via pidfd during namespace_open()	2024-11-13 14:18:05 +00:00
Yu Watanabe	d762b14e38	audit-util: return -ENODATA from audit_{session\|loginuid}_from_pid() if invoked in a container (#35072) The auditing subsystem is still not virtualized for containers, hence the two values don't really make sense inside them, they will just leak information from outside into the container. Hence don't make use of the data if we detect we are run inside of a container. This has visible effects: logind will no longer try to reuse the auditing session ids as its own session ids when run inside a container. While we are at it, modernize the calls in more ways: 1. switch to pidref behaviour, all but one of our uses are using pidref anyway already. 2. use read_virtual_file() + proc_mounted() 3. reasonably distinguish ENOENT errors when reading the process proc files: distinguish the case where /proc is not mounted, from the case where the process is already gone, from where auditing is not enabled in the kernel build.	2024-11-13 10:08:29 +09:00
Lennart Poettering	c892816ceb	run0: when changing privileges to non-root, do not show superhero emoji Let's show an idcard logo instead, to indicate that we changed ids.	2024-11-12 23:09:21 +01:00
Lennart Poettering	7bf0149e9b	process-util: more gracefully handle oom adjust parsing/setting Who knows what kind of mount shenanigans people employ, let's gracefully handle parse failures of proc files, like we always do otherwise.	2024-11-12 23:03:40 +01:00
Lennart Poettering	68c554f23a	audit-util: modernize use_audit() a bit Use ERRNO_IS_xyz() macros where appropriate. Also, reduce indentation a bit by inverted early check. And log in more error codepaths.	2024-11-12 23:03:40 +01:00
Lennart Poettering	7e02ee98d8	audit-util: return -ENODATA from audit_{session\|loginuid}_from_pid() if invoked in a container The auditing subsystem is still not virtualized for containers, hence the two values don't really make sense inside them, they will just leak information from outside into the container. Hence don't make use of the data if we detect we are run inside of a container. This has visible effects: logind will no longer try to reuse the auditing session ids as its own session ids when run inside a container. While we are at it, modernize the calls in more ways: 1. switch to pidref behaviour, all but one of our uses are using pidref anyway already. 2. use read_virtual_file() + proc_mounted() 3. reasonably distinguish ENOENT errors when reading the process proc files: distinguish the case where /proc is not mounted, from the case where the process is already gone, from where auditing is not enabled in the kernel build.	2024-11-12 23:03:03 +01:00
Lennart Poettering	56933f2073	uid-classification: properly classify all container UIDs A bit confusingly CONTAINER_UID_BASE_MAX is just the maximum base UID for a container. Thus, with the usual 64K UID assignments, the last actual container UID is CONTAINER_UID_BASE_MAX+0xFFFF. To make this less confusing define CONTAINER_UID_MIN/MAX that add the missing extra space. Also adjust two uses where this was mishandled so far, due to this confusion. With this change the UID ranges we default to should properly match what is documented on https://systemd.io/UIDS-GIDS/.	2024-11-08 23:18:39 +00:00
Lennart Poettering	af3baf174a	fs-util: add comment about XO_NOCOW	2024-11-08 09:21:25 +01:00
Ivan Kruglov	a567de392d	process-util: introduce report_errno_and_exit() as part of src/basic/process-util.{h,c}	2024-11-06 11:18:38 +01:00
Andres Beltran	f348831d27	namespace-util: make idmapping not supported if syscalls return EPERM	2024-11-06 09:27:33 +01:00
Zbigniew Jędrzejewski-Szmek	2257be13fe	tree-wide: time-out → timeout For justification, see `3f9a0a522f`.	2024-11-05 19:32:19 +00:00
Luca Boccassi	7af37f3a90	Add PrivatePIDs= (continued) (#34940)	2024-11-05 18:42:28 +00:00
Daan De Meyer	406f177501	core: Introduce PrivatePIDs= This new setting allows unsharing the pid namespace in a unit. Because you have to fork to get a process into a pid namespace, we fork in systemd-executor to get into the new pid namespace. The parent then sends the pid of the child process back to the manager and exits while the child process continues on with the rest of exec_invoke() and then executes the actual payload. Communicating the child pid is done via a new pidref socket pair that is set up on manager startup. We unshare the PID namespace right before the mount namespace so we mount procfs correctly. Note PrivatePIDs=yes always implies MountAPIVFS=yes to mount procfs. When running unprivileged in a user session, user namespace is set up first to allow for PID namespace to be unshared. However, when running in privileged mode, we unshare the user namespace last to ensure the user namespace does not own the PID namespace and cannot break out of the sandbox. Note we disallow Type=forking services from using PrivatePIDs=yes since the init process inside the PID namespace must not exit for other processes in the namespace to exist. Note Daan De Meyer did the original work for this commit with Ryan Wilson addressing follow-ups. Co-authored-by: Daan De Meyer <daan.j.demeyer@gmail.com>	2024-11-05 05:32:02 -08:00
Lennart Poettering	cb42df5310	sd-daemon: add fd array size safety check to sd_notify_with_fds() The previous commit removed the UINT_MAX check for the fd array. Let's now re-add one, but at a better place, and with a more useful limit. As it turns out the kernel does not allow passing more than 253 fds at the same time, hence use that as limit. And do so immediately before calculating the control buffer size, so that we catch multiplication overflows.	2024-11-04 12:10:09 +01:00
Daan De Meyer	a07864a4fe	bootctl: Add --secure-boot-auto-enroll When specified, bootctl install will also set up secure boot auto-enrollment. For now, we sign all variables using the same certificate and key pair.	2024-11-03 10:46:17 +01:00
Daan De Meyer	d5c12da904	efivars: Remove STRINGIFY() helper macros The names of these conflict with macros from efi.h that we'll move to efi-fundamental.h in a later commit. Let's avoid the conflict by getting rid of these helpers. Arguably this also improves readability by clearly indicating we're passing arbitrary strings and not constants to the macros when we invoke them.	2024-11-02 23:20:57 +01:00
Andres Beltran	edae62120f	namespace-util: add util function to check if id-mapped mounts are supported for a given path	2024-11-01 18:41:27 +00:00
Luca Boccassi	fdccba15be	util-lib/systemd-run: implement race-free PTY peer opening (#34953) This makes use of the new TIOCGPTPEER pty ioctl() for directly opening a PTY peer, without going via path names. This is nice because it closes a race around allocating and opening the peer. And also has the nice benefit that if we acquired an fd originating from some other namespace/container, we can directly derive the peer fd from it, without having to reenter the namespace again.	2024-11-01 11:29:19 +00:00
Luca Boccassi	d86e9b64e4	tweaks to ANSI sequence (OSC) handling (#34964) Fixes: #34604 Prompted by that I realized we do not correctly recognize both "ST" sequences we want to recognize, fix that.	2024-11-01 11:18:57 +00:00
Lennart Poettering	0e3e075b56	iovw: normalize destructors instead of passing a boolean picking the destruction method just have different functions. That's much nicer in context of _cleanup_, and how we usually do things.	2024-10-31 23:08:11 +01:00
Lennart Poettering	811aa36ab6	iovw: add simpler iovw_done() destructor	2024-10-31 23:08:11 +01:00
Lennart Poettering	2865561eaa	coredump: move to _cleanup_ for destroying iovw object	2024-10-31 23:08:11 +01:00
Lennart Poettering	960b045875	coredump: parse signal number at the same time as parsing other fields	2024-10-31 23:08:11 +01:00
Lennart Poettering	5ca96e2717	machine: several follow-ups for recent change (#34882) Follow-ups for #34761.	2024-10-31 21:43:18 +01:00
Mike Gilbert	ff94426f8a	posix_spawn_wrapper: do not set POSIX_SPAWN_SETSIGDEF flag Setting this flag is a noop without a corresponding call to posix_spawnattr_setsigdefault. If we call posix_spawnattr_setsigdefault with a full signal set, it causes glibc's posix_spawn implementation to call sigaction 63 times, once for each signal. That seems wasteful. This feature is really only useful for signals which have their disposition set to SIG_IGN. Otherwise the disposition gets set to SIG_DFL automatically, either by clone(CLONE_CLEAR_SIGHAND) or the subsequent execve. As far as I can tell, systemd does not have any signals set to SIG_IGN under normal operating conditions.	2024-10-31 18:16:58 +01:00
Lennart Poettering	a39c51799b	string-util: also check for 0x1b 0x5c ST when stripping ANSI from strings	2024-10-31 11:38:18 +01:00
Lennart Poettering	0367424786	terminal-util: define ANSI_OSC as macro for the OSC terminal sequence prefix	2024-10-31 11:38:18 +01:00
Lennart Poettering	b8311af810	tree-wide: prefer generating 0x1B 0x5C as ANSI sequence "ST" OSC sequences can be closed with one of three terminators: 1. ASCII code 7, aka BEL, aka ^G, aka \x07, aka \a 2. ASCII code 156, aka \x9c 3. Pair of ASCII code 27 followed by ASCII code 92, aka \x1b\x5c Of these, in some corner case scenarios BEL causes problems (see #34604). Hence switch away from that wherever we use it, and prefer the \x1b\x5c instead. That's preferable over \x9c, since the latter is also a valid UTF-8 codepoint. See discussion here for example: https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda#the-escape-sequence Fixes: #34604	2024-10-31 11:38:08 +01:00
Lennart Poettering	e65b0904a0	string-util: it's called OSC sequence, not CSO sequence	2024-10-31 11:28:57 +01:00
Yu Watanabe	7633001cdd	env-util: introduce strv_env_get_merged()	2024-10-31 11:02:35 +09:00
Yu Watanabe	e4d477efc6	env-util: replace 'char ' with 'char'	2024-10-31 11:02:35 +09:00
Lennart Poettering	fc9dc71a3f	terminal-util: add pty_open_peer() helper This opens a pty peer in one go, and uses the new race-free TIOCGPTPEER ioctl() to do so – if it is available.	2024-10-30 22:37:44 +01:00
Lennart Poettering	fbd2679f66	terminal-util: various minor modernizations Various fixes: 1. Adds O_CLOEXEC for two socketpair()s where we forgot it. 2. Uses FORK_WAIT instead of manual wait_for_terminate_and_check() invocations. 3. Prefix opaque NULL/0 arguments with comments explaining what they are. 4. Add a bunch of assert()s, and change flag validation in open_terminal() to be assert() (since flags mistakes are programming errors, not runtime errors).	2024-10-30 22:15:56 +01:00
Yu Watanabe	f7804c1aa2	basic/missing: add short comment about when CLONE_NEWCGROUP is added	2024-10-26 13:59:19 +09:00
Integral	ddb8a639d5	tree-wide: replace for loop with FOREACH_ELEMENT or FOREACH_ARRAY macros (#34893)	2024-10-26 07:10:22 +09:00
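
The pid namespace check described in commit 193bf42ab0 can be illustrated with a minimal Python sketch. It is not the systemd implementation (which is C and goes through namespace_is_init()); the function names here are hypothetical, and the only facts taken from the commit message are the constant 0xEFFFFFFC and the idea of comparing it with the inode number of our own pid namespace.

```python
import os

# Inode number of the initial (root) PID namespace. Per the commit message,
# the kernel has hardcoded this value since Linux 3.8.
INIT_PID_NS_INODE = 0xEFFFFFFC


def inode_is_init_pid_ns(inode: int) -> bool:
    """Return True if the inode number is that of the initial PID namespace."""
    return inode == INIT_PID_NS_INODE


def running_in_pid_namespace(path: str = "/proc/self/ns/pid"):
    """Return True/False, or None if the namespace link cannot be inspected.

    Hypothetical helper: os.stat() follows the /proc/self/ns/pid link, so
    st_ino is the inode number of the namespace itself.
    """
    try:
        inode = os.stat(path).st_ino
    except OSError:
        # /proc not mounted, or not a Linux system: no answer either way.
        return None
    return not inode_is_init_pid_ns(inode)
```

Returning None when /proc cannot be read keeps "unknown" separate from "not in a namespace", which matches the commit's framing of this check as a last resort when all else fails.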

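The three OSC terminators listed in commit b8311af810 can be sketched the same way. This is an illustrative Python example with hypothetical helper names, not systemd's string-util code: it emits OSC sequences closed with the preferred 0x1B 0x5C form and strips sequences closed with any of the three terminators. It assumes already-decoded text, where \x9c is a single code point.

```python
import re

OSC = "\x1b]"        # OSC prefix: ESC followed by ']'
BEL = "\x07"         # terminator 1: BEL
ST_8BIT = "\x9c"     # terminator 2: single-character ST
ST_7BIT = "\x1b\\"   # terminator 3: ESC followed by backslash (0x1B 0x5C)

# An OSC sequence: the prefix, a payload containing none of the terminator
# start characters, then exactly one of the three terminators.
_OSC_RE = re.compile(r"\x1b\][^\x07\x1b\x9c]*(?:\x07|\x1b\\|\x9c)")


def osc_wrap(payload: str) -> str:
    """Build an OSC sequence closed with the preferred 0x1B 0x5C terminator."""
    return f"{OSC}{payload}{ST_7BIT}"


def strip_osc(text: str) -> str:
    """Remove complete OSC sequences; unterminated ones are left untouched."""
    return _OSC_RE.sub("", text)
```

For example, strip_osc("a" + osc_wrap("0;title") + "b") yields "ab", and the same holds if the sequence is closed with BEL or \x9c instead.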
6446 Commits