systemd

mirror of https://github.com/morgan9e/systemd synced 2026-04-14 16:37:19 +09:00

Author	SHA1	Message	Date
Mike Yuan	3ddbc34e15	process-util: refuse FORK_WAIT + FORK_FREEZE combination	2025-02-21 21:35:05 +00:00
Daan De Meyer	dc2f960b78	process-util: Allow setting ret_pid with FORK_DETACH in safe_fork() Let's allow getting the pid even if the caller sets FORK_DETACH. We do this via a socketpair() over which we send the inner child pid.	2025-02-20 21:00:52 +01:00
Daan De Meyer	f48103ea61	process-util: Implement safe_fork_full() on top of pidref_safe_fork_full() Let's switch things around, and move the internals of safe_fork_full() into pidref_safe_fork_full() and make safe_fork_full() a trivial wrapper on top of pidref_safe_fork_full().	2025-02-20 20:13:53 +01:00
Lennart Poettering	4ace93da8c	pidref: now that we have the cached pidfdid of our own process, use it Note that this drops a lot of "const" qualifiers on PidRef arguments. That's because pidref_is_self() suddenly might end up changing the PidRef because it acquires the pidfd ID. We had this previously already with pidfd_equal(), but this amplifies the problem. I guess C's "const" doesn't really work for stuff that contains caches, that is just conceptually constant, but not actually.	2025-01-20 21:51:40 +01:00
Yu Watanabe	f0159e2b5b	process-util: fix typo Also rebreak comment. Follow-up for `03b89cf213`.	2025-01-19 04:24:08 +09:00
Lennart Poettering	277255e814	process-util: slightly update comment in freeze()	2025-01-16 11:55:21 +01:00
Lennart Poettering	d6267b9b18	process-util: port pid_from_same_root_fs() to pidref, and port three places over to it	2025-01-16 11:55:21 +01:00
Lennart Poettering	6eeeef9f66	process-util: introduce new FORK_FREEZE flag for safe_fork() Often we want to fork off a process that just hangs until we kill it, let's add a simple flag to create one of this type, and use it at various places.	2025-01-16 11:55:21 +01:00
Lennart Poettering	9ef559a036	tree-wide: drop support for kernels without pidfd_open() and pidfd_send_signal() (#35971)	2025-01-16 11:37:17 +01:00
Yu Watanabe	132a164d97	Follow-ups for recent namespace PRs (#35923)	2025-01-15 14:10:21 +09:00
Mike Yuan	e755cde735	process-util: depend on CLONE_PIDFD	2025-01-12 00:17:20 +01:00
Lennart Poettering	361327e929	convert more code to PidRef (#35895)	2025-01-11 23:14:33 +01:00
Mike Yuan	dfef02c675	process-util: drop duplicate assertions	2025-01-11 15:53:13 +01:00
Lennart Poettering	47e45ea738	process-util: make pidref_safe_fork_full() work with FORK_WAIT (This is useful for the test case added in the next commit, where it's kinda nice being able to use pidref_safe_fork_full() and acquiring a pidref of the child in the child in one go. There's no other value in this than a bit of syntactic sugar for that test. But otoh there's no good reason to prohibit FORK_WAIT use like this, hence either way, this commit should be a good thing.)	2025-01-10 14:14:17 +01:00
Lennart Poettering	9237a63a80	process-util: add new helper pidref_get_ppid_as_pidref()	2025-01-10 14:14:17 +01:00
Ivan Kruglov	03b89cf213	basic: fixes in read_errno() follow ups for https://github.com/systemd/systemd/pull/35880	2025-01-10 11:49:49 +01:00
Lennart Poettering	7893362508	process-util: do not unblock unrelated signals while forking This makes sure when we are blocking signals in preparation for fork() we'll not temporarily unblock any signals previously set, by mistake. It's safe for us to block more, but not to unblock signals already blocked. Fix that. Fixes: #35470	2025-01-10 16:10:31 +09:00
Ivan Kruglov	64db44f7fb	process-util: read_errno()	2025-01-09 10:47:24 +01:00
Lennart Poettering	9ed2725867	process-util: a process from a foreign pidns is definitely not our child Addresses: https://github.com/systemd/systemd/pull/35242#pullrequestreview-2531712318	2025-01-07 08:55:21 +01:00
Mike Yuan	223d455670	process-util: make pid_is_unwaited() wrapper around pidref version	2025-01-04 17:48:22 +01:00
Mike Yuan	47f64104d1	process-util: port pidref_get_uid() and pidref_is_my_child() to pidfd helpers	2025-01-04 17:48:22 +01:00
Mike Yuan	a33f691374	process-util: move namespace_get_leader() to namespace-util This allows us to drop the hack for recursive includes.	2025-01-04 17:08:00 +01:00
Mike Yuan	b234026d09	process-util: extract pidfd-related funcs into pidfd-util.[ch]	2025-01-04 16:58:13 +01:00
Lennart Poettering	25b1a73f71	journald: get rid of get_process_capeff(), use pidref_get_capability() instead This does pretty much the same, but is nicer, since it parses things properly.	2024-12-17 19:06:54 +01:00
Mike Yuan	61263e1436	process-util: make sure we don't report ppid == 0 Previously, if pid == 0 and we're PID 1, get_process_ppid() would set ret to getppid(), i.e. 0, which is inconsistent when pid is explicitly set to 1. Ensure we always handle such case by returning -EADDRNOTAVAIL.	2024-12-11 14:44:08 +01:00
Mike Yuan	07612aab66	process-util: use our usual tristate semantics for is_main_thread() While at it, _unlikely_ is dropped, as requested in https://github.com/systemd/systemd/pull/35242#discussion_r1880096233	2024-12-11 14:44:07 +01:00
Mike Yuan	f87863a8ff	process-util: refuse to operate on remote PidRef Follow-up for `7e3e540b88`	2024-11-20 18:10:26 +00:00
Mike Yuan	c8590ad60d	process-util: refuse FORK_DETACH + FORK_DEATHSIG_* There's no synchronization between the intermediate process and the double-forked child, and the semantics are not useful. Refuse such combination.	2024-11-14 12:22:15 +00:00
Lennart Poettering	7bf0149e9b	process-util: more gracefully handle oom adjust parsing/setting Who knows what kind of mount shenanigans people employ, let's gracefully handle parse failures of proc files, like we always do otherwise.	2024-11-12 23:03:40 +01:00
Ivan Kruglov	a567de392d	process-util: introduce report_errno_and_exit() as part of src/basic/process-util.{h,c}	2024-11-06 11:18:38 +01:00
Daan De Meyer	406f177501	core: Introduce PrivatePIDs= This new setting allows unsharing the pid namespace in a unit. Because you have to fork to get a process into a pid namespace, we fork in systemd-executor to get into the new pid namespace. The parent then sends the pid of the child process back to the manager and exits while the child process continues on with the rest of exec_invoke() and then executes the actual payload. Communicating the child pid is done via a new pidref socket pair that is set up on manager startup. We unshare the PID namespace right before the mount namespace so we mount procfs correctly. Note PrivatePIDs=yes always implies MountAPIVFS=yes to mount procfs. When running unprivileged in a user session, user namespace is set up first to allow for PID namespace to be unshared. However, when running in privileged mode, we unshare the user namespace last to ensure the user namespace does not own the PID namespace and cannot break out of the sandbox. Note we disallow Type=forking services from using PrivatePIDs=yes since the init process inside the PID namespace must not exit for other processes in the namespace to exist. Note Daan De Meyer did the original work for this commit with Ryan Wilson addressing follow-ups. Co-authored-by: Daan De Meyer <daan.j.demeyer@gmail.com>	2024-11-05 05:32:02 -08:00
Mike Gilbert	ff94426f8a	posix_spawn_wrapper: do not set POSIX_SPAWN_SETSIGDEF flag Setting this flag is a noop without a corresponding call to posix_spawnattr_setsigdefault. If we call posix_spawnattr_setsigdefault with a full signal set, it causes glibc's posix_spawn implementation to call sigaction 63 times, once for each signal. That seems wasteful. This feature is really only useful for signals which have their disposition set to SIG_IGN. Otherwise the disposition gets set to SIG_DFL automatically, either by clone(CLONE_CLEAR_SIGHAND) or the subsequent execve. As far as I can tell, systemd does not have any signals set to SIG_IGN under normal operating conditions.	2024-10-31 18:16:58 +01:00
Mike Yuan	e06c5be29a	process-util: always retry with pidfd_spawn() w/o cgroup first Follow-up for `7ac58157ca` With the mentioned commit, iff E2BIG we'd retry pidfd_spawn() with POSIX_SPAWN_SETCGROUP disabled. However, the same strategy should actually apply to EOPNOTSUPP/ENOSYS/EPERM too - they can mean two things here: no clone3() or no CLONE_PIDFD. Therefore, let's first try clone() + CLONE_PIDFD, and fall further back to plain clone() (posix_spawn()) only as last resort. Plus, record the fact so that we don't unnecessarily retry every single time if CLONE_PIDFD is the one that's unavailable.	2024-08-21 15:27:57 +02:00
Mike Yuan	df99a8ef3d	process-util: check the flag instead of 'cgroup' param We might skip CLONE_INTO_CGROUP wholly if not supported.	2024-08-21 15:17:05 +02:00
Kornilios Kourtis	7ac58157ca	process-util: handle pidfd_spawn() returning E2BIG In some kernels (specifically, 5.4) even though the clone3 syscall is supported, setting CLONE_INTO_CGROUP is not. The error message returned in this case is E2BIG. If posix_spawn_wrapper encounters this error, it does not retry, and cannot spawn any programs in said kernels. This commit adds a check for the E2BIG error and retries pidfd_spawn() without the POSIX_SPAWN_SETCGROUP flag. If we encounter an E2BIG error, and the pidfd_spawn() succeeds after removing the POSIX_SPAWN_SETCGROUP flag, then we cache the result so that we do not retry every time. Originally, this issue was reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1077204. Signed-off-by: Kornilios Kourtis <kornilios@gmail.com>	2024-08-21 02:04:57 +09:00
Mike Yuan	f32538e1cc	basic/process-util: modernize setpriority_closest() Before this commit, the "Cannot raise nice level" branch is rather confusing, as we're actually lowering the nice. Also, it's better to log about the final nice value for both cases, no matter whether we need to set to limit or not.	2024-08-18 15:16:03 +02:00
Mike Yuan	8dc303d3c8	process-util: modernize pidfd_get_pid()	2024-07-21 22:48:53 +02:00
Mike Yuan	6fb97a85c7	process-util: make pid*_get_start_time return usec_t	2024-05-22 18:47:16 +08:00
Mike Yuan	58ff2f1e38	core/execute: also check cg_is_threaded for clone3() Prompted by #32259 We already have this check in exec_invoke(), i.e. child. But if CLONE_INTO_CGROUP is used, the failure would occur on parent's side, so do the check there too.	2024-04-14 23:22:13 +08:00
Zbigniew Jędrzejewski-Szmek	418b936d47	various: use strdup_to() after getenv()	2024-03-20 15:18:21 +01:00
Lennart Poettering	234bdd9c99	process-util: use proc_mounted() check at one more place	2024-02-21 09:25:46 +01:00
Adrian Vovk	85f660d46b	fd-util: Expose helper to pack fds into 3,4,5,... This is useful for situations where an array of FDs is to be passed into a child process (i.e. by passing it through safe_fork). This function can be called in the child (before calling exec) to pack the FDs to all be next to each-other starting from SD_LISTEN_FDS_START (i.e. 3)	2024-02-19 11:18:11 +00:00
Frantisek Sumsal	3dc51ab2cf	process-util: use only the least significant byte from personality() The personality() syscall returns a 32-bit value where the top three bytes are reserved for flags that emulate historical or architectural quirks, and only the least significant byte reflects the actual personality we're interested in (in opinionated_personality()). Use the newly defined mask in the corresponding test as well, otherwise the test fails on some more "exotic" architectures that set some of the "quirk" flags: ~# uname -m armv7l ~# build/test-seccomp ... /* test_lock_personality */ current personality=0x0 safe_personality(PERSONALITY_INVALID)=0x800000 Assertion '(unsigned long) safe_personality(current) == current' failed at src/test/test-seccomp.c:970, function test_lock_personality(). Aborting. lockpersonalityseccomp terminated by signal ABRT. Assertion 'wait_for_terminate_and_check("lockpersonalityseccomp", pid, WAIT_LOG) == EXIT_SUCCESS' failed at src/test/test-seccomp.c:996, function test_lock_personality(). Aborting. Aborted (core dumped) See: personality(2) and comments in sys/personality.h	2024-02-07 19:29:53 +01:00
Mike Yuan	c90335403c	process-util: minor follow-up for pidfd_spawn	2024-02-06 12:26:38 +00:00
Luca Boccassi	2e106312e2	core: add support for pidfd_spawn Added in glibc 2.39, allows cloning into a cgroup and to get a pid fd back instead of a pid. Removes race conditions for both changing cgroups and getting a reliable reference for the child process. Fixes https://github.com/systemd/systemd/pull/18843 Replaces https://github.com/systemd/systemd/pull/16706	2024-02-05 21:52:36 +00:00
Luca Boccassi	9ca13d60db	executor: really set POSIX_SPAWN_SETSIGDEF for posix_spawn posix_spawnattr_setflags() doesn't OR the input to the current set of flags, it overwrites them, so we are currently losing POSIX_SPAWN_SETSIGDEF. Follow-up for: `6ecdfe7d10`	2024-02-05 16:26:01 +00:00
Luca Boccassi	556d2bc4a1	core: use PidRef in exec_spawn	2024-02-01 21:06:14 +00:00
Yu Watanabe	387f39ea30	process-util: introduce FORK_NEW_NETNS for safe_fork() Similar to FORK_NEW_MOUNTNS or FORK_NEW_USERNS.	2024-01-19 15:06:08 +09:00
Rose	aa9ff6c28d	tree-wide: replace string functions with fundamental functions	2024-01-11 13:36:25 +09:00
Lennart Poettering	3b1e80f7cb	process-util: turn off O_NONBLOCK on stdio fds when rearranging fds We often create our fds O_NONBLOCK, but when we want to invoke some program with them as stdin/stdout/stderr we really should turn it off again.	2024-01-08 23:23:42 +01:00
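
Commit 7893362508 in the list above notes that, when preparing for fork(), it is safe to block more signals but not to unblock ones that were already blocked. The following is a minimal sketch of that idea in plain POSIX C, not systemd's actual code: the helper names `block_additional_signals()`, `restore_signal_mask()` and `signal_is_blocked()` are hypothetical and chosen only for illustration. It assumes a single-threaded caller; a multi-threaded program would likely use `pthread_sigmask()` instead.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): add the signals in 'extra' to the
 * blocked set and store the previous mask in 'saved'. SIG_BLOCK only adds to
 * the mask, so signals that were blocked before the call stay blocked.
 * Using SIG_SETMASK with 'extra' here would instead unblock everything that
 * is not listed in 'extra'. Returns 0 on success, -1 on error. */
int block_additional_signals(const sigset_t *extra, sigset_t *saved) {
        assert(extra);
        assert(saved);

        return sigprocmask(SIG_BLOCK, extra, saved);
}

/* Hypothetical helper: put back the mask saved by block_additional_signals(). */
int restore_signal_mask(const sigset_t *saved) {
        assert(saved);

        return sigprocmask(SIG_SETMASK, saved, NULL);
}

/* Hypothetical helper: returns 1 if 'sig' is currently blocked, 0 if it is
 * not, and -1 on error. With a NULL set, sigprocmask() only queries. */
int signal_is_blocked(int sig) {
        sigset_t current;

        if (sigprocmask(SIG_BLOCK, NULL, &current) < 0)
                return -1;

        return sigismember(&current, sig);
}
```

A caller would save the mask with `block_additional_signals()` before fork() and call `restore_signal_mask()` afterwards, so the set of blocked signals only ever grows during the critical section and returns to exactly what it was before.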

300 Commits