systemd

mirror of https://github.com/morgan9e/systemd synced 2026-04-14 16:37:19 +09:00

Author	SHA1	Message	Date
Mike Yuan	f87863a8ff	process-util: refuse to operate on remote PidRef Follow-up for `7e3e540b88`	2024-11-20 18:10:26 +00:00
Mike Yuan	c8590ad60d	process-util: refuse FORK_DETACH + FORK_DEATHSIG_* There's no synchronization between the intermediate process and the double-forked child, and the semantics are not useful. Refuse such combination.	2024-11-14 12:22:15 +00:00
Lennart Poettering	7bf0149e9b	process-util: more gracefully handle oom adjust parsing/setting Who knows what kind of mount shenanigans people employ, let's gracefully handle parse failures of proc files, like we always do otherwise.	2024-11-12 23:03:40 +01:00
Ivan Kruglov	a567de392d	process-util: introduce report_errno_and_exit() as part of src/basic/process-util.{h,c}	2024-11-06 11:18:38 +01:00
Daan De Meyer	406f177501	core: Introduce PrivatePIDs= This new setting allows unsharing the pid namespace in a unit. Because you have to fork to get a process into a pid namespace, we fork in systemd-executor to get into the new pid namespace. The parent then sends the pid of the child process back to the manager and exits while the child process continues on with the rest of exec_invoke() and then executes the actual payload. Communicating the child pid is done via a new pidref socket pair that is set up on manager startup. We unshare the PID namespace right before the mount namespace so we mount procfs correctly. Note PrivatePIDs=yes always implies MountAPIVFS=yes to mount procfs. When running unprivileged in a user session, user namespace is set up first to allow for PID namespace to be unshared. However, when running in privileged mode, we unshare the user namespace last to ensure the user namespace does not own the PID namespace and cannot break out of the sandbox. Note we disallow Type=forking services from using PrivatePIDs=yes since the init process inside the PID namespace must not exit for other processes in the namespace to exist. Note Daan De Meyer did the original work for this commit with Ryan Wilson addressing follow-ups. Co-authored-by: Daan De Meyer <daan.j.demeyer@gmail.com>	2024-11-05 05:32:02 -08:00
Mike Gilbert	ff94426f8a	posix_spawn_wrapper: do not set POSIX_SPAWN_SETSIGDEF flag Setting this flag is a noop without a corresponding call to posix_spawnattr_setsigdefault. If we call posix_spawnattr_setsigdefault with a full signal set, it causes glibc's posix_spawn implementation to call sigaction 63 times, once for each signal. That seems wasteful. This feature is really only useful for signals which have their disposition set to SIG_IGN. Otherwise the disposition gets set to SIG_DFL automatically, either by clone(CLONE_CLEAR_SIGHAND) or the subsequent execve. As far as I can tell, systemd does not have any signals set to SIG_IGN under normal operating conditions.	2024-10-31 18:16:58 +01:00
Mike Yuan	e06c5be29a	process-util: always retry with pidfd_spawn() w/o cgroup first Follow-up for `7ac58157ca` With the mentioned commit, iff E2BIG we'd retry pidfd_spawn() with POSIX_SPAWN_SETCGROUP disabled. However, the same strategy should actually apply to EOPNOTSUPP/ENOSYS/EPERM too - they can mean two things here: no clone3() or no CLONE_PIDFD. Therefore, let's first try clone() + CLONE_PIDFD, and fall further back to plain clone() (posix_spawn()) only as last resort. Plus, record the fact so that we don't unnecessarily retry every single time if CLONE_PIDFD is the one that's unavailable.	2024-08-21 15:27:57 +02:00
Mike Yuan	df99a8ef3d	process-util: check the flag instead of 'cgroup' param We might skip CLONE_INTO_CGROUP wholly if not supported.	2024-08-21 15:17:05 +02:00
Kornilios Kourtis	7ac58157ca	process-util: handle pidfd_spawn() returning E2BIG In some kernels (specifically, 5.4) even though the clone3 syscall is supported, setting CLONE_INTO_CGROUP is not. The error message returned in this case is E2BIG. If posix_spawn_wrapper encounters this error, it does not retry, and cannot spawn any programs in said kernels. This commit adds a check for the E2BIG error and retries pidfd_spawn() without the POSIX_SPAWN_SETCGROUP flag. If we encounter an E2BIG error, and the pidfd_spawn() succeeds after removing the POSIX_SPAWN_SETCGROUP flag, then we cache the result so that we do not retry every time. Originally, this issue was reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1077204. Signed-off-by: Kornilios Kourtis <kornilios@gmail.com>	2024-08-21 02:04:57 +09:00
Mike Yuan	f32538e1cc	basic/process-util: modernize setpriority_closest() Before this commit, the "Cannot raise nice level" branch is rather confusing, as we're actually lowering the nice. Also, it's better to log about the final nice value for both cases, no matter whether we need to set to limit or not.	2024-08-18 15:16:03 +02:00
Mike Yuan	8dc303d3c8	process-util: modernize pidfd_get_pid()	2024-07-21 22:48:53 +02:00
Mike Yuan	6fb97a85c7	process-util: make pid*_get_start_time return usec_t	2024-05-22 18:47:16 +08:00
Mike Yuan	58ff2f1e38	core/execute: also check cg_is_threaded for clone3() Prompted by #32259 We already have this check in exec_invoke(), i.e. child. But if CLONE_INTO_CGROUP is used, the failure would occur on parent's side, so do the check there too.	2024-04-14 23:22:13 +08:00
Zbigniew Jędrzejewski-Szmek	418b936d47	various: use strdup_to() after getenv()	2024-03-20 15:18:21 +01:00
Lennart Poettering	234bdd9c99	process-util: use proc_mounted() check at one more place	2024-02-21 09:25:46 +01:00
Adrian Vovk	85f660d46b	fd-util: Expose helper to pack fds into 3,4,5,... This is useful for situations where an array of FDs is to be passed into a child process (i.e. by passing it through safe_fork). This function can be called in the child (before calling exec) to pack the FDs to all be next to each-other starting from SD_LISTEN_FDS_START (i.e. 3)	2024-02-19 11:18:11 +00:00
Frantisek Sumsal	3dc51ab2cf	process-util: use only the least significant byte from personality() The personality() syscall returns a 32-bit value where the top three bytes are reserved for flags that emulate historical or architectural quirks, and only the least significant byte reflects the actual personality we're interested in (in opinionated_personality()). Use the newly defined mask in the corresponding test as well, otherwise the test fails on some more "exotic" architectures that set some of the "quirk" flags: ~# uname -m armv7l ~# build/test-seccomp ... /* test_lock_personality */ current personality=0x0 safe_personality(PERSONALITY_INVALID)=0x800000 Assertion '(unsigned long) safe_personality(current) == current' failed at src/test/test-seccomp.c:970, function test_lock_personality(). Aborting. lockpersonalityseccomp terminated by signal ABRT. Assertion 'wait_for_terminate_and_check("lockpersonalityseccomp", pid, WAIT_LOG) == EXIT_SUCCESS' failed at src/test/test-seccomp.c:996, function test_lock_personality(). Aborting. Aborted (core dumped) See: personality(2) and comments in sys/personality.h	2024-02-07 19:29:53 +01:00
Mike Yuan	c90335403c	process-util: minor follow-up for pidfd_spawn	2024-02-06 12:26:38 +00:00
Luca Boccassi	2e106312e2	core: add support for pidfd_spawn Added in glibc 2.39, allows cloning into a cgroup and to get a pid fd back instead of a pid. Removes race conditions for both changing cgroups and getting a reliable reference for the child process. Fixes https://github.com/systemd/systemd/pull/18843 Replaces https://github.com/systemd/systemd/pull/16706	2024-02-05 21:52:36 +00:00
Luca Boccassi	9ca13d60db	executor: really set POSIX_SPAWN_SETSIGDEF for posix_spawn posix_spawnattr_setflags() doesn't OR the input to the current set of flags, it overwrites them, so we are currently losing POSIX_SPAWN_SETSIGDEF. Follow-up for: `6ecdfe7d10`	2024-02-05 16:26:01 +00:00
Luca Boccassi	556d2bc4a1	core: use PidRef in exec_spawn	2024-02-01 21:06:14 +00:00
Yu Watanabe	387f39ea30	process-util: introduce FORK_NEW_NETNS for safe_fork() Similar to FORK_NEW_MOUNTNS or FORK_NEW_USERNS.	2024-01-19 15:06:08 +09:00
Rose	aa9ff6c28d	tree-wide: replace string functions with fundamental functions	2024-01-11 13:36:25 +09:00
Lennart Poettering	3b1e80f7cb	process-util: turn off O_NONBLOCK on stdio fds when rearranging fds We often create our fds O_NONBLOCK, but when we want to invoke some program with them as stdin/stdout/stderr we really should turn it off again.	2024-01-08 23:23:42 +01:00
Yu Watanabe	7903567cb7	Merge pull request #30610 from YHNdnzj/logind-serialize-pidref logind: serialize session leader pidfd to fdstore	2024-01-04 23:25:18 +09:00
Mike Yuan	faf0dd4b29	process-util: ensure pidref_is_alive only return ESRCH if not set	2024-01-04 16:19:20 +08:00
Lennart Poettering	3dee63b762	process-util: add new pid{ref,}_get_start_time() helper This also adds a test case that tests pidref_safe_fork(), pidref_wait() and related calls.	2024-01-02 17:57:34 +01:00
Lennart Poettering	f17132260f	process-util: add pidref_safe_fork() helper This combines safe_fork() with pidref_set_pid(). Eventually we really should switch this to use CLONE_PIDFD, but as that is not wrapped by glibc yet, it's hard. But this is not crucial anyway, as a child we just forked off can always safely be referenced also by PID, given the reaping is under our own control. A simple test case is added in a follow-up commit.	2024-01-02 17:57:34 +01:00
Lennart Poettering	e9ccae3135	process-util: add new FORK_DEATHSIG_SIGKILL flag, rename FORK_DEATHSIG → FORK_DEATHSIG_SIGTERM Sometimes it makes sense to hard kill a client if we die. Let's hence add a third FORK_DEATHSIG flag for this purpose: FORK_DEATHSIG_SIGKILL. To make things less confusing this also renames FORK_DEATHSIG to FORK_DEATHSIG_SIGTERM to make clear it sends SIGTERM. We already had FORK_DEATHSIG_SIGINT, hence this makes things nicely symmetric. A bunch of users are switched over for FORK_DEATHSIG_SIGKILL where we know it's safe to abort things abruptly. This should make some kernel cases more robust, since we cannot get confused by signal masks or such. While we are at it, also fix a bunch of bugs where we didn't take FORK_DEATHSIG_SIGINT into account in safe_fork()	2023-11-02 14:09:23 +01:00
Lennart Poettering	eefb7d22ce	process-util: add API for enumerating processes in /proc/ and pinning them via PidRef	2023-10-18 14:49:40 +02:00
Lennart Poettering	4d9f092b5e	process-util: add pidref_is_unwaited() and make pid_is_unwaited() return errors	2023-10-18 14:49:40 +02:00
Lennart Poettering	6774be4206	process-util: add pidref_is_my_child()	2023-10-18 14:49:40 +02:00
Lennart Poettering	becdfcb9f1	process-util: change pid_is_alive() to not eat up errors, and add pidref_is_alive() Let's not eat up errors, but propagate unexpected ones.	2023-10-18 14:40:25 +02:00
Lennart Poettering	8b51341545	process-util: add pidref_get_uid() and rename get_process_uid() → pid_get_uid()	2023-10-18 14:39:33 +02:00
Lennart Poettering	d7d748548b	process-util: add pidref_get_comm() and rename get_process_comm() to pid_get_comm()	2023-10-18 14:39:33 +02:00
Lennart Poettering	fc87713bed	process-util: add pidref_is_kernel_thread()	2023-10-18 14:39:33 +02:00
Lennart Poettering	a034620f1a	process-util: add pidref_get_cmdline()	2023-10-18 14:39:33 +02:00
Lennart Poettering	0ff6ff2b29	tree-wide: port various parsers over to read_stripped_line()	2023-10-17 14:36:54 +02:00
Lennart Poettering	cde8cc946b	Merge pull request #29272 from enr0n/coredump-container coredump: support forwarding coredumps to containers	2023-10-16 16:13:16 +02:00
Nick Rosbrook	ade39d9ab8	process-util: introduce namespace_get_leader helper For a given PID and namespace type, this helper function gives the PID of the leader of the namespace containing the given PID. Use this in systemd-coredump instead of using the existing get_mount_namespace_leader. This helper will be used again in a later commit.	2023-10-13 15:13:11 -04:00
Luca Boccassi	6ecdfe7d10	process-util: add posix_spawn helper This provides CLONE_VM + CLONE_VFORK semantics, so it is useful to avoid CoW traps and other issues around doing work between fork() and exec().	2023-10-12 13:37:22 +01:00
Joerg Behrmann	7c52d5236a	treewide: split commandline into command line	2023-09-20 16:37:23 +01:00
OMOJOLA JOSHUA	ad5db9404e	Journal: Add message IDs for emergency-level log messages	2023-09-01 13:59:21 +01:00
Luca Boccassi	840ac5cd1a	process-util: use clone2 on ia64 glibc does not provide clone() on ia64, only clone2. But only as a symbol in the shared library, there's no prototype in the glibc headers, so we have to define it, copied from the manpage.	2023-07-10 11:39:35 +01:00
Frantisek Sumsal	5000cea8d2	tree-wide: explicitly ignore return value in a couple more places Resolves: - CID#1490777 - CID#1498366 - CID#1508639 - CID#1509084 - CID#1509086 - CID#1509087	2023-07-02 12:22:45 +02:00
Lennart Poettering	8c3fe1b5b5	process-util: add simple wrapper around PR_SET_CHILD_SUBREAPER Let's add a simple helper that knows how to deal with PID == 1.	2023-06-23 10:05:16 +02:00
Lennart Poettering	2e7b105eb9	process-util: add FORK_DETACH flag for forking of detached child A test for this is later added indirectly, via asynchronous_rm_rf() that uses this and comes with a suitable test.	2023-06-23 10:02:15 +02:00
Lennart Poettering	29c3520f28	process-util: add clone_with_nested_stack() helper This wraps glibc's clone() but deals with the 'stack' parameter in a sensible way. Only supports invocations without CLONE_VM, i.e. when child is a CoW copy of parent.	2023-06-23 10:00:30 +02:00
Lennart Poettering	09f9530baf	process-util: add helper that detects if we are a reaper process	2023-06-23 10:00:30 +02:00
Lennart Poettering	563e684689	stat-util: rename files_same() → inode_same() Let's be more accurate about what this function does: it checks whether the underlying reported inode is the same. Internally, this already uses a better named stat_inode_same() call, hence let's similarly name the wrapping function following the same logic. Similar for files_same_at() and path_equal_or_same_files(). No code changes, just some renaming.	2023-05-19 17:42:41 +02:00
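
Note on `9ca13d60db`: that commit message points out that posix_spawnattr_setflags() overwrites the current flag set instead of ORing into it. A minimal sketch of the read-modify-write pattern this implies is shown below. The helper name `example_spawnattr_add_flags()` is hypothetical and is not taken from the systemd tree; only the POSIX `posix_spawnattr_getflags()` and `posix_spawnattr_setflags()` calls are real.

```c
#include <assert.h>
#include <spawn.h>

/* Hypothetical helper (not from systemd): add 'extra' to the flags already
 * stored in 'attr'. posix_spawnattr_setflags() replaces the whole flag set,
 * so the current value has to be read back and ORed in first. */
static int example_spawnattr_add_flags(posix_spawnattr_t *attr, short extra) {
        short flags = 0;
        int r;

        assert(attr);

        /* Both calls return 0 on success or a positive error number. */
        r = posix_spawnattr_getflags(attr, &flags);
        if (r != 0)
                return r;

        return posix_spawnattr_setflags(attr, (short) (flags | extra));
}
```

Calling this on an attribute object that already carries POSIX_SPAWN_SETSIGMASK, with POSIX_SPAWN_SETSIGDEF as the extra flag, leaves both flags set, whereas a plain posix_spawnattr_setflags() call with only the second flag would drop the first.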

274 Commits