Commit Graph

6670 Commits

Author SHA1 Message Date
Lennart Poettering
3e32e5a4d2 log: raise log level to LOG_DEBUG if $DEBUG_INVOCATION=1 is set
Let's implement our own protocols, and raise the log level to debug if
DEBUG_INVOCATION=1 is set.

Follow-up for: 7d8bbfbe08
2024-12-13 19:53:55 +01:00
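
The behaviour described in 3e32e5a4d2 can be sketched roughly as follows. This is a minimal illustration, not the systemd code: `effective_log_level()` is a hypothetical helper, and treating only the literal value `1` as a request for debug logging is an assumption taken from the commit title.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical helper for illustration; not part of systemd. */
static int effective_log_level(int configured) {
        const char *e;

        assert(configured >= LOG_EMERG && configured <= LOG_DEBUG);

        /* Assumption: only the literal value "1" requests debug logging. */
        e = getenv("DEBUG_INVOCATION");
        if (e && strcmp(e, "1") == 0)
                return LOG_DEBUG;

        /* Otherwise keep whatever level was configured. */
        return configured;
}
```

A caller would pass its configured maximum level and use the returned value when setting up logging.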
Yu Watanabe
dcd333168e sysctl-util: support AF_MPLS
To support writing/reading e.g. /proc/sys/net/mpls/conf/eth0/input .
2024-12-13 15:36:45 +00:00
Daan De Meyer
1c43f92a2a basic/fileio: two modernizations (#35559) 2024-12-13 11:49:12 +00:00
Yu Watanabe
2a92e0bc6c string-table: make DUMP_STRING_TABLE() return 0
Then, we can use it as
===
  return DUMP_STRING_TABLE(...);
===
2024-12-12 15:21:16 +09:00
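
To illustrate why a dump macro that evaluates to 0 is convenient, here is a small sketch. It assumes a GNU statement expression (supported by GCC and Clang); `DUMP_TABLE_SKETCH`, `Color` and `color_to_string()` are hypothetical names and do not reproduce the systemd macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum Color { COLOR_RED, COLOR_GREEN, COLOR_MAX } Color;

static const char *color_to_string(Color c) {
        static const char *const table[COLOR_MAX] = {
                [COLOR_RED]   = "red",
                [COLOR_GREEN] = "green",
        };

        return (unsigned) c < COLOR_MAX ? table[c] : NULL;
}

/* Prints every known string of the table, one per line, and evaluates to 0,
 * so that the whole expression can be returned directly. */
#define DUMP_TABLE_SKETCH(f, name, type, max)                                   \
        ({                                                                      \
                for (int i_ = 0; i_ < (int) (max); i_++) {                      \
                        const char *s_ = name##_to_string((type) i_);           \
                        if (s_)                                                 \
                                fprintf((f), "%s\n", s_);                       \
                }                                                               \
                0;                                                              \
        })

static int dump_colors(FILE *f) {
        assert(f);

        return DUMP_TABLE_SKETCH(f, color, Color, COLOR_MAX);
}
```

With this shape a command handler can end with a single `return` statement instead of calling the macro and then returning 0 separately.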
Yu Watanabe
7e438055a6 pretty-print: don't use OSC 8 for incompatible URLs (#35223) 2024-12-12 05:43:36 +09:00
Mike Yuan
eded4272d2 cgroup-util: introduce cg_get_cgroupid_at()
Suggested in https://github.com/systemd/systemd/pull/35242#discussion_r1862658163
2024-12-12 05:19:07 +09:00
Yu Watanabe
ab5de638e9 process-util: modernize is_main_thread(); make sure get_process_ppid() won't return ppid == 0 (#35561)
Split out from #35242
2024-12-12 05:16:04 +09:00
Lennart Poettering
9948b4668c virt: drop userns detection heuristic
Now that we have an explicit userns check we can drop the heuristic for
it, given that it's kinda wrong (because mapping the full host UID range
into a userns is actually a thing people do).

Hence, just delete the code and only keep the userns inode check in
place.
2024-12-11 19:23:03 +01:00
Lennart Poettering
7f0a615ef8 virt: don't check for cgroupns anymore
Now that we have a reliable pidns check I don't think we really should
look for cgroupns anymore, it's too weak a check. I mean, if I myself
would implement a desktop app sandbox (like flatpak) I'd always enable
cgroupns, simply to hide the host cgroup hierarchy.

Hence drop the check.

I suggested adding this 4 years ago here:

https://github.com/systemd/systemd/pull/17902#issuecomment-745548306
2024-12-11 19:23:03 +01:00
Mike Yuan
ad9a66fee8 basic/fileio: clean up executable_is_script() a bit
- Rename to script_get_shebang_interpreter and return
  -EMEDIUMTYPE if the executable is not a script.
  We nowadays utilize the scheme of making ret param
  of getters optional, and use them directly as checkers.
- Don't unnecessarily read the whole line, but check
  only the shebang first.
2024-12-11 19:11:22 +01:00
Mike Yuan
c0cf1c5826 basic/fileio: minor modernization for xopendirat() 2024-12-11 19:07:20 +01:00
Mike Yuan
61263e1436 process-util: make sure we don't report ppid == 0
Previously, if pid == 0 and we're PID 1, get_process_ppid()
would set ret to getppid(), i.e. 0, which is inconsistent
when pid is explicitly set to 1. Ensure we always handle
such a case by returning -EADDRNOTAVAIL.
2024-12-11 14:44:08 +01:00
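
The rule from 61263e1436 (never hand out a parent PID of 0) can be sketched as below. `ppid_to_result()` and `own_ppid_sketch()` are hypothetical helpers written for this illustration; only the error code is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Turns a raw parent PID into a result or an error. A parent PID of 0 means
 * no parent is visible to us (for example because we are PID 1), which is
 * reported as -EADDRNOTAVAIL rather than returned as a value. */
static int ppid_to_result(pid_t ppid, pid_t *ret) {
        assert(ppid >= 0);

        if (ppid == 0)
                return -EADDRNOTAVAIL;

        if (ret)
                *ret = ppid;
        return 0;
}

/* Same check applied to the calling process. */
static int own_ppid_sketch(pid_t *ret) {
        return ppid_to_result(getppid(), ret);
}
```

Routing both the "own process" and the "explicit PID" paths through one helper is one way to keep the two cases consistent.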
Mike Yuan
07612aab66 process-util: use our usual tristate semantics for is_main_thread()
While at it, _unlikely_ is dropped, as requested in
https://github.com/systemd/systemd/pull/35242#discussion_r1880096233
2024-12-11 14:44:07 +01:00
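
A cached tristate of the kind mentioned in 07612aab66 might look like the sketch below, where a negative value means "not determined yet". The function name is hypothetical, and the raw `SYS_gettid` system call is used here only to avoid depending on a particular libc version.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static bool is_main_thread_sketch(void) {
        /* < 0: unknown, 0: not the main thread, > 0: main thread */
        static _Thread_local int cached = -1;

        if (cached < 0)
                cached = getpid() == (pid_t) syscall(SYS_gettid);

        assert(cached >= 0);
        return cached;
}
```

Because the cache is thread-local, each thread computes its own answer once and reuses it afterwards.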
Mike Yuan
e38a70a19f basic/user-util: modernize getgroups_alloc() a bit (#35226)
Split out from #35219 for inclusion in v258
2024-12-11 13:50:50 +01:00
Lennart Poettering
0823d96a0b pretty-print: don't use OSC 8 for incompatible URLs 2024-12-11 10:35:03 +01:00
Lennart Poettering
f79562aaee string-util: split out EOT check in strip_tab_ansi()
Let's unify the eot check in one place in order to make things more
readable.
2024-12-11 10:35:03 +01:00
Mike Yuan
f0e8db76ca basic/user-util: modernize getgroups_alloc() a bit
- Make sure ret is initialized if we return >= 0
- Reduce variable scope
2024-12-10 20:51:14 +01:00
Mike Yuan
8112df6bef basic/user-util: use FOREACH_ARRAY at one more place 2024-12-10 20:51:14 +01:00
Mike Yuan
5dfccccce9 basic/time-util: modernize parse_time() a bit 2024-12-10 20:50:36 +01:00
Yu Watanabe
896b53ef4e basic: update syscall tables 2024-12-10 11:15:48 +09:00
Zbigniew Jędrzejewski-Szmek
22996a3393 basic/namespace-util: fix double logging after fork failure
[   10.056930] (journald)[104]: Failed to fork off '(sd-mkuserns)': Invalid argument
[   10.063727] systemd[1]: systemd-modules-load.service: About to execute: /usr/lib/systemd/systemd-modules-load
[   10.071148] (journald)[104]: Failed to fork process (sd-mkuserns): Invalid argument

safe_fork_full() already logs at debug level, so the caller shouldn't.
2024-12-02 11:51:23 +01:00
Zbigniew Jędrzejewski-Szmek
afb368951c pid1: assume user namespaces are unavailable if we get -EINVAL from clone()
As reported in https://github.com/systemd/systemd/issues/35400,
on riscv64, with Linux version 6.6.51-linux4microchip+fpga-2024.09, we get:

[   10.063727] systemd[1]: systemd-modules-load.service: About to execute: /usr/lib/systemd/systemd-modules-load
[   10.071148] (journald)[104]: Failed to fork process (sd-mkuserns): Invalid argument

Fixes https://github.com/systemd/systemd/issues/35400.

'r' is used to make the repeated checks shorter. Without that, the long variable
name is distracting.
2024-12-02 11:30:06 +01:00
Mike Yuan
4da9f38de1 cgroup-util: use RET_NERRNO where appropriate 2024-11-27 18:38:00 +01:00
Mike Yuan
7a719510c8 basic/fileio: minor coding style cleanup
Follow-up for bbec1c87d3
2024-11-27 14:33:23 +01:00
gerblesh
bbec1c87d3 sysext: set SELinux context for hierarchies and workdir 2024-11-26 17:47:32 +00:00
Zbigniew Jędrzejewski-Szmek
d293fade24 Check inode number to see if we are in init namespace (#35306)
This is a more comprehensive fix compared to #35273. Also adds a minimal
test only.

Based on Luca's #35273 but generalizes the code a bit.

In v258 we really should get rid of the old heuristics around userns and
cgroupns detection, but given we are late in the v257 cycle this keeps
them in.
2024-11-25 14:13:36 +01:00
Yu Watanabe
56c761f8c6 namespace-util: handle -ENOSPC by userns_acquire() gracefully in is_idmapping_supported() (#35313)
Follow-up for edae62120f.
Fixes #35311.
2024-11-23 17:32:23 +09:00
Yu Watanabe
3dda236c5c basic/linux: update kernel headers from v6.12 2024-11-23 17:31:12 +09:00
Lennart Poettering
a2429f507c virt: make use of ns inode check in running_in_userns() and running_in_cgroupns() too 2024-11-23 00:14:20 +01:00
Luca Boccassi
193bf42ab0 detect-virt: check the inode number of the pid namespace
The inode number of root pid namespace is hardcoded in the kernel to
0xEFFFFFFC since 3.8, so check the inode number of our pid namespace
if all else fails. If it's not 0xEFFFFFFC then we are in a pid
namespace, hence a container environment.

Fixes https://github.com/systemd/systemd/issues/35249

[Reworked by Lennart, to make use of namespace_is_init()]
2024-11-23 00:14:20 +01:00
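
The inode comparison described in 193bf42ab0 can be sketched as follows. The constant is the one quoted in the commit message; the helper names and the choice to take the namespace path as a parameter are assumptions made for this example, and the code is not systemd's namespace_is_init().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Inode number of the initial PID namespace, as quoted in the commit message. */
#define PIDNS_INIT_INO 0xEFFFFFFCU

static bool inode_is_init_pidns(ino_t ino) {
        return ino == (ino_t) PIDNS_INIT_INO;
}

/* Returns 1 if 'path' refers to the initial PID namespace, 0 if it refers to
 * some other PID namespace, and a negative errno if it cannot be inspected. */
static int ns_path_is_init_pidns(const char *path) {
        struct stat st;

        assert(path);

        if (stat(path, &st) < 0)
                return -errno;

        return inode_is_init_pidns(st.st_ino);
}
```

A container check would call `ns_path_is_init_pidns("/proc/self/ns/pid")` and treat a result of 0 as "running inside a PID namespace".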
Lennart Poettering
18ead2b03d namespace-util: add generic namespace_is_init() call 2024-11-23 00:14:20 +01:00
Yu Watanabe
2994ca354b namespace-util: update log messages 2024-11-23 06:52:48 +09:00
Yu Watanabe
eb14b993bb namespace-util: handle -ENOSPC by userns_acquire() gracefully in is_idmapping_supported()
Follow-up for edae62120f.
Fixes #35311.
2024-11-23 06:52:38 +09:00
Luca Boccassi
b7eefa1996 cgroup-util: fix memory leak on error
CID#1565824

Follow-up for f6793bbcf0
2024-11-21 14:02:34 +09:00
Lennart Poettering
f6793bbcf0 killall: gracefully handle processes inserted into containers via nsenter -a
"nsenter -a" doesn't migrate the specified process into the target
cgroup (it really should). Thus the process will remain in a cgroup
that is (due to cgroup ns) outside our visibility. The kernel will
report the cgroup path of such cgroups as starting with "/../". Detect
that and print a reasonable error message instead of trying to resolve
that.
2024-11-20 18:11:38 +00:00
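
Detecting the "/../" prefix mentioned in f6793bbcf0 is a simple string check. The helper below is a hypothetical sketch, not the code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if the kernel reported a cgroup path that lies outside of our
 * cgroup namespace, i.e. one that starts with "/../". */
static bool cgroup_path_is_outside_ns(const char *path) {
        assert(path);

        return strncmp(path, "/../", 4) == 0;
}
```

A caller can log a clear message and skip the process when this returns true, instead of trying to resolve a path it cannot see.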
Mike Yuan
f87863a8ff process-util: refuse to operate on remote PidRef
Follow-up for 7e3e540b88
2024-11-20 18:10:26 +00:00
Mike Yuan
eea9d3eb10 basic/user-util: split out placeholder suppression from USER_CREDS_CLEAN into its own flag
No functional change, preparation for later commits.
2024-11-19 00:38:18 +01:00
Mike Yuan
579ce77ead basic/user-util: introduce shell_is_placeholder() helper 2024-11-19 00:38:18 +01:00
Mike Yuan
c8590ad60d process-util: refuse FORK_DETACH + FORK_DEATHSIG_*
There's no synchronization between the intermediate process
and the double-forked child, and the semantics are not useful.
Refuse such a combination.
2024-11-14 12:22:15 +00:00
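
Refusing a flag combination up front, as c8590ad60d describes, could be sketched like this. The flag names and values below are hypothetical and do not match systemd's ForkFlags, and signalling the refusal with -EINVAL is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for illustration only. */
enum {
        SK_FORK_DETACH           = 1 << 0,
        SK_FORK_DEATHSIG_SIGTERM = 1 << 1,
        SK_FORK_DEATHSIG_SIGKILL = 1 << 2,
        SK_FORK_DEATHSIG_MASK    = SK_FORK_DEATHSIG_SIGTERM | SK_FORK_DEATHSIG_SIGKILL,
        SK_FORK_ALL              = SK_FORK_DETACH | SK_FORK_DEATHSIG_MASK,
};

static int fork_flags_validate(unsigned flags) {
        assert((flags & ~(unsigned) SK_FORK_ALL) == 0);

        /* A detached (double-forked) child cannot sensibly request a death
         * signal, so the combination is rejected before forking. */
        if ((flags & SK_FORK_DETACH) && (flags & SK_FORK_DEATHSIG_MASK))
                return -EINVAL;

        return 0;
}
```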
Lennart Poettering
9466fe014f namespace-util: pin pid via pidfd during namespace_open() 2024-11-13 14:18:05 +00:00
Yu Watanabe
d762b14e38 audit-util: return -ENODATA from audit_{session|loginuid}_from_pid() if invoked in a container (#35072)
The auditing subsystem is still not virtualized for containers, hence
the two values don't really make sense inside them, they will just leak
information from outside into the container. Hence don't make use of the
data if we detect we are run inside of a container.

This has visible effects: logind will no longer try to reuse the
auditing session ids as its own session ids when run inside a container.

While we are at it, modernize the calls in more ways:

1. switch to pidref behaviour, all but one of our uses are using pidref
anyway already.
2. use read_virtual_file() + proc_mounted()
3. reasonably distinguish ENOENT errors when reading the process proc
files: distinguish the case where /proc is not mounted, from the case
where the process is already gone, from where auditing is not enabled in
the kernel build.
2024-11-13 10:08:29 +09:00
Lennart Poettering
c892816ceb run0: when changing privileges to non-root, do not show superhero emoji
Let's show an idcard logo instead, to indicate that we changed ids.
2024-11-12 23:09:21 +01:00
Lennart Poettering
7bf0149e9b process-util: more gracefully handle oom adjust parsing/setting
Who knows what kind of mount shenanigans people employ, let's gracefully
handle parse failures of proc files, like we always do otherwise.
2024-11-12 23:03:40 +01:00
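
As a rough illustration of tolerating malformed proc file contents (7bf0149e9b), the sketch below parses an OOM score adjustment string and falls back to a default instead of failing hard. Both helpers are hypothetical; the accepted range of -1000 to 1000 is the kernel's documented range for `oom_score_adj`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parses a value as read from an oom_score_adj file. Returns 0 on success,
 * a negative errno if the text is not a valid adjustment. */
static int parse_oom_score_adj(const char *s, int *ret) {
        char *end;
        long v;

        assert(s);
        assert(ret);

        errno = 0;
        v = strtol(s, &end, 10);
        if (errno != 0)
                return -errno;
        if (end == s)
                return -EINVAL;              /* no digits at all */

        while (*end == ' ' || *end == '\n')  /* tolerate trailing whitespace */
                end++;
        if (*end != '\0')
                return -EINVAL;              /* trailing garbage */

        if (v < -1000 || v > 1000)
                return -ERANGE;

        *ret = (int) v;
        return 0;
}

/* Graceful variant: on any parse failure, use the fallback instead. */
static int oom_score_adj_or_default(const char *s, int fallback) {
        int v;

        assert(s);

        return parse_oom_score_adj(s, &v) < 0 ? fallback : v;
}
```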
Lennart Poettering
68c554f23a audit-util: modernize use_audit() a bit
Use ERRNO_IS_xyz() macros where appropriate.

Also, reduce indentation a bit by inverting the early check.

And log in more error codepaths.
2024-11-12 23:03:40 +01:00
Lennart Poettering
7e02ee98d8 audit-util: return -ENODATA from audit_{session|loginuid}_from_pid() if invoked in a container
The auditing subsystem is still not virtualized for containers, hence the two
values don't really make sense inside them, they will just leak
information from outside into the container. Hence don't make use of the
data if we detect we are run inside of a container.

This has visible effects: logind will no longer try to reuse the
auditing session ids as its own session ids when run inside a container.

While we are at it, modernize the calls in more ways:

1. switch to pidref behaviour, all but one of our uses are using pidref
   anyway already.
2. use read_virtual_file() + proc_mounted()
3. reasonably distinguish ENOENT errors when reading the process proc
   files: distinguish the case where /proc is not mounted, from the case
   where the process is already gone, from where auditing is not enabled
   in the kernel build.
2024-11-12 23:03:03 +01:00
Lennart Poettering
56933f2073 uid-classification: properly classify *all* container UIDs
A bit confusingly CONTAINER_UID_BASE_MAX is just the maximum *base* UID
for a container. Thus, with the usual 64K UID assignments, the last
actual container UID is CONTAINER_UID_BASE_MAX+0xFFFF.

To make this less confusing define CONTAINER_UID_MIN/MAX that add the
missing extra space.

Also adjust two uses where this was mishandled so far, due to this
confusion.

With this change the UID ranges we default to should properly match what
is documented on https://systemd.io/UIDS-GIDS/.
2024-11-08 23:18:39 +00:00
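
The arithmetic behind 56933f2073 can be written out as a short sketch. The two base values below are assumed defaults (the real ones are build-time options), and `uid_is_container_sketch()` is a hypothetical helper; only the "base maximum plus 0xFFFF" relation comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Assumed default base range; treat these numbers as illustrative. */
#define CONTAINER_UID_BASE_MIN ((uid_t) 0x00080000U)
#define CONTAINER_UID_BASE_MAX ((uid_t) 0x6FFF0000U)

/* The last container may start at BASE_MAX and still owns 64K UIDs, so the
 * last actual container UID is BASE_MAX + 0xFFFF. */
#define CONTAINER_UID_MIN CONTAINER_UID_BASE_MIN
#define CONTAINER_UID_MAX (CONTAINER_UID_BASE_MAX + (uid_t) 0xFFFFU)

static_assert(CONTAINER_UID_MAX == 0x6FFFFFFFU, "last container UID");

static bool uid_is_container_sketch(uid_t uid) {
        return uid >= CONTAINER_UID_MIN && uid <= CONTAINER_UID_MAX;
}
```

Comparing against `CONTAINER_UID_BASE_MAX` alone would misclassify the upper 0xFFFF UIDs of the last container, which is the confusion the commit addresses.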
Lennart Poettering
af3baf174a fs-util: add comment about XO_NOCOW 2024-11-08 09:21:25 +01:00
Ivan Kruglov
a567de392d process-util: introduce report_errno_and_exit() as part of src/basic/process-util.{h,c} 2024-11-06 11:18:38 +01:00
Andres Beltran
f348831d27 namespace-util: make idmapping not supported if syscalls return EPERM 2024-11-06 09:27:33 +01:00
Zbigniew Jędrzejewski-Szmek
2257be13fe tree-wide: time-out → timeout
For justification, see 3f9a0a522f.
2024-11-05 19:32:19 +00:00