30 Commits

Author SHA1 Message Date
Yu Watanabe
543a48b653 libc-wrapper: introduce a tiny libc wrapper
Then, move the syscall definitions to the wrapper and the prototypes to
the relevant headers.

This also adds checks for add_key() and request_key(), as glibc may
one day add some of them separately.

The check for fspick in meson.build is dropped, as it is currently
unused in our code.

This also moves
- basic/missing_bpf.h -> include/override/linux/bpf.h,
- basic/missing_keyctl.h -> include/override/linux/keyctl.h.
2025-07-11 13:05:46 +09:00
Yu Watanabe
b6278c1937 bpf-program: check if a trivial BPF program can be created and loaded
Re-introduce the check dropped by
ec3c5cfac7,
ad446c8ceb.

For some reason, when running on LXC, bpf_program_load_kernel() failed
even though bpf_program_supported() returned true:
```
Attaching device control BPF program to cgroup /system.slice/test-bpf-devices-875b406d56ac7bc3.scope/186c411f6e991777 failed: Operation not permitted
src/test/test-bpf-devices.c:31: Assertion failed: Expected "r" to succeed, but got error: Operation not permitted
```
2025-05-27 17:24:33 +01:00
Daan De Meyer
69a283c5f2 shared: Clean up includes
Split out of #37344.
2025-05-24 14:00:44 +02:00
Yu Watanabe
71be1f3875 bpf-program: introduce bpf_program_supported() helper function
It checks if the kernel is built with CONFIG_CGROUP_BPF.
It is currently unused, but will be used later.
2025-05-10 00:17:52 +09:00
Daan De Meyer
c94f6ab1bf string-table: Move more implementation logic into functions
Let's move some more implementation logic into functions. We keep
the logic that requires the macro in the macro and move the rest into
functions.

While we're at it, let's also make the parameter declarations of
all the string table macros less claustrophobic.
2025-05-06 10:14:24 +02:00
Yu Watanabe
f769518c9a tree-wide: drop doubled empty lines 2024-10-07 09:51:37 +02:00
nl6720
934288757c tree-wide: link to docs.kernel.org for kernel documentation
https://www.kernel.org/ links to https://docs.kernel.org/ for the documentation.
These URLs are shorter and nicer looking.
2024-01-22 10:50:33 +00:00
Yu Watanabe
d2132d3d8d parse-util: make parse_fd() return -EBADF
The previous error code -ERANGE is slightly ambiguous, so use a more
specific one. This also drops unnecessary error handling.

Follow-up for 754d8b9c33 and
e652663a04.
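
The error-code choice can be illustrated with a minimal sketch. This is not the actual parse_fd() implementation; the function name parse_fd_sketch() and the exact split between -EINVAL, -ERANGE and -EBADF are assumptions made for illustration only:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for parse_fd(): parse a decimal file descriptor number.
 * Returns the fd (>= 0) on success, -EINVAL for malformed input, -ERANGE if the
 * number does not fit into an int, and -EBADF for a well-formed negative number. */
static int parse_fd_sketch(const char *s) {
        char *end = NULL;
        long v;

        assert(s);

        errno = 0;
        v = strtol(s, &end, 10);
        if (end == s || *end != '\0')
                return -EINVAL;       /* empty string or trailing garbage */
        if (errno == ERANGE || v > INT_MAX || v < INT_MIN)
                return -ERANGE;       /* does not fit into an int */
        if (v < 0)
                return -EBADF;        /* a number, but never a valid fd */

        return (int) v;
}
```

With this split, a caller can tell "not a number" apart from "a number that cannot be a file descriptor" without any extra checks of its own.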
2023-05-08 09:49:55 +02:00
David Tardon
e652663a04 tree-wide: use parse_fd() 2023-05-05 09:10:56 +02:00
Frantisek Sumsal
740831076c shared: reject empty attachment path 2023-05-03 10:09:53 +02:00
Dominique Martinet
25d9c6cdaf bpf-firewall: give a name to maps used
Running systemd with IP accounting enabled generates many bpf maps (two
per unit for accounting, another two if IPAddressAllow/Deny are used).

Systemd itself knows which maps belong to what unit and commands like
`systemctl status <unit>` can be used to query what service has which
map, but monitoring these values all the time costs 4 dbus requests
(calling the .IP{E,I}gress{Bytes,Packets} method for each unit) and
makes services like the prometheus systemd_exporter[1] somewhat slow
when doing that for every unit, while less precise information could
quickly be obtained by looking directly at the maps.

Unfortunately, bpf map names are rather limited:
- only 15 characters in length (16, but last byte must be 0)
- only allows isalnum(), _ and . characters

If it wasn't for the length limit we could use the normal unit escape
functions, but I've opted to just turn any forbidden character into an
underscore for maximum brevity -- the map prefix is also rather short:
This isn't meant as a precise mapping, but as a hint for admins who want
to look at these.

(Note there is no problem if multiple maps have the same name)
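
The naming rule described above (at most 15 characters, only isalnum(), _ and ., everything else turned into an underscore) could be sketched as follows. The helper name map_name_sketch(), the MAP_NAME_MAX constant and the prefix used in the usage note are hypothetical and not taken from the patch:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed to mirror the kernel limit: 15 usable characters plus the trailing NUL. */
#define MAP_NAME_MAX 16

/* Hypothetical helper: write "<prefix><unit>" into buf, truncated to 15 characters.
 * Characters of the unit name other than isalnum(), '_' and '.' become '_'. */
static void map_name_sketch(const char *prefix, const char *unit, char buf[MAP_NAME_MAX]) {
        size_t n = 0;

        assert(prefix);
        assert(unit);
        assert(buf);

        /* The prefix is copied verbatim; it is assumed to contain only allowed characters. */
        for (const char *p = prefix; *p && n < MAP_NAME_MAX - 1; p++)
                buf[n++] = *p;

        for (const char *p = unit; *p && n < MAP_NAME_MAX - 1; p++) {
                unsigned char c = (unsigned char) *p;

                buf[n++] = (isalnum(c) || c == '_' || c == '.') ? (char) c : '_';
        }

        buf[n] = '\0';
}
```

For example, with the made-up prefix "m_" and the unit "foo-bar@1.service", the buffer ends up holding "m_foo_bar_1.ser": the '-' and '@' are replaced and the name is cut at 15 characters, so it is a hint rather than a reversible mapping.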

Link: https://github.com/povilasv/systemd_exporter [1]
2023-04-18 08:23:55 +09:00
Zbigniew Jędrzejewski-Szmek
254d1313ae tree-wide: use -EBADF for fd initialization
-1 was used everywhere, but -EBADF or -EBADFD started being used in various
places. Let's make things consistent in the new style.

Note that there are two candidates:
EBADF 9 Bad file descriptor
EBADFD 77 File descriptor in bad state

Since we're initializing the fd, we're just assigning a value that means
"no fd yet", so it's just a bad file descriptor, and the first errno fits
better. If instead we had a valid file descriptor that became invalid because
of some operation or state change, the other errno would fit better.

In some places, initialization is dropped if unnecessary.
2022-12-19 15:00:57 +01:00
Julia Kartseva
8fe9dbb926 bpf: name unnamed bpf programs
bpf-firewall and bpf-devices do not have names. This complicates
debugging with bpftool(8).

Assign names starting with 'sd_' prefix:
* firewall program names are 'sd_fw_ingress' for ingress attach
point and 'sd_fw_egress' for egress.
* 'sd_devices' for devices prog

'sd_' prefix is already used in source-compiled programs, e.g.
sd_restrictif_i, sd_restrictif_e, sd_bind6.

The name must not be longer than 15 characters or BPF_OBJ_NAME_LEN - 1.

Assign names only to programs loaded to kernel by systemd since
programs pinned to bpffs are already loaded.
2022-01-22 16:48:42 +09:00
Lennart Poettering
7c248223eb tree-wide: use new RET_NERRNO() helper at various places 2021-11-16 08:04:09 +01:00
alexlzhu
76dc17254f core: remove refcount for bpf program
Currently ref count of bpf-program is kept in user space. However, the
kernel already implements its own ref count. Thus the ref count we keep for
bpf-program is redundant.

This PR removes ref count for bpf program as part of a task to simplify
bpf-program and remove redundancies, which will make the switch to
code-compiled BPF programs easier.

Part of #19270
2021-10-12 12:48:23 +02:00
Zbigniew Jędrzejewski-Szmek
e437538f35 tree-wide: make cunescape*() functions return ssize_t
Strictly speaking, we are returning the size of a memory chunk of
arbitrary size, so ssize_t is more appropriate than int.
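
To illustrate the return-type convention, here is a heavily reduced sketch. It is not the cunescape() implementation: unescape_sketch() is a hypothetical function that only understands \n, \t and \\:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, much reduced unescaper: handles only \n, \t and \\.
 * On success stores a newly allocated, NUL-terminated buffer in *ret and returns
 * its length as ssize_t; on failure returns a negative errno-style value. */
static ssize_t unescape_sketch(const char *s, char **ret) {
        size_t n = 0;
        char *buf;

        assert(s);
        assert(ret);

        buf = malloc(strlen(s) + 1);   /* the result is never longer than the input */
        if (!buf)
                return -ENOMEM;

        for (const char *p = s; *p; p++) {
                if (*p != '\\') {
                        buf[n++] = *p;
                        continue;
                }

                p++;
                if (*p == 'n')
                        buf[n++] = '\n';
                else if (*p == 't')
                        buf[n++] = '\t';
                else if (*p == '\\')
                        buf[n++] = '\\';
                else {
                        free(buf);
                        return -EINVAL;   /* unknown or truncated escape sequence */
                }
        }

        buf[n] = '\0';
        *ret = buf;
        return (ssize_t) n;
}
```

Returning ssize_t keeps the "negative means error" convention while covering the full range of sizes a memory chunk can have.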
2021-07-09 15:07:40 +02:00
Lennart Poettering
b57d752326 bpf-program: serialize attached BPF programs across daemon reexec/reload
Alternative to #17495
2021-06-08 22:02:35 +02:00
Lennart Poettering
7a7cf83dc3 bpf-program: export hash_ops for BPFProgram objects 2021-06-08 22:02:35 +02:00
Lennart Poettering
06ad9d0c12 bpf-program: use structured initialization when allocating BPFProgram objects 2021-06-08 22:02:35 +02:00
Lennart Poettering
319a4f4bc4 alloc-util: simplify GREEDY_REALLOC() logic by relying on malloc_usable_size()
We recently started making more use of malloc_usable_size() and rely on
it (see the string_erase() story). Given that we don't really support
systems where malloc_usable_size() cannot be trusted beyond statistics
anyway, let's go fully in and rework GREEDY_REALLOC() on top of it:
instead of passing around and maintaining the currently allocated size
everywhere, let's just derive it automatically from
malloc_usable_size().

I am mostly after this for the simplicity this brings. It also brings
minor efficiency improvements I guess, but things become so much nicer
to look at if we can avoid these allocation size variables everywhere.

Note that the malloc_usable_size() man page says relying on it wasn't
"good programming practice", but I think it does this for reasons that
don't apply here: the greedy realloc logic specifically doesn't rely on
the returned extra size, beyond the fact that it is equal or larger than
what was requested.

(This commit was supposed to be a quick patch btw, but apparently we use
the greedy realloc stuff quite a bit across the codebase, so this ends
up touching *a*lot* of code.)
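
The idea can be sketched roughly like this. It is not the GREEDY_REALLOC() macro itself: greedy_realloc_sketch(), the doubling policy and the 64-byte minimum are assumptions for illustration, and malloc_usable_size() is a glibc extension:

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size(), glibc-specific */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: make sure *p can hold at least `need` bytes.
 * The caller does not track the allocated size; it is derived from
 * malloc_usable_size(). Returns *p on success, or NULL on allocation
 * failure, in which case *p is left untouched. */
static void *greedy_realloc_sketch(void **p, size_t need) {
        size_t want;
        void *q;

        assert(p);

        if (*p && malloc_usable_size(*p) >= need)
                return *p;     /* already large enough, nothing to do */

        /* Grow greedily: ask for twice what is needed to amortize future growth. */
        want = need > SIZE_MAX / 2 ? need : need * 2;
        if (want < 64)
                want = 64;

        q = realloc(*p, want);
        if (!q)
                return NULL;

        *p = q;
        return q;
}
```

The only property relied upon is the one named in the commit message: the usable size is equal to or larger than what was requested, so no separate "allocated" variable has to be carried around.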
2021-05-19 16:42:37 +02:00
Julia Kartseva
9984f4933b shared: bpf_attach_type {from,to} string
Introduce bpf_cgroup_attach_type_table with the customary attach type
names that are also used in bpftool.
Add bpf_cgroup_attach_type_{from|to}_string helpers to convert from|to
the string representation of a pinned bpf program, e.g.
"egress:/sys/fs/bpf/egress-hook" for the
/sys/fs/bpf/egress-hook path and the BPF_CGROUP_INET_EGRESS attach type.
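
A rough sketch of such a table and of splitting the "<type>:<path>" form, using local stand-in names. The AttachType enum, the two-entry table and parse_pinned_spec() are hypothetical and only model the ingress/egress cases:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the kernel's attach types; only two are modelled here. */
typedef enum {
        ATTACH_INGRESS,
        ATTACH_EGRESS,
        ATTACH_TYPE_MAX,
        ATTACH_TYPE_INVALID = -1,
} AttachType;

static const char *const attach_type_table[ATTACH_TYPE_MAX] = {
        [ATTACH_INGRESS] = "ingress",
        [ATTACH_EGRESS]  = "egress",
};

static const char *attach_type_to_string(AttachType t) {
        return t >= 0 && t < ATTACH_TYPE_MAX ? attach_type_table[t] : NULL;
}

/* Look up the first n bytes of s in the table. */
static AttachType attach_type_from_string_n(const char *s, size_t n) {
        assert(s);

        for (int i = 0; i < ATTACH_TYPE_MAX; i++)
                if (strlen(attach_type_table[i]) == n && memcmp(attach_type_table[i], s, n) == 0)
                        return (AttachType) i;

        return ATTACH_TYPE_INVALID;
}

/* Split "<type>:<path>". On success returns the attach type and points
 * *ret_path at the path part inside spec; otherwise returns ATTACH_TYPE_INVALID. */
static AttachType parse_pinned_spec(const char *spec, const char **ret_path) {
        const char *colon;
        AttachType t;

        assert(spec);
        assert(ret_path);

        colon = strchr(spec, ':');
        if (!colon)
                return ATTACH_TYPE_INVALID;

        t = attach_type_from_string_n(spec, (size_t) (colon - spec));
        if (t != ATTACH_TYPE_INVALID)
                *ret_path = colon + 1;

        return t;
}
```

Keeping both directions on one table means the serialized name and the parsed name cannot drift apart.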
2021-04-09 20:28:47 -07:00
Julia Kartseva
f23f0ead1f shared: add bpf-program helpers
Add helpers to:
- Create new BPFProgram instance from a path in bpf
filesystem and bpf attach type;
- Pin a program to bpf fs;
- Get BPF program ID by BPF program FD.
2021-04-09 20:28:47 -07:00
Luca Boccassi
9ca600e2bf bpf: do not use structured initialization for bpf_attr
It looks like zeroing the struct is not enough, and with some level
of optimizations there is still non-zero padding left over.
Switch to member-by-member initialization. Also convert all remaining
bpf_attr variables in other files.
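
The pattern, clearing every byte first and then assigning member by member, can be sketched on a small stand-in union. union demo_attr and its fields are hypothetical and far smaller than the real bpf_attr:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of a syscall argument union such as bpf_attr:
 * the members differ in size, so the short one leaves tail bytes unused. */
union demo_attr {
        struct {
                uint32_t prog_type;
                uint32_t insn_cnt;
        } load;
        struct {
                uint64_t pathname;
                uint32_t bpf_fd;
                uint32_t file_flags;
                uint64_t reserved;
        } pin;
};

static void demo_attr_init_load(union demo_attr *attr, uint32_t prog_type, uint32_t insn_cnt) {
        assert(attr);

        /* Clear every byte first, including padding and the bytes that only
         * the larger member covers ... */
        memset(attr, 0, sizeof(*attr));

        /* ... then assign member by member instead of using a designated initializer. */
        attr->load.prog_type = prog_type;
        attr->load.insn_cnt = insn_cnt;
}

/* Returns 1 if all bytes from offset `from` to the end of the union are zero. */
static int demo_attr_tail_is_zero(const union demo_attr *attr, size_t from) {
        const unsigned char *b = (const unsigned char *) attr;

        assert(attr);

        for (size_t i = from; i < sizeof(*attr); i++)
                if (b[i] != 0)
                        return 0;

        return 1;
}
```

Plain stores to individual members after an explicit memset() leave the remaining bytes as memset() wrote them, which is what a receiver that checks unused bytes for zero expects.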
2021-01-10 21:16:38 +00:00
Luca Boccassi
28abf5ad34 bpf: zero bpf_attr before initialization
When building with Clang and using structured initialization, the
bpf_attr union is not zero-padded, so the kernel misdetects it as
an unsupported extension.
Zero it until Clang's behaviour matches GCC. Do not skip the test
on GitHub Actions anymore.
2021-01-09 17:35:38 +01:00
Yu Watanabe
ca39a3cef9 bpf: do not call log_oom() in library function 2020-11-13 19:30:57 +09:00
Yu Watanabe
db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Yu Watanabe
f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Kai Lüke
fab347489f bpf-firewall: custom BPF programs through IP(Ingress|Egress)FilterPath=
Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be
specified multiple times. An empty assignment resets all previous filters.

Closes https://github.com/systemd/systemd/issues/10227
2019-06-25 09:56:16 +02:00
Lennart Poettering
0a9707187b util: split out memcmp()/memset() related calls into memory-util.[ch]
Just some source rearranging.
2019-03-13 12:16:43 +01:00
Zbigniew Jędrzejewski-Szmek
d284b82b3e Move various files that don't need to be in basic/ to shared/
This doesn't have much effect on the final build, because we link libbasic.a
into libsystemd-shared.so, so in the end, all the objects built from basic/
end up in libsystemd-shared. And when the static library is linked into binaries,
any objects that are included in it but are not used are trimmed. Hence, the
size of output artifacts doesn't change:

$ du -sb /var/tmp/inst*
54181861	/var/tmp/inst1    (old)
54207441	/var/tmp/inst1s   (old split-usr)
54182477	/var/tmp/inst2    (new)
54208041	/var/tmp/inst2s   (new split-usr)

(The negligible change in size is because libsystemd-shared.so is bigger
by a few hundred bytes. I guess it's because symbols are named differently
or something like that.)

The effect is on the build process, in particular partial builds. This change
effectively moves the requirements on some build steps toward the leaves of the
dependency tree. Two effects:
- when building items that do not depend on libsystemd-shared, we
  build less stuff for libbasic.a (which wouldn't be used anyway,
  so it's a net win).
- when building items that do depend on libshared, we reduce libbasic.a as a
  synchronization point, possibly allowing better parallelism.

Method:
1. copy list of .h files from src/basic/meson.build to /tmp/basic
2. $ for i in $(grep '.h$' /tmp/basic); do echo $i; git --no-pager grep "include \"$i\"" src/basic/ 'src/lib*' 'src/nss-*' 'src/journal/sd-journal.c' |grep -v "${i%.h}.c";echo ;done | less
2018-11-20 07:27:37 +01:00