When adding a new interface to the object add it at the end of the list.
This way, when iterating over the list, e.g., during handling introspect
call, the order of returned interfaces will mach the order in which they
were added.
If mode= is not set in rootflags= add mode=0755 when a tmpfs
is used on the rootfs, otherwise it will be group/world writable
as that's the default mode for tmpfs filesystems.
Follow-up for 725ad3b062
Previously, the property was checked only when an uevent is received,
so even if an interface has ID_NET_MANAGED_BY property, the interface
will be configured by networkd when reconfiguration is triggered e.g.
when interface state is changed.
Follow-up for ba87a61d05.
Fixes#36997.
Not sure why the test failed, but maybe the test environment is too
slow? Even this does not fix the failure, by enabling debugging logs,
this hopefully provides more useful information for debugging.
For issue #37685.
Currently, if we use a cgroup namespace together with DelegateSubgroup=,
the subgroup becomes the root of the cgroup namespace because we move the
service process to the subgroup before we unshare the cgroup namespace, and
the current cgroup becomes the root of the cgroup namespace when we unshare
the cgroup namespace.
Let's fix the problem by not moving the service process to the subgroup until
we've unshared the cgroup namespace. Note that this doesn't break the primary use
case of CLONE_INTO_CGROUP since we still use it to immediately clone into the service
main cgroup, just not anymore into the subgroup, but this shouldn't matter in practice.
Additionally, we need special handling for control processes, as those *do*
need to get spawned into the subcgroup immediately if delegation is configured to
avoid violating the cgroupsv2 "no inner processes" rule.
Effectively, this leaves us with the following logic:
- In exec_spawn(), spawn into subgroup if we're spawning a control process
that needs to be spawned into a subgroup immediately. Otherwise, spawn into
main service cgroup.
- In exec_invoke(), move into subgroup early if we don't need a cgroup namespace.
Otherwise, move into subgroup after we've unshared the cgroup namespace.
systemd-pcrlock requires support for the PolicyAuthorizeNV command,
which is not implemented in the first TPM2 releases. We also strictly
require SHA-256 support. Hence add a tool for checking for both of
these.
This is a tighter version of "systemd-analyze has-tpm2", that checks for
the precise feature that systemd-pcrlock needs, on top of basic TPM2
functionality.
Fixes: #37607
A new core was added to the test, but the loop counter was not increased
to wait for it, so the test races against systemd-coredump's processing.
This failed at least once in debci:
8015s [ 32.227813] TEST-87-AUX-UTILS-VM.sh[1038]: + coredumpctl info COREDUMP_TIMESTAMP=1679509902000000
8015s [ 32.228684] TEST-87-AUX-UTILS-VM.sh[1723]: No coredumps found.
Follow-up for 0c49e0049b
Fixes https://github.com/systemd/systemd/issues/37666
The kernel provides %d which is documented as
"dump mode—same as value returned by prctl(2) PR_GET_DUMPABLE".
We already query /proc/pid/auxv for this information, but unfortunately this
check is subject to a race, because the crashed process may be replaced by an
attacker before we read this data, for example replacing a SUID process that
was killed by a signal with another process that is not SUID, tricking us into
making the coredump of the original process readable by the attacker.
With this patch, we effectively add one more check to the list of conditions
that need be satisfied if we are to make the coredump accessible to the user.
Reportedy-by: Qualys Security Advisory <qsa@qualys.com>
In principle, %d might return a value other than 0, 1, or 2 in the future.
Thus, we accept those, but emit a notice.
We have already exposed device ID in the output of device ID in J
fields. Also sd_device_get_device_id() and sd_device_new_from_device_id()
are already public. Hence, making udevadm accept device IDs may be
useful.
With this change, as we save several data in /run/udev with device ID,
we can call udevadm something like the following:
```
udevadm info $(ls /run/udev/tags/uaccess)
```
Then, we can show all devices that has uaccess tag.
Add endpoint for listing boots. Output format mimics `journalctl
--list-boots -o json`, so it's a plain array containing index, boot ID
and timestamps of the first and last entry. Initial implementation
returns boots ordered starting with the current one and doesn't allow
any filtering (i.e. equivalent of --lines argument).
Fixes: #37573
This fixes the following warning:
```
/tmp/systemd/test/test-network/systemd-networkd-tests.py:5107: SyntaxWarning: invalid escape sequence '\.'
self.assertRegex(output, 'inet 10\.234\.77\.111/32.*dummy98')
```
Follow-up for 6479204e56.
iproute2 v6.15 fixed some rounding errors in the reported stats:
https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=d947f365602b30657d1b797e7464000d0ab88d5a
so the current regex doesn't work anymore. Fix it to check for both
old and new values.
systemd-networkd-tests.py[523]: FAIL: test_qdisc_tbf (__main__.NetworkdTCTests.test_qdisc_tbf)
systemd-networkd-tests.py[523]: ----------------------------------------------------------------------
systemd-networkd-tests.py[523]: Traceback (most recent call last):
systemd-networkd-tests.py[523]: File "/usr/lib/systemd/tests/testdata/test-network/systemd-networkd-tests.py", line 5402, in test_qdisc_tbf
systemd-networkd-tests.py[523]: self.assertRegex(output, 'rate 1Gbit burst 5000b peakrate 100Gbit minburst 987500b lat 70(.0)?ms')
systemd-networkd-tests.py[523]: ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
systemd-networkd-tests.py[523]: AssertionError: Regex didn't match: 'rate 1Gbit burst 5000b peakrate 100Gbit minburst 987500b lat 70(.0)?ms' not found in 'qdisc tbf 35: root refcnt 2 rate 1Gbit burst 5000b peakrate 100Gbit minburst 999200b lat 70ms \nqdisc pfifo 37: parent 35: limit 100000p'
per Daan's explanation:
other subtests running as testuser apparently use systemd-run --user
--machine testuser@.host which turns user tracking in logind into "by
pin" mode. when the last pinning session exits it terminates the user.
test_keep_configuration_on_restart() works, but the error printed is
misleading because self.assertNotEmpty() doesn't exist.
Add a working assert statement so, when the unmanaged interface is
altered, the test fails with a meaningful error, like:
### ip monitor dev unmanaged0 BEGIN
222:33::/64 proto kernel metric 256 pref medium
FAIL
[...]
Traceback (most recent call last):
File "/work/src/test/test-network/systemd-networkd-tests.py", line 5085, in test_keep_configuration_on_restart
self.assertEqual(line, '')
AssertionError: '222:33::/64 proto kernel metric 256 pref medium' != ''
- 222:33::/64 proto kernel metric 256 pref medium
While at it, strip the trailing newline so we can print easily the
string (and in future build more a robust regexp which uses the $ token)
This was broken in f45b801551. Unfortunately
the review does not talk about backward compatibility at all. There are
two places where it matters:
- During upgrades, the replacement of kernel.core_pattern is asynchronous.
For example, during rpm upgrades, it would be updated a post-transaction
file trigger. In other scenarios, the update might only happen after
reboot. We have a potentially long window where the old pattern is in
place. We need to capture coredumps during upgrades too.
- With --backtrace. The interface of --backtrace, in hindsight, is not
great. But there are users of --backtrace which were written to use
a specific set of arguments, and we can't just break compatiblity.
One example is systemd-coredump-python, but there are also reports of
users using --backtrace to generate coredump logs.
Thus, we require the original set of args, and will use the additional args if
found.
A test is added to verify that --backtrace works with and without the optional
args.
`ExtensionImages=` and `ExtensionDirectories=` now let you specify
vpick-named extensions; however, since they just get set up once when
the service is started, you can't see newer versions without restarting
the service entirely. Here, also reload confext extensions when you
reload a service. This allows you to deploy a new version of some
configuration and have it picked up at reload time without interruption
to your workload.
Right now, we would only reload confext extensions and leave the sysext
ones behind, since it didn't seem prudent to swap out what is likely
program code at reload. This is made possible by only going for the
`SYSTEMD_CONFEXT_HIERARCHIES` overlays (which only contains `/etc`).
This PR:
- Adjusts `service.c` to also refresh extensions when needed.
- Adds integration tests to check that a confext reload actually
occurred.
- Adds to the `systemd.exec` man pages to document this behavior.
This is a follow up to #24864 and #31364. Thank you to @bluca and
@goenkam for help in getting this up.
In TEST-50-DISSECT.dissect, this adds the following cases:
- testservice-50g: vpick extension in ExtensionDirectories
- testservice-50h: vpick extension in ExtensionImages
- testservice-50i: ExtensionDirectories + RootImage
- testservice-50j: ExtensionDirectories + RootDirectory
This integration test demonstrates that a containerized systemd instance can
write to a bind mounted file observable to the host. Specifically, the bash
script uses systemd-run to start a systemd instance as a transient unit
container. This systemd-run command bind mounts a directory the container will
share with the host, and runs an internal service which creates and writes to a
file from the container's view of this directory. When finished writing, the
service runs the exit target, terminating the internal systemd instance, and
ending the lifetime of the container.
The script waits for the container to finish running, then verifies that the
expected file contents were written on the host side of the filesystem mount.
This test employs a workaround, creating an unmasked procfs mount on the host
which enables the privileged guest to create its own mounts internally. This
may indicate a systemd bug, as the privileged container should not rely on
the existence of an unmasked procfs on the host in order to mount its own
filesystems internally.
Our baseline is v5.4 and cgroup v2 is enforced now,
which means CPU accounting is cheap everywhere without
requiring any controller, hence just remove the directive.