Let's hook up one more thing with credentials: the machine ID to use
when none is initialized yet.
This requires some reordering of initialization steps in PID 1: we need
to import credentials first, and only then initialize the machine ID.
This is just like read_credential() but also looks into the encrypted
credential directory, not just the regular one.
Normally, we decrypt credentials at the moment we pass them to services.
From service PoV all credentials are hence decrypted credentials.
However, when we want to access credentials in a generator this logic
does not apply: here we have the regular and the encrypted credentials
directory. So far we didn't attempt to make use of credentials in
generators hence.
Let's address and add helper that looks into both directories, and talks
to the TPM if necessary to decrypt the credentials.
When the credential dir is backed by an fs that supports ACLs we must be
more careful with adjusting the 'x' bit of the directory, as any chmod()
call on the dir will reset the mask entry of the ACL entirely which we
don't want. Hence, do a manual set of ACL changes, that only add/drop
the 'x' bit but otherwise leave the ACL as it is.
This matters if we use tmpfs rather than ramfs to store credentials.
Certain cards support to set their eswitch to switchdev mode. In this
mode for each created VF there is also created so called VF representor.
This representor is helper network interface used for configuration of
mentioned eswitch and belongs to an appropriate PF.
VF representors are identified by the specific value of phys_port_name
attribute and the value has format "pfMvfN" where M is PF function
number and N is VF number inside this PF.
As the VF representor interfaces belong to PF PCI device the naming
scheme used for them is the same like for other PCI devices. In this
case name of PF interface is used and phys_port_name suffix is appended.
E.g.
PF=enp65s0f0np0 # phys_port_name for PF interface is 'p0'
VF=enp65s0f0np0v0 # v0 is appended for VF0 in case of NAMING_SR_IOV_V
REP=enp65s0f0np0pf0vf0 # phys_port_name for VF0 representor is 'pf0vf0'
First as the phys_port_name for representors is long (6+ chars) then the
generated name does not fit into IFNAMSIZ so this name is used only as
alternate interface name and for the primary one is used generic one
like eth<N>. Second 'f0' and 'pf0' in REP name is redundant.
This patch fixes this issue by introducing another naming scheme for VF
representors and appending 'rN' suffix to PF interface name for them.
N is VF number so the name used for representor interface is similar to
VF interface and differs only by the suffix.
For the example above we get:
PF=enp65s0f0np0
VF=enp65s0f0np0v0
REP=enp65s0f0np0r0
This eases for userspace to determine which representor interface
represents particular VF.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Fine-tune the decoding of mount options in mount_verbose_full() to
provide more helpful log output:
1. decode changing of propagation changes
2. discern changing of superblock flags/mount option string from mount
flags
3. don't check secondary fields when deciding which mount op is
executed, only the flags decide that.
It just gives names for things generally just handled as numeric
indexes, hence drop the type name, and make the enum anonymous. Nothing
is using the type name anyway.
There's no need to conditionalize this.
Setting resume_offset=0 doesn't harm, and can even help
by overriding potentially existing half-written settings.
So far we relied on tmpfiles.d to copy tpm2-pcr-signature.json from
/.extra/ into /run/systemd/. This is racy however if cryptsetup runs too
early, and we cannot unconditionally run it after tmpfiles completed.
hence, let's teach cryptsetup to directly look for the file in /.extra/,
in order to simplify this, and remove the race. But do so only in the
initrd (as only there /.extra/ is a concept).
We generally prefer looking in /run/systemd/, since things are under
user control then. In the regular system we exclusively want that
userspace looks there.
Fixes: #26490
sometimes its useful to only run a specific test (or multiple) instead
of all implemented in a test. Allow the test name(s) to be specified on the
in a $TESTFUNCS env var, separated by colons.
Let's restrict how we apply credential globbing in ImportCredential=, so
that we have some flexibility in automatically extending the glob
expression with per-instance data eventually without getting into
conflict with the globbing parts.
In our current uses we only allow globbing at the end of the expression,
and this is a new, unreleased feature hence let's be restrictive on this
initially. We can still relax this later if we feel the need to after
all.
Fixes: #28022
We use usec_t for storing time value, which is 64bit.
However, usleep() takes useconds_t that is (typically?) 32bit.
Also, usleep() may only support the range [0, 1000000].
This introduce usleep_safe() which takes usec_t.
I figure these messages are rather unnecessary, so let the user quiet
them with the existing --quiet flag if desired. Makes systemd-analyze
condition a little more ergonomic in scripts.
Indeed when iterating over all the PT_LOAD segment of the core dump
while trying to look for the elf headers of a given module, we iterate
over them all and try to use the first one for which we can parse a
package metadata, but the start address is never taken into account,
so absolutely nothing guarantees we actually parse the right ELF header
of the right module we are currently iterating on.
This was tested like this:
- Create a core dump using sleep on a fedora 37 container, with an
explicit LD_PRELOAD of a library having a valid package metadata:
podman run -t -i --rm -v $(pwd):$(pwd) -w $(pwd) fedora:37 bash -x -c \
'LD_PRELOAD=libreadline.so.8 sleep 1000 & SLEEP_PID="$!" && sleep 1 && kill -11 "${SLEEP_PID}" && mv "core.${SLEEP_PID}" the-core'
- Then from a fedora 38 container with systemd installed, the resulting
core dump has been passed to systemd-coredump with and without this
patch. Without this patch, we get:
Module /usr/bin/sleep from rpm bash-5.2.15-3.fc38.x86_64
Module /usr/lib64/libtinfo.so.6.3 from rpm coreutils-9.1-8.fc37.x86_64
Module /usr/lib64/libc.so.6 from rpm coreutils-9.1-8.fc37.x86_64
Module /usr/lib64/libreadline.so.8.2 from rpm coreutils-9.1-8.fc37.x86_64
Module /usr/lib64/ld-linux-x86-64.so.2 from rpm coreutils-9.1-8.fc37.x86_64
While with this patch we get:
Module /usr/bin/sleep from rpm bash-5.2.15-3.fc38.x86_64
Module /usr/lib64/libtinfo.so.6.3 from rpm ncurses-6.3-5.20220501.fc37.x86_64
Module /usr/lib64/libreadline.so.8.2 from rpm readline-8.2-2.fc37.x86_64
So the parsed package metadata reported by systemd-coredump when the module
files are not found on the host (ie the case of crash inside a container) are
now correct. The inconsistency of the first module in the above example
(sleep is indeed not provided by the bash package) can be ignored as it
is a consequence of how this was tested.
In addition to this, this also fixes the performance issue of
systemd-coredump in case of the crashing process uses a large number of
shared libraries and having no package metadata, as reported in
https://sourceware.org/pipermail/elfutils-devel/2023q2/006225.html.
This setting allows services to run in an ephemeral copy of the root
directory or root image. To make sure the ephemeral copies are always
cleaned up, we add a tmpfiles snippet to unconditionally clean up
/var/lib/systemd/ephemeral. To prevent in use ephemeral copies from
being cleaned up by tmpfiles, we use the newly added COPY_LOCK_BSD
and BTRFS_SNAPSHOT_LOCK_BSD flags to take a BSD lock on the ephemeral
copies which instruct tmpfiles to not touch those ephemeral copies as
long as the BSD lock is held.
The file long ceased to be exclusively about configuration of the sleep
operation. It contains many many calls for other purposes, hence give it
a more generic name.
We need this in a single function only, hence move it there, and make it
a static field so that it has local scope.
While we are at it, rename s/readsize to buf/bufsize, to make
relationship clear. In particular as the data read is actually binary
and "s" hence a misnomer, since it suggests it was a string.
When making ephemeral snapshots of subvolumes whose cleanup depends on
whether they're locked or not, it's necessary to have the lock from the
very beginning, so let's support that with a new BTRFS_SNAPSHOT_LOCK_BSD
flag.
This has been badly named given the path doesn't refer to a device quite
likely, but to a path to a regular file. Hence let's be more precise
with naming.
(.device kinda suggests this was an sd_device object of sorts, but it
really isn't.)
We usually call dev_t entities "devnum" or "devno". That's redundant
enough, let's not call this "device_id". In particular as that's
something else (in udev context).