80 Commits

Author SHA1 Message Date
Lennart Poettering
7cb349f0ca loop-util: make clearer how LoopDevice objects that do not encapsulate an actual loopback device are set up 2022-09-01 22:06:19 +02:00
Lennart Poettering
4c1d50e65c loop-util: lock the control device around clearing the loopback device and deleting it
This mirrors what we already do during allocation. We lock the control
device first, and then release the block device and then delete it.

This makes things substantially more robust as long as all participants do
such locking: we won't attempt to delete a block device somebody else
already is using.
2022-09-01 22:06:19 +02:00
Lennart Poettering
87862cc2b4 loop-util: close lock fd before trying LOOP_CLR_FD in failure path
If the loopback device is open more than once LOOP_CLR_FD will fail,
hence close the lock fd first explicitly, so there's definitely only one
fd left.
2022-09-01 22:06:19 +02:00
Lennart Poettering
247738b4f5 loop-util: drop code to attach empty file
Back when I wrote this code I wasn't aware of BLKPG and what it can do.
Hence I came up with this hack to attach an empty file to delete all
partitions. But today we can do better with BLKPG: let's just explicitly
remove all partitions, and then try again.
2022-09-01 22:06:19 +02:00
Lennart Poettering
7f52206a2b loop-util: rework how we lock loopback block devices
Let's rework how we lock loopback block devices in two ways:

1. Lock a separate fd, instead of the main block device fd. We already
   did that for our internal locking when allocating loopback block
   devices, but do so for the exposed locking (i.e.
   loop_device_flock()), too, so that the lock is independent of the
   main fd we actually use for IO.

2. Instead of locking the device during allocation of the loopback
   device, then unlocking it (which will make udev run), and then
   re-locking things if we need, let's instead just keep the lock the
   whole time, to make things a bit safer and faster, and not have to
   wait for udev at all. This is done by adding a "lock_op" parameter to
   loop device allocation functions that declares the initial state of
   the lock, and is one of LOCK_UN/LOCK_SH/LOCK_EX. This change also
   shortens a lot of code, since we allocate + immediately lock loopback
   devices pretty much everywhere.
2022-09-01 22:05:32 +02:00
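Note on 7f52206a2b: the idea of locking a separate fd rather than the main IO fd can be sketched as follows. This is a minimal illustration, not the code from the commit. The helper name open_lock_fd() is hypothetical; only flock() and the LOCK_UN/LOCK_SH/LOCK_EX values come from the message above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open a second fd for the same node and take a BSD
 * lock on it, so the lock lives independently of the fd used for IO.
 * lock_op is LOCK_UN (no lock), LOCK_SH or LOCK_EX, optionally ORed with
 * LOCK_NB. Returns the lock fd on success, a negative errno on failure. */
int open_lock_fd(const char *path, int lock_op) {
        int fd;

        assert(path);

        fd = open(path, O_RDONLY|O_CLOEXEC|O_NONBLOCK|O_NOCTTY);
        if (fd < 0)
                return -errno;

        if ((lock_op & ~LOCK_NB) != LOCK_UN && flock(fd, lock_op) < 0) {
                int r = -errno;
                close(fd);
                return r;
        }

        return fd; /* closing this fd drops the lock */
}
```

Because BSD locks belong to the open file description, closing the lock fd releases the lock without touching the main fd, and the main fd can be closed or reopened without affecting the lock.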
Lennart Poettering
3a6ed1e19d loop-util: when clearing a loopback device delete partitions first, and take BSD lock
Whenever we release a loopback device, let's first synchronously delete
all partitions, so that we know that's complete and not done
asynchronously in the background. Take a BSD lock on the device while
doing so, so that udev won't make the devices busy while we do this.
2022-09-01 20:41:08 +02:00
Lennart Poettering
ff27ef4b59 loop: convert impossible EBADF cases into asserts 2022-09-01 20:40:01 +02:00
Lennart Poettering
ed13feff1e loop-util: use DEVNUM_FORMAT_STR more 2022-09-01 16:00:45 +02:00
Lennart Poettering
91e1ce1a7c loop-util: move resize partition ioctl call to blockdev-util.[ch]
The other BLKPG calls have wrappers in blockdev-util.[ch], let's place
them all there.

No change in behaviour.
2022-09-01 15:59:54 +02:00
Lennart Poettering
9d72a3cf70 loop: make 'Failed to configure loopback device' log message clearer
We print the very same log message for loopback block devices and for
loopback network devices. Let's better be clear what kind it is.
2022-08-29 15:15:08 +02:00
Yu Watanabe
ca8228295e tree-wide: use devpath_from_devnum() and device_open_from_devnum()
Fixes #24465.
2022-08-28 10:10:50 +09:00
Yu Watanabe
5c467ef4fb loop-util: use filter provided by sd_device_enumerator 2022-08-27 11:32:11 +00:00
Daan De Meyer
24d59aeed3 loop-util: Add loop_device_unrelinquish()
Allows taking ownership of a loop device which makes sure that
loop_device_unrefp() will try to destroy it when it runs.
2022-08-03 20:55:32 +02:00
Lennart Poettering
7176f06c9e basic: split out dev_t related calls into new devno-util.[ch]
No actual code changes, just splitting out of some dev_t handling
related calls from stat-util.[ch], they are quite a number already, and
deserve their own module now I think.

Also, try to settle on the name "devnum" as the name for the concept,
instead of "devno" or "dev" or "devid". "devnum" is the name exported in
udev APIs, hence probably best to stick to that. (this just renames a
few symbols to "devnum", local variables are left untouched, to make the
patch not too invasive)

No actual code changes.
2022-04-13 16:26:31 +02:00
Lennart Poettering
3b195f63fc loop-util: add debug message with details about acquired loopback device 2022-04-07 18:55:58 +02:00
Lennart Poettering
3e9210577d loop-util: explicitly close loopback block device before sleeping
attach_empty_file() takes a BSD file lock on the device, and we really
should release that before going to sleep. Hence explicitly close the
block device before the sleep instead of relying on _cleanup_ to close
it after the sleep.
2022-04-07 18:55:58 +02:00
Lennart Poettering
49043f8115 loop-util: use ERRNO_IS_DEVICE_ABSENT() macro where appropriate 2022-04-07 18:55:58 +02:00
Lennart Poettering
cc53046620 loop-util: take a LOCK_EX BSD file lock on control device while we acquire a loopback device 2022-04-07 18:55:58 +02:00
Lennart Poettering
7ffc7f3fcc loop-util: slightly rework device_has_block_children()
Let's match by devtype, i.e. the official way to distinguish "whole"
block devices from partitions.

Also add debug logging for devices we thus ignore.
2022-04-07 18:55:58 +02:00
Lennart Poettering
a145f8c06c loop-util: let's cut trailing whitespace, not trailing lines
This doesn't really make any real difference, given the file should only
contain a single line. But it's conceptually more correct to just remove
the trailing newline/whitespace than the whole lines coming after that.
i.e. if the file actually contains more lines than one, this should
probably be considered an error.
2022-04-07 18:55:58 +02:00
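Note on a145f8c06c: the difference between cutting trailing whitespace and cutting trailing lines can be shown with two small string helpers. Both function names are hypothetical and are not taken from the systemd tree; this is only a sketch of the distinction the message draws.

```c
#include <assert.h>
#include <string.h>

/* Remove trailing whitespace (including the final newline) in place.
 * Any earlier newline is kept, so a multi-line input stays detectable. */
char *strip_trailing_whitespace(char *s) {
        size_t n;

        assert(s);

        n = strlen(s);
        while (n > 0 && strchr(" \t\r\n", s[n-1]))
                n--;
        s[n] = 0;
        return s;
}

/* The other behaviour: cut at the first newline, silently dropping
 * every line that follows it. */
char *truncate_at_newline(char *s) {
        assert(s);

        s[strcspn(s, "\r\n")] = 0;
        return s;
}
```

With the first helper a caller can still check for an embedded newline afterwards and treat a multi-line file as an error; with the second that information is lost.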
Yu Watanabe
7e93a65868 fd-util: rename loop_get_diskseq() -> fd_get_diskseq()
And move it from loop-util.[ch] -> fd-util.[ch]
2022-04-01 15:13:18 +09:00
Yu Watanabe
2076612f84 basic/missing: move BLKGETDISKSEQ to missing_fs.h
As it is defined at linux/fs.h.
2022-04-01 15:13:18 +09:00
Lennart Poettering
7c248223eb tree-wide: use new RET_NERRNO() helper at various places 2021-11-16 08:04:09 +01:00
Lennart Poettering
d7654742ee loop-util: reopen device node if we shortcut loop device creation
The LoopDevice object supports a shortcut: if the backing fd we are
supposed to create a loopback device of refers to a
block device already then we'll use it as is – if we can – instead of
setting up an unnecessary loopback device that would be pretty much
the same as its backing device.

Previously, when doing this we'd just dup() the original backing fd and
use that. But that's problematic in case O_DIRECT was set on the fd,
since we'll keep that flag set on our copy too, which means we can't do
simple, regular IO on it anymore.

Thus, let's reopen the inode in this case with the exact access flags
we'd apply if we'd actually allocate and open a new loopback device.

Fixes: #21176
2021-11-05 07:08:16 +00:00
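Note on d7654742ee: dup() shares the open file description, and therefore the file status flags, with the original fd. Reopening gives a fresh description with flags of our choosing. One way to reopen an fd on Linux is via /proc/self/fd, sketched below. The helper name reopen_fd() is hypothetical, the sketch assumes /proc is mounted, and it is not claimed to be the exact code path the commit uses.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open the inode behind `fd` again with new flags.
 * Unlike dup(), the result does not inherit status flags such as
 * O_DIRECT or O_APPEND from the original fd.
 * Returns the new fd, or a negative errno. */
int reopen_fd(int fd, int flags) {
        char path[64];
        int new_fd;

        assert(fd >= 0);

        snprintf(path, sizeof path, "/proc/self/fd/%i", fd);

        new_fd = open(path, flags);
        if (new_fd < 0)
                return -errno;

        return new_fd;
}
```

A caller holding an fd opened with O_DIRECT could use reopen_fd(fd, O_RDWR|O_CLOEXEC) to obtain an fd suitable for regular buffered IO.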
Lennart Poettering
0193b93eb5 loop-util: call loop_device_make_internal() at the right place
The whole reason loop_device_make_internal() exists (as opposed to just
loop_device_make()) is to avoid mangling the loop flags value/call
getenv twice. Hence let's actually call it when we already mangled the
flags value.
2021-10-20 09:57:16 +02:00
Lennart Poettering
aa4d3aa3ef loop-util: add debug logging about O_RDWR vs. O_RDONLY + O_DIRECT mode
Once we managed to open the file let's log what we wanted and what we
got.
2021-10-20 09:56:20 +02:00
Lennart Poettering
bfd084454d loop-util: minor coding style updates
As suggested here: https://github.com/systemd/systemd/pull/21044#pullrequestreview-783530343
2021-10-20 09:55:33 +02:00
Lennart Poettering
b9a9748abc loop-util: work around cache invalidation bug in older kernels
Inspired by the discussions in #21003.

Inspired in particular by what Android apexd does:

https://android.googlesource.com/platform/system/apex/+/refs/heads/master/apexd/apexd_loop.cpp
2021-10-19 15:38:21 +02:00
Lennart Poettering
e8c7c4d9d1 loop-util: enable LO_FLAGS_DIRECT_IO by default on loopback devices
Fixes: #21003
2021-10-19 15:38:21 +02:00
Luca Boccassi
bcef1743a5 loop: parse and store disk sequence number
When loop devices are re-used, the disk sequence number is increased.
Parse it when creating a loop device and store it.
The kernel will never return DISKSEQ=0, so use it to signal that it's
not supported by the current kernel.
2021-07-28 19:59:38 +01:00
Lennart Poettering
8ede1e86b2 loop-util: track CLOCK_MONOTONIC timestamp immediately before attaching a loopback device
This is similar to the preceding work to store the uevent seqnum, but
this stores the CLOCK_MONOTONIC timestamp.

Why? This allows us to validate udev database entries, to determine if they
were created *after* we attached the device.

The uevent seqnum logic allows us to validate uevents, and the timestamp
database entries, hence together we should be able to validate both
sources of truth for us.

(note that this is all racy, just a bit less racy, since we cannot
atomically attach loopback devices and get the timestamp for it, the
same way we can't get the uevent seqnum. Thus it shortens the race
window, but doesn't close it).
2021-04-20 17:20:38 +02:00
Lennart Poettering
31c75fcc41 loop-util: read kernel's uevent seqnum right before attaching a loopback device
Later, this will allow us to ignore uevents from earlier attachments a
bit better, as we can compare uevent seqnums with this boundary. It's
not a full fix for the race though, since we cannot atomically determine
the uevent and attach the device, but it at least shortens the window a
bit.
2021-04-20 17:13:56 +02:00
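Note on 31c75fcc41: the kernel exposes its current uevent sequence number in /sys/kernel/uevent_seqnum. A rough sketch of reading such a value and using it as a boundary follows. The function names are hypothetical, and the path is passed in as a parameter so the helper can be tried on any file.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read one unsigned 64-bit decimal number from a
 * file, e.g. /sys/kernel/uevent_seqnum. Returns 0 or a negative errno. */
int read_seqnum_file(const char *path, uint64_t *ret) {
        FILE *f;
        int r = 0;

        assert(path);
        assert(ret);

        f = fopen(path, "r");
        if (!f)
                return -errno;

        if (fscanf(f, "%" SCNu64, ret) != 1)
                r = -EINVAL;

        fclose(f);
        return r;
}

/* An event whose seqnum is not above the boundary was generated before
 * the boundary was sampled, i.e. it belongs to an earlier attachment. */
bool uevent_is_stale(uint64_t event_seqnum, uint64_t boundary) {
        return event_seqnum <= boundary;
}
```

As the message says, sampling the boundary and attaching the device are still two separate steps, so this narrows the race window rather than closing it.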
Lennart Poettering
79e8393a6a loop-util: initialize .devno in loop_device_open() too 2021-04-20 17:12:39 +02:00
Lennart Poettering
b0dbffd868 loop-util: port to random_u64_range()
Doesn't matter, but it's a bit easier to read I'd claim.
2021-04-20 17:12:39 +02:00
Lennart Poettering
38bd449f96 loop-util: make loop_device_make() return fd in all code paths
Previously, loop_device_make() would return the device fd in one success
code path, but not the other (where we'd just return 0).
loop_device_open() returns it in all cases.

Hence, let's clean this up, and make sure in all success code paths of
both functions we return it (even though it strictly speaking is
redundant, since we return it in LoopDevice anyway, and currently no one
actually relies on this).
2021-04-20 17:12:39 +02:00
Lennart Poettering
f3859d5f55 loop-util: store device major/minor in LoopDevice object
Let's store this away. It's useful when matching up mounts (i.e.  struct
stat's .st_dev field) with loopback devices.
2021-04-19 23:16:02 +02:00
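Note on f3859d5f55: with the device number stored, matching a mount against a loopback device comes down to comparing it with struct stat's .st_dev for a path on that mount. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if `path` lives on the block device
 * identified by `devnum`, 0 if it does not, or a negative errno. */
int path_is_on_devnum(const char *path, dev_t devnum) {
        struct stat st;

        assert(path);

        if (stat(path, &st) < 0)
                return -errno;

        return st.st_dev == devnum;
}
```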
Yu Watanabe
273d76f4f8 tree-wide: update "that that" 2020-11-18 17:23:00 +09:00
Yu Watanabe
db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Yu Watanabe
377a9545e9 tree-wide: fix typos found by Fossies codespell report 2020-10-24 13:29:31 +02:00
Lennart Poettering
b202ec2068 loop-util: wait a random time before trying again
Let's try to make collisions when multiple clients want to use the same
device less likely, by sleeping a random time on collision.

The loop device allocation protocol is inherently collision prone:
first, a program asks which is the next free loop device, then it tries
to acquire it, in a separate, unsynchronized step. If many peers do this
all at the same time, they'll likely all collide when trying to
acquire the device, so that they need to ask for a free device again and
again.

Let's make this a little less prone to collisions, reducing the number
of failing attempts: whenever we notice a collision we'll now wait a
short and randomized time, making it more likely another peer succeeds.

(This also adds a similar logic when retrying LOOP_SET_STATUS64, but
with a slightly altered calculation, since there we definitely want to
wait a bit, under all cases)
2020-10-22 14:58:28 +02:00
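Note on b202ec2068: a randomized wait of this kind can be sketched as a minimum delay plus random jitter. The function names and the min/span parameters are assumptions made for illustration; the actual calculation in the commit is not reproduced here. A minimum of 0 corresponds to the collision case, a non-zero minimum to the case where we definitely want to wait a bit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/random.h>
#include <time.h>

/* Pick a delay in [min_usec, min_usec + span_usec). The modulo bias is
 * irrelevant for a backoff delay. If getrandom() fails we fall back to
 * zero jitter, i.e. the minimum delay. */
uint64_t backoff_usec(uint64_t min_usec, uint64_t span_usec) {
        uint64_t v = 0;

        assert(span_usec > 0);

        if (getrandom(&v, sizeof v, GRND_NONBLOCK) != (ssize_t) sizeof v)
                v = 0;

        return min_usec + v % span_usec;
}

/* Sleep for the given number of microseconds. Returns 0 or -errno. */
int sleep_usec(uint64_t usec) {
        struct timespec ts = {
                .tv_sec = (time_t) (usec / 1000000),
                .tv_nsec = (long) (usec % 1000000) * 1000,
        };

        return nanosleep(&ts, NULL) < 0 ? -errno : 0;
}
```

On a collision a caller would run something like sleep_usec(backoff_usec(0, 50000)) before asking for the next free device again. The 50 ms span is an arbitrary example value.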
Lennart Poettering
021bf17528 loop-util: if a loopback device we want to use still has partitions, do something about it
On current kernels (5.8 for example) under some conditions I don't fully
grok it might happen that a detached loopback block device still has
partition block devices around. Accessing these partition block devices
results in EIO errors (that also fill up dmesg). These devices cannot be
cleaned up with LOOP_CLR_FD (since the main device already is officially
detached), nor with LOOP_CTL_DELETE (returns EBUSY as long as the
partitions still exist). This is a kernel bug. But it appears to apply
to all recent kernels. I cannot really pin down what triggers this,
suffice to say our heavy-duty test can trigger it.

Either way, let's do something about it: when we notice this state we'll
attach an empty file to it, which is guaranteed to have no part table.
This makes the partitions go away. After closing/reopening the device we
hence are good to go again. Ugly workaround, but I think OK enough to
use.

The net result is: with this commit, we'll guarantee that by the time we
attach a file to the loopback device we have zero kernel partitions
associated with it. Thus if we then wait for the kernel partitions we
need to appear we should have entirely reliable behaviour even if
loopback devices by the name are heavily recycled and udev events reach
us very late.

Fixes: #16858
2020-10-22 14:58:27 +02:00
Lennart Poettering
95c5009248 loop-util: LOOP_CLR_FD is async, don't retry to reuse a device right after issuing it
When we fall back to classic LOOP_SET_FD logic in case LOOP_CONFIGURE
didn't work we issue LOOP_CLR_FD first. But that call turns out to be
potentially async in the kernel: if something else (let's say
udev/blkid) is accessing the device the ioctl just sets the autoclear
flag and exits. Hence quite often the LOOP_SET_FD will subsequently
fail. Let's avoid the trouble, and immediately exit with EBUSY if
LOOP_CONFIGURE fails, but remember that LOOP_CONFIGURE is not
available so that on the next iteration we go directly for LOOP_SET_FD
instead.
2020-10-22 14:58:27 +02:00
Lennart Poettering
738f29cb53 loop-util: handle EAGAIN on LOOP_SET_STATUS64
Since 5db470e229 (i.e. kernel 5.0) changing the .lo_offset field via
LOOP_SET_STATUS64 might result in EAGAIN. Let's handle that.

Fixes: #16858
2020-10-22 14:58:27 +02:00
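Note on 738f29cb53: handling EAGAIN typically means retrying the call a bounded number of times. The sketch below shows that pattern in generic form. All names are hypothetical, and demo_op() merely stands in for the real ioctl(fd, LOOP_SET_STATUS64, &info) call so the loop can be exercised without a loopback device.

```c
#include <assert.h>
#include <errno.h>

/* An operation returns 0 on success or a negative errno on failure. */
typedef int (*retry_op_t)(void *userdata);

/* Call `op` until it returns something other than -EAGAIN, at most
 * max_attempts times. Returns the result of the last call. */
int retry_on_eagain(retry_op_t op, void *userdata, unsigned max_attempts) {
        int r = -EAGAIN;

        assert(op);

        for (unsigned i = 0; i < max_attempts; i++) {
                r = op(userdata);
                if (r != -EAGAIN)
                        return r;

                /* A real caller would sleep briefly here before retrying. */
        }

        return r;
}

/* Stand-in operation: fails with -EAGAIN until the counter hits zero. */
int demo_op(void *userdata) {
        unsigned *remaining = userdata;

        if (*remaining > 0) {
                (*remaining)--;
                return -EAGAIN;
        }

        return 0;
}
```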
Lennart Poettering
77ad674b51 loop-util: apparently opening a loop device sometimes results in ENXIO, handle this 2020-09-25 16:03:05 +02:00
Lennart Poettering
0950526afd loop-util: use right flags field 2020-09-25 16:02:56 +02:00
Lennart Poettering
bb2551bdcb loop-util: LOOP_CONFIGURE ignores lo_sizelimit
It appears LOOP_CONFIGURE in 5.8 is even more broken than initially
thought: it doesn't properly propagate lo_sizelimit to the block device
layer. :-(

Let's hence check the block device size immediately after issuing
LOOP_CONFIGURE, and if it doesn't match what we just set let's fallback
to the old ioctls.

This means LOOP_CONFIGURE currently works correctly only for the most
simple case: no partition table logic and no size limit. Sad!

(Kernel people should really be told about the concepts of tests and
even CI, one day!)
2020-08-24 22:01:13 +02:00
Lennart Poettering
8dbc208cc1 loop-util: define API for syncing loopback device 2020-08-24 21:59:35 +02:00
Lennart Poettering
86c1c1f345 loop-util: use new LOOP_CONFIGURE ioctl
LOOP_CONFIGURE allows us to configure a loopback device in one ioctl
instead of two, which is not just faster but also removes the race that
udev might start probing the device before we adjusted things properly.

Unfortunately LOOP_CONFIGURE is broken in regards to LO_FLAGS_PARTSCAN
as of kernel 5.8.0. This patch contains a work-around for that, to
fallback to old behaviour if partition scanning is requested but does
not work. Sucks a bit.

Proposed upstream fix for that issue:

https://lkml.org/lkml/2020/8/6/97
2020-08-11 15:24:18 +02:00
Lennart Poettering
cae1e8fb88 loop-device: implicitly sync device on detach
Apparently, if IO is still in flight at the moment we invoke LOOP_CLR_FD
it is likely simply dropped (probably because yanking physical storage,
such as a USB stick would drop it too). Let's protect ourselves against
that and always sync explicitly before we invoke it.
2020-07-30 20:56:13 +02:00
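Note on cae1e8fb88: the ordering described above, sync first and detach second, can be sketched as follows. The function name is hypothetical and the sketch leaves out everything else a real detach path would do.

```c
#include <assert.h>
#include <errno.h>
#include <linux/loop.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: flush pending IO on the loopback device fd, then
 * detach its backing file. Returns 0 or a negative errno. If the sync
 * fails we do not detach, so no in-flight data is dropped silently. */
int loop_sync_and_clear(int fd) {
        assert(fd >= 0);

        if (fsync(fd) < 0)
                return -errno;

        if (ioctl(fd, LOOP_CLR_FD) < 0)
                return -errno;

        return 0;
}
```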
Lennart Poettering
b0a94268f8 core: when we cannot open an image file for write, try read-only
Closes: #14442
2020-01-09 11:18:06 +01:00