With EXTRACT_UNESCAPE_SEPARATORS, backslash is used to escape the separator.
But it wasn't possible to insert the backslash itself. Let's allow this and
add test.
A test for stripping of escaped backslashes without any flags was explicitly
added back in 4034a06ddb. So it seems to be on
purpose, though I would say that this is at least surprising and hence deserves
a comment.
In test-extract-word, add tests for standalone EXTRACT_UNESCAPE_SEPARATORS.
Only behaviour combined with EXTRACT_CUNESCAPE was tested.
So Linux has this (insane — in my opinion) "feature" that if you name a
network interface "foo%d" then it will automatically look for the
interface starting with "foo…" with the lowest number that is not used
yet and allocates that.
We should never clash with this "magic" handling of ifnames, hence
refuse this, since otherwise we never know what the name is we end up
with.
We should probably switch things from a deny list to an allow list
sooner or later and be much stricter. Since the kernel directly enforces
only very few rules on the names, we'd need to do some research what is
safe and what is not first, though.
This fixes two checks where we compare string sizes when validating with
FILENAME_MAX. In both cases the check apparently wants to check if the
name fits in a filename, but that's not actually what FILENAME_MAX can
be used for, as it — in contrast to what the name suggests — actually
encodes the maximum length of a path.
In both cases the stricter change doesn't actually change much, but the
use of FILENAME_MAX is still misleading and typically wrong.
It's so similar to copy_access(), hence let's move it over and rename it
in similar style to the rest of the functions.
No change in behaviour, just moving things over.
REMOVE_CHMOD is necessary to remove files/dirs that are owned by us but
have an access mode that would not allow us to remove them. In generic
destructor calls for use with `_cleanup_` that are "fire-and-forget"
style we should make use of that, to maximize the chance we can actually
remove the files/dirs.
(Also, add in REMOVE_MISSING_OK. Just because prettier, we ignore the
return codes anyway, but it' a bit nicer to ignore a bit fewer errors.)
Let's fine-tune the path_extract_filename() interface: on succes return
O_DIRECTORY as indicator that the input path was slash-suffixed, and
regular 0 otherwise. This is useful since in many cases it is useful to
filter out paths that must refer to dirs early on.
I opted for O_DIRECTORY instead of the following other ideas:
1. return -EISDIR: I think the function should return an extracted
filename even when referring to an obvious dir, so this is not an
option.
2. S_ISDIR, this was a strong contender, but I think O_DIRECTORY is a
tiny bit nicer since quite likely we will go on and open the thing,
maybe with openat(), and hence it's quite nice to be able to OR in
the return value into the flags argument of openat().
3. A new enum defined with two values "dont-know" and
"definitely-directory". But I figured this was unnecessary, given we
have other options too, that reuse existing definitions for very
similar purposes.
These two together are a lot like dirname() + basename() but have the
benefit that they return clear errors when one passes a special case
path to them where the extraction doesn't make sense, i.e. "", "/",
"foo", "foo/" and so on.
Sooner or later we should probably port all our uses of
dirname()/basename() over to this, to catch these special cases more
safely.