So Linux has this (insane — in my opinion) "feature" that if you name a
network interface "foo%d" then it will automatically look for the
interface starting with "foo…" with the lowest number that is not used
yet and allocates that.
We should never clash with this "magic" handling of ifnames, hence
refuse this, since otherwise we never know what the name is we end up
with.
We should probably switch things from a deny list to an allow list
sooner or later and be much stricter. Since the kernel directly enforces
only very few rules on the names, we'd need to do some research what is
safe and what is not first, though.
Numeric ifnames should be acceptable only if that's enabled by flag, and
refused otherwise. Hence, let's parse as ifindex first, and if that
works decide. Finally, let's refuse any numeric ifnames that are not
valid ifindexs, but look like them.
A variety of sockopts exist both for IPv4 and IPv6 but require a
different pair of sockopt level/option number. Let's add helpers for
these that internally determine the right sockopt to call.
This should shorten code that generically wants to support both ipv4 +
ipv6 and for the first time adds correct support for some cases where we
only called the ipv4 versions, and not the ipv6 options.
We would set .type to a fake value. All real callers (outside of tests)
immediately overwrite .type with a proper value after calling
socket_address_parse(). So let's not set it and adjust the few places
that relied on it being set to the fake value.
socket_address_parse() is modernized to only set the output argument on
success.
If the interface scope is specified, this changes the meaning of the address
quite significantly. Let's show the IPv6 scope_id if present.
Sadly we don't even have a test for sockaddr_pretty() output :(
This will be implicitly tested through socket_address_parse() later on.
The commit 10ce2e0681 inverts the order of
SO_{RCV,SND}BUFFORCE and SO_{RCV,SND}BUF. However, setting buffer size with
SO_{RCV,SND}BUF does not fail even if the requested size is larger than
the kernel limit. Hence, SO_{RCV,SND}BUFFORCE will not use anymore and
the buffer size is always limited by the kernel limit even if we have
the priviledge to ignore the limit.
This makes the buffer size is checked after configuring it with
SO_{RCV,SND}BUF, and if it is still not sufficient, then try to set it
with FORCE command. With this commit, if we have enough priviledge, the
requested buffer size is correctly set.
Hopefully fixes#14417.
Prompted by the discussion on #16110, let's migrate more code to
fd_wait_for_event().
This only leaves 7 places where we call into poll()/poll() directly in
our entire codebase. (one of which is fd_wait_for_event() itself)
poll() sets POLLNVAL inside of the poll structures if an invalid fd is
passed. So far we generally didn't check for that, thus not taking
notice of the error. Given that this specific kind of error is generally
indication of a programming error, and given that our code is embedded
into our projects via NSS or because people link against our library,
let's explicitly check for this and convert it to EBADF.
(I ran into a busy loop because of this missing check when some of my
test code accidentally closed an fd it shouldn't close, so this is a
real thing)
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
Let's be extra careful whenever we return from recvmsg() and see
MSG_CTRUNC set. This generally means we ran into a programming error, as
we didn't size the control buffer large enough. It's an error condition
we should at least log about, or propagate up. Hence do that.
This is particularly important when receiving fds, since for those the
control data can be of any size. In particular on stream sockets that's
nasty, because if we miss an fd because of control data truncation we
cannot recover, we might not even realize that we are one off.
(Also, when failing early, if there's any chance the socket might be
AF_UNIX let's close all received fds, all the time. We got this right
most of the time, but there were a few cases missing. God, UNIX is hard
to use)
In subsequent commits, calls to if_nametoindex() will be replaced by a wrapper
that falls back to alternative name resolution over netlink. netlink support
requires libsystemd (for sd-netlink), and we don't want to add any functions
that require netlink in basic/. So stuff that calls if_nametoindex() for user
supplied interface names, and everything that depends on that, needs to be
moved.
So apparently there are two reasons why accept() can return EOPNOTSUPP:
because the socket is not a listening stream socket (or similar), or
because the incoming TCP connection for some reason wasn't acceptable to
the host. THe latter should be a transient error, as suggested on
accept(2). The former however should be considered fatal for
flush_accept(). Let's fix this by explicitly checking whether the socket
is a listening socket beforehand.
socket_bind_to_ifindex() uses the the SO_BINDTOIFINDEX sockopt of kernel
5.0, with a fallback to SO_BINDTODEVICE on older kernels.
socket_bind_to_ifname() is a trivial wrapper around SO_BINDTODEVICE, the
only benefit of using it instead of SO_BINDTODEVICE directly is that it
determines the size of the interface name properly so that it also works
for unbinding. Moreover, it's an attempt to unify our invocations of the
sockopt with a size of strlen(ifname) rather than strlen(ifname)+1...
Linux is stupid and sometimes returns a "struct sockaddr_un" that is
longer than its fields, as it NUL terminates .sun_path[] even if it has
full length. ubsan detects this, rightfully. Since this is a Linux
misdesign let's trick out ubsan a bit.
Fixes: #11024