/dev/urandom is seeded with RDRAND. Calling genuine_random_bytes(...,
..., 0) will use /dev/urandom as a last resort. Hence, we gain nothing
here by having our own RDRAND wrapper, because /dev/urandom already is
based on RDRAND output, even before /dev/urandom has fully initialized.
Furthermore, RDRAND is not actually fast! And on each successive
generation of new x86 CPUs, from both AMD and Intel, it just gets
slower.
This commit simplifies things by just using /dev/urandom in cases where
we before might use RDRAND, since /dev/urandom will always have RDRAND
mixed in as part of it.
And above where I say "/dev/urandom", what I actually mean is
GRND_INSECURE, which is the same thing but won't generate warnings in
dmesg.
RANDOM_BLOCK has existed for a long time, but RANDOM_ALLOW_INSECURE was
added more recently, leading to an awkward relationship between the two.
It turns out that only one, RANDOM_BLOCK, is needed.
RANDOM_BLOCK means return cryptographically secure numbers no matter
what. If it's not set, it means try to do that, but if it fails, fall
back to using unseeded randomness.
This part of falling back to unseeded randomness is the intent of
GRND_INSECURE, which is what RANDOM_ALLOW_INSECURE previously aliased.
Rather than having an additional flag for that, it makes more sense to
just use it whenever RANDOM_BLOCK is not set. This saves us the overhead
of having to open up /dev/urandom.
Additionally, when getrandom returns too little data, but not zero data,
we currently fall back to using /dev/urandom if RANDOM_BLOCK is not set.
This doesn't quite make sense, because if getrandom returned seeded data
once, then it will forever after return the same thing as whatever
/dev/urandom does. So in that case, we should just loop again.
Since there's never really a time where /dev/urandom is able to return
some easily but more with difficulty, we can also get rid of
RANDOM_EXTEND_WITH_PSEUDO. Once the RNG is initialized, bytes
should just flow normally.
This also makes RANDOM_MAY_FAIL obsolete, because the only case this ran
was where we'd fall back to /dev/urandom on old kernels and return
GRND_INSECURE bytes on new kernels. So also get rid of that flag.
Finally, since we're always able to use GRND_INSECURE on newer kernels,
and we only fall back to /dev/urandom on older kernels, also only fall
back to using RDRAND on those older kernels. There, the only reason to
have RDRAND is to avoid a kmsg entry about unseeded randomness.
The result of this commit is that we now cascade like this:
- Use getrandom(0) if RANDOM_BLOCK.
- Use getrandom(GRND_INSECURE) if !RANDOM_BLOCK.
- Use /dev/urandom if !RANDOM_BLOCK and no GRND_INSECURE support.
- Use /dev/urandom if no getrandom() support.
- Use RDRAND if we would use /dev/urandom for any of the above reasons
and RANDOM_ALLOW_RDRAND is set.
kernel 5.6 added support for a new flag for getrandom(): GRND_INSECURE.
If we set it we can get some random data out of the kernel random pool,
even if it is not yet initializated. This is great for us to initialize
hash table seeds and such, where it is OK if they are crap initially. We
used RDRAND for these cases so far, but RDRAND is only available on
newer CPUs and some archs. Let's now use GRND_INSECURE for these cases
as well, which means we won't needlessly delay boot anymore even on
archs/CPUs that do not have RDRAND.
Of course we never set this flag when generating crypto keys or uuids.
Which makes it different from RDRAND for us (and is the reason I think
we should keep explicit RDRAND support in): RDRAND we don't trust enough
for crypto keys. But we do trust it enough for UUIDs.
The old flag name was a bit of a misnomer, as /dev/urandom cannot be
"drained". Once it's initialized it's initialized and then is good
forever. (Only /dev/random has a concept of 'draining', but we never use
that, as it's an obsolete interface).
The flag is still useful though, since it allows us to suppress accesses
to the random pool while it is not initialized, as that trips up the
kernel and it logs about any such attempts, which we really don't want.
Rename rdrand64 to rdrand, and switch from uint64_t to unsigned long.
This produces code that will compile/assemble on both x86-64 and x86-32.
This could be useful when running a 32-bit copy of systemd on a modern
Intel processor.
RDRAND is inherently arch-specific, so relying on the compiler-defined
'long' type seems reasonable.
We only use this when we don't require the best randomness. The primary
usecase for this is UUID generation, as this means we don't drain
randomness from the kernel pool for them. Since UUIDs are usually not
secrets RDRAND should be goot enough for them to avoid real-life
collisions.
Originally, the high_quality_required boolean argument controlled two
things: whether to extend any random data we successfully read with
pseudo-random data, and whether to return -ENODATA if we couldn't read
any data at all.
The boolean got replaced by RANDOM_EXTEND_WITH_PSEUDO, but this name
doesn't really cover the second part nicely. Moreover hiding both
changes of behaviour under a single flag is confusing. Hence, let's
split this part off under a new flag, and use it from random_bytes().
It's more descriptive, since we also have a function random_bytes()
which sounds very similar.
Also rename pseudorandom_bytes() to pseudo_random_bytes(). This way the
two functions are nicely systematic, one returning genuine random bytes
and the other pseudo random ones.
Pretty much all intel cpus have had RDRAND in a long time. While
CPU-internal RNG are widely not trusted, for seeding hash tables it's
perfectly OK to use: we don't high quality entropy in that case, hence
let's use it.
This is only hooked up with 'high_quality_required' is false. If we
require high quality entropy the kernel is the only source we should
use.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
During early boot, we'd call getrandom(), and immediately fall back to
reading from /dev/urandom unless we got the full requested number of bytes.
Those two sources are the same, so the most likely result is /dev/urandom
producing some pseudorandom numbers for us, complaining widely on the way.
Let's change our behaviour to be more conservative:
- if the numbers are only used to initialize a hash table, a short read is OK,
we don't really care if we get the first part of the seed truly random and
then some pseudorandom bytes. So just do that and return "success".
- if getrandom() returns -EAGAIN, fall back to rand() instead of querying
/dev/urandom again.
The idea with those two changes is to avoid generating a warning about
reading from an /dev/urandom when the kernel doesn't have enough entropy.
- only in the cases where we really need to make the best effort possible
(sd_id128_randomize and firstboot password hashing), fall back to
/dev/urandom.
When calling getrandom(), drop the checks whether the argument fits in an int —
getrandom() should do that for us already, and we call it with small arguments
only anyway.
Note that this does not really change the (relatively high) number of random
bytes we request from the kernel. On my laptop, during boot, PID 1 and all
other processes using this code through libsystemd request:
74780 bytes with high_quality_required == false
464 bytes with high_quality_required == true
and it does not eliminate reads from /dev/urandom completely. If the kernel was
short on entropy and getrandom() would fail, we would fall back to /dev/urandom
for those 464 bytes.
When falling back to /dev/urandom, don't lose the short read we already got,
and just read the remaining bytes.
If getrandom() syscall is not available, we fall back to /dev/urandom same
as before.
Fixes#4167 (possibly partially, let's see).
The only implementation that we care about — glibc — provides us
with 31 bits of entropy. Let's use 24 bits of that, instead of throwing
all but 8 away.