Although systemd is stable enough for modern Linux systems, sysvinit is still widely used, especially on embedded systems. The parallel nature of systemd has its upsides, such as faster boot times, but sometimes we simply need more control over the order in which services are initialized, and sysvinit gives us that.
It is common practice in embedded Linux development to use sysvinit instead of systemd, and we did the same.
After updating OpenSSL from 1.0.2 to 3.0.0-alpha12, we ran into problems during boot. The system simply hung while starting OpenSSL-related services such as dbus, bind, syslog-ng, sshd and lighttpd.
After disabling all OpenSSL-related services (by manually editing the init.d scripts and putting `exit 0` at the top, before they actually do anything), we finally got a working login prompt. If we then tried to start, for example, the SSH daemon, it would hang. With strace we can see where it happened:
…
brk(0xaaab14c87000) = 0xaaab14c87000
brk(0xaaab14caf000) = 0xaaab14caf000
brk(0xaaab14cd7000) = 0xaaab14cd7000
brk(0xaaab14cf8000) = 0xaaab14cf8000
brk(0xaaab14d19000) = 0xaaab14d19000
getrandom(
So, that’s a getrandom call.
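To check from the login prompt whether a call like this would block, without actually hanging the shell, one option is to issue the same system call in non-blocking mode. The following is only a minimal sketch, assuming a Linux target with Python 3.6 or newer available; the helper name `crng_ready` is ours and not part of any tool mentioned here.

```python
import os


def crng_ready() -> bool:
    """Return True if getrandom() can hand out bytes without blocking."""
    try:
        # GRND_NONBLOCK makes the kernel fail immediately instead of waiting.
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        # The kernel has not collected enough entropy yet; a blocking
        # getrandom() call (like the one in the strace output) would hang.
        return False


if __name__ == "__main__":
    print("getrandom() would return immediately:", crng_ready())
```

If this prints `False` right after boot, any service that seeds its random generator through a blocking `getrandom` call can be expected to hang in the same way.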
The investigation led us to multiple posts all over the internet about the same topic: performance issues and hangs related to entropy with the new OpenSSL 1.1:
https://github.com/openssl/openssl/issues/10463
https://github.com/openssl/openssl/issues/9078
Let's start with a short description of the random number generators in Linux. Usually, we have `/dev/random` and `/dev/urandom`. `/dev/random` generates random numbers from an entropy pool (the pool is gathered from device noise, mouse movements and so on, and it fills very slowly), and it traditionally blocks while the pool does not hold enough entropy. These random numbers can be used for generating SSH keys and the like, because they are backed by real entropy. `/dev/urandom`, on the other hand, never blocks: it returns pseudo-random numbers even when not enough entropy has been gathered yet, so it is fast, but early in boot its output can be predictable.
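The difference can be observed directly on the target. The sketch below is an illustration under assumptions: it reads the kernel's entropy estimate from `/proc/sys/kernel/random/entropy_avail` and pulls a few bytes from `/dev/urandom`. The function names are hypothetical, and how the estimate behaves differs between kernel versions, so treat the number as a rough indicator only.

```python
from pathlib import Path

# Standard procfs location of the kernel's entropy estimate (in bits).
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def entropy_bits(path: Path = ENTROPY_AVAIL) -> int:
    """Return the kernel's current entropy estimate in bits."""
    return int(path.read_text().strip())


def urandom_bytes(n: int, device: str = "/dev/urandom") -> bytes:
    """Read n bytes straight from the device node.

    Reading /dev/urandom returns immediately, even early in boot.
    """
    with open(device, "rb", buffering=0) as dev:
        return dev.read(n)


if __name__ == "__main__":
    print("entropy estimate (bits):", entropy_bits())
    print("16 bytes from /dev/urandom:", urandom_bytes(16).hex())
```

Running this right after boot, before and after rngd has been started, gives a rough picture of how quickly the pool fills on a given board.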
Long story short, since OpenSSL 1.1 there is a special mechanism that checks entropy to ensure safety. By default it seeds from the kernel's blocking random source (the `getrandom` call in the strace output above), and that does not return until the kernel has collected enough entropy. At OpenSSL configure time it is possible to switch to `/dev/urandom`, but this is a very bad idea: on a device that boots with almost no entropy, the output can be predictable or even repeat from boot to boot, which is not safe at all. Our image includes rng-tools, which is used to increase entropy, but its daemon was never started, because the OpenSSL-related services ahead of it in the boot order hung first. With systemd it should work (despite some performance issues), since services are started in parallel.
So, in our case, the solution was to change the priority of rngd (rng-tools) so that it starts before dbus and the others. After this small change, we have enough entropy to start the OpenSSL-related services as intended.
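For readers less familiar with sysvinit: the rc script starts the `S<NN><name>` links in a runlevel directory in lexical order, so the two-digit number is effectively the priority. The sketch below models that ordering to show why lowering the number for rng-tools fixes the hang. The link names and numbers are made up for illustration and are not taken from our image.

```python
import re

# Start links look like "S<two digits><service name>", e.g. "S02dbus-1".
LINK_RE = re.compile(r"^S(\d{2})(.+)$")


def start_order(link_names):
    """Return service names in the order rc would start them (lexical sort)."""
    starts = sorted(name for name in link_names if LINK_RE.match(name))
    return [LINK_RE.match(name).group(2) for name in starts]


def starts_before(link_names, first, later):
    """Check that `first` is started before every service listed in `later`."""
    order = start_order(link_names)
    return all(order.index(first) < order.index(svc) for svc in later)


# Hypothetical runlevel contents before and after changing the priority.
before_fix = ["S02dbus-1", "S09sshd", "S20syslog", "S30rng-tools", "K20sshd"]
after_fix = ["S01rng-tools", "S02dbus-1", "S09sshd", "S20syslog", "K20sshd"]

if __name__ == "__main__":
    needs_entropy = ["dbus-1", "sshd", "syslog"]
    print("before:", starts_before(before_fix, "rng-tools", needs_entropy))
    print("after: ", starts_before(after_fix, "rng-tools", needs_entropy))
```

In the "before" layout, rng-tools sits behind the services that need entropy, so the boot never reaches it; in the "after" layout it runs first and the pool is filled before anything asks for random bytes.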