Encrypted disk not found during boot
Open, High, Public

Description

I upgrade and full-upgrade my Librem 15v3 every other day or so. The last two times I had the following problem when I tried to reboot after full-upgrade: I would not be shown a prompt to enter my disk encryption password. Switching to verbose (with ESC) I saw this:

WARNING: Failed to connect to lvmetad. Falling back to device scanning. 
Volume "crypt" not found. 
Cannot process volume group crypt. 
cryptsetup: Waiting for encrypted source device UUID=[some address]

Then after maybe 20 minutes:

[  569.180755] random: crng init done
	ALERT! encrypted source device UUID=[some address] does not exist, can't unlock sda5_crypt. 
	Check cryptopts=source= bootarg: cat /proc/cmdline
	or missing modules, devices: cat /proc/modules; ls /dev

(initramfs)

I rebooted several times and tried various kernel versions, with and without recovery mode. Then, on maybe the sixth attempt, I suddenly got a prompt to enter the encryption password (I think in recovery mode). After that everything seemed fine again, but I am afraid this will happen again when I reboot the system.

How to find the root of this problem, and how to fix it? Thanks a lot for any help!

EDIT: The content of /etc/crypttab is this:

sda5_crypt UUID=fd2a8340-c474-40e6-9be2-0e06e5762857 none luks

Nils created this task. Sep 16 2018, 1:03 PM
mladen triaged this task as "High" priority. Sep 17 2018, 2:53 PM
mladen assigned this task to mak.
Nils edited the task description. Oct 2 2018, 12:37 PM

I had this issue on the OEM version of PureOS 8 Beta 1, Librem 15 v3. I ended up doing a full reinstall of PureOS.

My suspicion is that it relates to LUKS and the swap partition... as the UUID I saw in errors was related to that.

It would not let me boot with any of the fixes mentioned for this or similar issues, and using a GParted Live (Debian) USB key didn't fix the situation, even after running fsck on all drives and reformatting swap.

This comment was removed by sean.obrien.
hansolo added a subscriber: hansolo. Edited Jan 10 2019, 10:11 AM

I have a brand-new Librem 13 and installed the newest available PureOS version. I updated it in full and now it doesn't boot in normal or recovery mode, so I suppose it will be quite easy to replicate this issue.

First EDIT: I use an NVMe drive (Samsung 970 Pro). Maybe this is the reason for the issue?

Second EDIT: Reinstalled PureOS and upgraded it with apt-get instead of apt and used dist-upgrade instead of upgrade as the parameter. It's working now.

This FAQ may help: https://gitlab.com/cryptsetup/cryptsetup/wikis/FrequentlyAskedQuestions

@sean.obrien It doesn't seem like a swap issue, since swap does not appear in /etc/crypttab. Checking /etc/fstab (the file system table) will identify all LUKS drives, or it should. Running ls /dev/mapper/ should also show all mapped encrypted devices.
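
One way to cross-check this is to compare the UUIDs listed in /etc/crypttab with the devices the system can currently see. The following is only a rough sketch: check_crypttab is a hypothetical helper name (not part of cryptsetup), and it assumes the usual crypttab layout (mapping name, source device, key file, options) and that visible UUIDs appear under /dev/disk/by-uuid.

```shell
#!/bin/sh
# Sketch only: report whether each UUID= source in a crypttab file is visible.
# check_crypttab is a hypothetical helper, not a cryptsetup command.
# Arguments (both optional): crypttab path, directory of by-uuid symlinks.
check_crypttab() {
    crypttab="${1:-/etc/crypttab}"
    uuid_dir="${2:-/dev/disk/by-uuid}"
    # Assumed field order: mapping name, source device, key file, options.
    # The "|| [ -n ... ]" part handles a final line without a trailing newline.
    while read -r name source _rest || [ -n "$name" ]; do
        case "$name" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        case "$source" in
            UUID=*)
                uuid="${source#UUID=}"
                if [ -e "$uuid_dir/$uuid" ]; then
                    echo "$name: $uuid present"
                else
                    echo "$name: $uuid MISSING"
                fi
                ;;
            *)
                echo "$name: source $source is not a UUID, skipped"
                ;;
        esac
    done < "$crypttab"
}
```

Calling check_crypttab with no arguments reads /etc/crypttab. An entry reported as MISSING would match the "Waiting for encrypted source device" symptom described at the top of this task, though it would not by itself explain why the device is absent.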

My suspicion is that your issue, Nils, is here:

[  569.180755] random: crng init done

If this is a new drive, or even if it has been refreshed, there may be a lack of entropy in the system at boot. The random number generator needs entropy for encryption operations, and that message shows it finishing its initialization. (crng is the kernel's cryptographic random number generator.) So your patient wait for it to initialize likely fixed the problem, and you should not have any issues upon reboot.

You can check how much entropy you have this way:

cat /proc/sys/kernel/random/entropy_avail
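
If you want to compare that number against a threshold, something like the sketch below might do. entropy_status is a hypothetical helper name, and the default threshold of 256 is an arbitrary assumption for illustration, not a value recommended by the kernel or by cryptsetup.

```shell
#!/bin/sh
# Sketch only: compare the kernel's reported entropy against a threshold.
# entropy_status is a hypothetical helper; the default threshold (256) is an
# arbitrary assumption, not an official recommendation.
# Arguments (both optional): file to read, threshold.
entropy_status() {
    entropy_file="${1:-/proc/sys/kernel/random/entropy_avail}"
    threshold="${2:-256}"
    avail=$(cat "$entropy_file") || return 1   # fail if the file is unreadable
    if [ "$avail" -lt "$threshold" ]; then
        echo "low ($avail bits, threshold $threshold)"
    else
        echo "ok ($avail bits, threshold $threshold)"
    fi
}
```

Running entropy_status with no arguments reads the same file as the cat command above.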

After recently upgrading my kernel I had a very similar issue. You basically roll the dice to see if the new kernel will actually boot. Sometimes it will. More often it won't; it will just hang forever with a message much like this one:

[ ***] A start job is running for /dev/mapper/luks-204dc5ca-8fbd-4b02-9833-3661ffd0c0aa (9min 38s / no limit)

I was able to verify that the UUID referenced above is my swap drive. Unfortunately contacting Purism support didn't yield any answers, other than the advice that I should reinstall PureOS from scratch.

Since it's just swap that's the problem, I tried running sudo swapoff -a followed by sudo swapon -a to clear it, though this only seemed to have a placebo effect. I also tried running sudo update-initramfs -u but that didn't help either. That latter command did output this warning message, though:

cryptsetup: WARNING: Resume target luks-204dc5ca-8fbd-4b02-9833-3661ffd0c0aa uses a key file

I'm not sure what the "key file" would be or if that's needed to fix the underlying issue. I'm wondering: is the swap drive really necessary? If I'm going to have to reinstall anyway, perhaps I should just partition it without a swap drive. That way I could avoid any future issues with swap. But I'm not aware of possible drawbacks.
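
For context on the "key file" in that warning: the third field of each /etc/crypttab line names a key file, and none there means a passphrase is prompted for instead. A minimal sketch for listing the entries that do use a key file could look like this. list_keyfile_entries is a hypothetical helper name, and treating "-" or an empty third field the same as none is an assumption based on the systemd crypttab format.

```shell
#!/bin/sh
# Sketch only: print crypttab entries whose third field names a key file.
# list_keyfile_entries is a hypothetical helper, not a cryptsetup command.
# Argument (optional): crypttab path.
list_keyfile_entries() {
    crypttab="${1:-/etc/crypttab}"
    # Assumed field order: mapping name, source device, key file, options.
    while read -r name source keyfile _opts || [ -n "$name" ]; do
        case "$name" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        case "$keyfile" in
            ''|none|-) ;;          # assumed to mean: prompt for a passphrase
            *) echo "$name uses key file $keyfile" ;;
        esac
    done < "$crypttab"
}
```

Calling list_keyfile_entries with no arguments reads /etc/crypttab. It only shows which mappings reference a key file; it does not say whether that is what blocks the boot.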

I really wish there was a solution besides just reinstalling the whole OS. I'm reaching out here as a final resort before I do that. Would posting the contents of my /etc/crypttab file be helpful? (Is that safe to post online or does it need to have any parts blacked out?) The fact that running a simple upgrade through normal channels can totally hose your system, requiring a re-image, is worrying. I wouldn't expect an OS to be that brittle.

Regarding swap: yes, you can run without swap. It can be nice to have, however, if you don't have a lot of memory.
Regarding cryptsetup: that message is a warning and not an error, so it shouldn't be preventing your boot.

It is extremely difficult to remotely diagnose what is happening on a given system without log data. If you were to run this command, it would help diagnose the issue:

systemd-analyze blame

Hi Jeremiah, I understand the difficulty, and would love to provide as much detail as I can. I've attached the output of that command, though (spoiler alert) I don't think it's terribly helpful: systemd-analyze-blame.txt

Awesome! Thanks @soapergem! Have you configured Exim (a mail transfer agent) to run on your laptop? If not, my recommendation would be to disable it. Disabling it just stops it from starting at boot, and you can do that this way:

sudo systemctl disable exim4.service

You will still have the Exim MTA on your system if needed. Of course there is no need to disable it if you're using it. You can paste log output (/var/log/exim4/mainlog and /var/log/exim4/paniclog) if you want to dig deeper into what is causing the Exim slowness. But disabling it will make your boot time *much* faster.

Thanks @jeremiah.foster. I had not configured Exim to run, so I've followed your instructions here to disable it. But back on topic: I think my boot time was already fast before I upgraded my kernel and ran into this issue where it just hangs forever while trying to boot. Disabling Exim definitely didn't fix that.

Unfortunately I have to say this has gone on long enough; my laptop has been severely debilitated by this issue since trying to upgrade the kernel. Having to mess with it and cross my fingers every time I boot just tells me PureOS isn't really ready for business. I'm going to switch to Ubuntu today.

I understand. My only comment is that Ubuntu's boot process is less free and overall protects your privacy a bit less, but I understand there is a trade-off between privacy and convenience that we all have to make personally.
