Bug 24908 - Installer hangs at end of bootloader install step on some Lenovo machines (BIOS bug when writing to EFI NVRAM?)
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: Installer
Version: 7
Hardware: x86_64 Linux
Priority: release_blocker major
Target Milestone: ---
Assignee: ISO building group
QA Contact:
URL:
Whiteboard:
Keywords: NEEDINFO
Depends on:
Blocks:
 
Reported: 2019-06-05 17:29 CEST by Nikato Muirhead
Modified: 2019-06-08 12:47 CEST
CC List: 7 users

See Also:
Source RPM:
CVE:
Status comment:


Attachments
Contents of journal.txt after fresh install (43.75 KB, application/x-xz)
2019-06-06 06:30 CEST, Nikato Muirhead
install.log - fresh install (137.36 KB, text/plain)
2019-06-07 04:27 CEST, Nikato Muirhead
Journal.log fresh install (464.86 KB, text/plain)
2019-06-07 04:28 CEST, Nikato Muirhead
Script to test writing the rEFInd PreviousBoot NVRAM variable (1.09 KB, text/plain)
2019-06-08 12:47 CEST, Martin Whitaker

Description Nikato Muirhead 2019-06-05 17:29:17 CEST
When installing the rEFInd bootloader, the install hangs.


Version-Release number of selected component (if applicable):


How reproducible: always


Steps to Reproduce:
1. Select rEFInd as the bootloader (EFI install).
2. Proceed to the end of the install.
3. A message appears saying that the bootloader is being installed.
The installer stays stuck at that message.
Restarting the computer shows that the bootloader did in fact install.
Comment 1 Marja Van Waes 2019-06-05 18:58:11 CEST
Hi Nikato,

Which iso did you use to install from, one of the Lives (which one?) or which classical iso?

If it was one of the Lives, then please attach journal.txt that is the result of running in that installed system, as root:

   journalctl -ab1 > journal.txt

(Compress with xz if it's too large to attach)
((Note that on mga, you can compress it further by using "xz -9 --text"))

If it was a classical install, then please attach

   /root/drakx/report.bug.xz

from that installed system.

Thanks :-)

Keywords: (none) => NEEDINFO
Summary: Installer never confirms that bootloader install completed. => Installer never confirms that (REFIND) bootloader install completed.
CC: (none) => marja11
Assignee: bugsquad => isobuild

Comment 2 Nikato Muirhead 2019-06-06 05:21:31 CEST
First I installed Mageia 7rc on an empty hard drive. The bootloader did install, even though the installer did not indicate that the install was successful, and I had to force-restart the computer. Then I shrank the Linux partition and installed Windows. Windows of course took over booting, via the ESP I think. Finally I reinstalled Mageia, expecting that the bootloader would overwrite the Windows bootloader. Unfortunately that did not happen, so I tried going into the Live DVD to see if I could reinstall the rEFInd bootloader from there. Then I got the error below. I get the same error when I install Mageia to an empty hard drive, so the rEFInd bootloader never finishes installing.

The "drakboot" program has crashed with the following error:

  ESP device is unknown at /usr/lib/libDrakX/bootloader.pm line 2441.
  	...propagated at /usr/lib/libDrakX/any.pm line 269.
  	...propagated at /usr/libexec/drakboot line 49.
  Perl's trace:
  drakbug::bug_handler() called from /usr/libexec/drakboot:49

Used theme: Adwaita

This is a Lenovo Ideapad320IAP 

Installing Grub2 gave the exact same result. I am aware that there are a few Lenovo-specific bugs in GRUB. These bugs may be making it very difficult for Mageia bootloaders to install.
Comment 3 Nikato Muirhead 2019-06-06 06:30:21 CEST
Created attachment 11064
Contents of journal.txt after fresh install
Comment 4 Martin Whitaker 2019-06-06 20:41:31 CEST
Unfortunately, because you had to force a restart, the end of the journal is corrupt, and missing the information I need. Could I ask you to:

1. Boot to the Live desktop.
2. Open a Terminal window.

3. At the command line prompt type

     draklive-install |& tee install.log

4. If/when the installer hangs, use Ctrl-C in the terminal window to kill it.
5. As root, run

    journalctl -ab > journal.log

6. Attach both the install.log and journal.log files here (compress with xz if necessary)

CC: (none) => mageia

William Kenney 2019-06-06 21:20:35 CEST

CC: (none) => wilcal.int
Priority: Normal => release_blocker

Marcel Raad 2019-06-06 22:32:42 CEST

CC: (none) => marci_r

Comment 5 Nikato Muirhead 2019-06-07 04:27:42 CEST
Created attachment 11066
install.log - fresh install

it is attached
Comment 6 Nikato Muirhead 2019-06-07 04:28:55 CEST
Created attachment 11067
Journal.log fresh install

it is attached

CC: (none) => nycnikato

Nikato Muirhead 2019-06-07 04:41:54 CEST

Summary: Installer never confirms that (REFIND) bootloader install completed. => Installer never confirms that (REFIND) bootloader install completed and the installer hangs.

Comment 7 Nikato Muirhead 2019-06-07 05:21:03 CEST
Just to give a little background: Lenovo is infamous for not letting bootloaders write to UEFI. Just search for "lenovo" in the grub-installer package bugs on bugs.launchpad.net (link below). A fix for this would help thousands of frustrated people who own certain Lenovos and would like to be able to install on a UEFI system, and maybe use Secure Boot later. Lenovo won't do the work to fix it. Lenovo seems to care only about supporting Linux on their expensive business models.

If you all can figure out a workaround for Lenovos, that would be great. I think you have actually done it; it is just that the installer does not end gracefully and does not confirm that the write task is completed. Maybe the installer should just know that for certain motherboards that is the normal result. Maybe it should double-check after the write attempt to see whether the bootloader was actually written despite the I/O error. I will say that when it hangs, I let it hang for about 5 minutes to make sure the bootloader writes. When I restart too quickly after the hang, it doesn't write. The installer should have a progress bar during the bootloader install. There is too much tension at the moment of truth.

Whichever way Endless OS does its installs should be looked at. They have figured out an installation approach that works with UEFI and Secure Boot enabled on my crap Lenovo. I have never seen the Endless OS installer fail on even the quirkiest of hardware.

These are Lenovo-specific GRUB bugs, which can also be found on the GRUB project's Savannah server. Because these are Lenovo-specific bugs that ultimately have to do with certain Lenovo BIOSes, virtually all Linux installs are affected. Again, Endless OS has figured out what to do.


https://bugs.launchpad.net/ubuntu/+source/grub-installer?field.searchtext=lenovo&search=Search&field.status%3Alist=NEW&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.assignee=&field.bug_reporter=&field.omit_dupes=on&field.has_patch=&field.has_no_package=
Comment 8 Nikato Muirhead 2019-06-07 05:36:54 CEST
It occurred to me that maybe Lenovo expects a 5-minute writing time from Microsoft. Maybe their BIOS only fights the write for a couple of minutes, and for security reasons never confirms the write, or confirms it in a way that only Windows recognizes, knowing that only Microsoft would keep hammering for 5 minutes.
Comment 9 Nikato Muirhead 2019-06-07 05:52:09 CEST
Instead of letting it write indefinitely waiting for no error, let it loop for 5 minutes, end the loop, and then proceed on the assumption of successful write.
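
For illustration only, a minimal shell sketch of the bounded wait proposed in this comment. It assumes GNU coreutils `timeout` is available; `run_with_deadline` is a hypothetical helper name and is not part of the Mageia installer. Rather than assuming success when the deadline expires, it hands the exit status back to the caller, who can then check whether the write actually took effect.

```shell
# Hypothetical helper: run a command, but give up after a deadline
# instead of waiting forever.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    deadline=$1
    shift
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with status 124 when the deadline expires.
        echo "timed out after ${deadline}s: $*" >&2
    fi
    return "$status"
}

# Example (as root; the 300-second limit is only the figure from this comment):
#   run_with_deadline 300 efibootmgr -c
#   efibootmgr -v    # then inspect whether the entry was really written
```

Returning the status keeps the decision about how to treat a timeout with the caller, instead of silently treating it as a successful write.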
Comment 10 Maurice Batey 2019-06-07 13:21:10 CEST
Interesting to see problem with Lenovo UEFI, as I have a UEFI/GPT Lenovo Thinkpad 11e with trouble-free rEFInd installed via 64-bit Plasma Classic .iso.

CC: (none) => maurice

Comment 11 Nikato Muirhead 2019-06-07 18:08:20 CEST
I'm not surprised. The higher-end Lenovos are safe from these issues. Lenovo has made sure of that. It is the lower-end models that have been neglected.
Comment 12 Lewis Smith 2019-06-07 20:39:32 CEST
From comment 2:
> First I installed Mageia 7rc on empty hard drive The boot loader did
> install even though it did not indicate that install was successful.
> ...
> tried going into the live DVD to see if I could reinstall the REFIND
> bootloader from there.  Then I got this error. I get the same error when
> I install Mageia to an empty hard drive
This is puzzling: rEFInd was seemingly installed OK (if with no indication of same) on one virgin drive, but not on another.

> I had to force restart the computer
This implies that it did re-boot successfully. Did it?
It is known that the "click 'finish' to reboot" does not always work, but it is clear that the installation has gone to end. The root of this bug seems to be that you did not get that far.

> I installed Mageia 7rc
This rather implies the Classic ISO, but the information you attached for Martin implies that you used a Live ISO. I recently installed M7 Classic, choosing rEFInd bootloader (but not installing it, it was already there), and the installation went to end. I cannot remember for a Live ISO, but would have remarked a 'stuck' installation.

To see whether rEFInd has really been installed or not, the following commands can be used at any time from an installed Linux system:
 # fdisk -l /dev/sd...                   [to show partitioning, the ESP]
 # efibootmgr                            [to show EFI NVRAM]
 # ls -lR /boot/EFI/EFI                  [to show contents of ESP]
(cut any lengthy Microsoft & refind directory output).
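
As a rough sketch, the last two checks above could be scripted as follows. The helper names are hypothetical, and two assumptions are made that the thread does not confirm: that the rEFInd files sit in a directory whose name starts with "refind" on the ESP, and that the NVRAM boot entry text mentions "rEFInd".

```shell
# Hypothetical helper: succeeds if a directory whose name starts with
# "refind" (any case) exists within two levels of the given ESP mount point.
esp_has_refind() {
    [ -n "$(find "$1" -maxdepth 2 -type d -iname 'refind*' 2>/dev/null)" ]
}

# Hypothetical helper: reads efibootmgr output on stdin and succeeds if
# any Boot#### entry line mentions rEFInd (case-insensitive; an assumption
# about how the entry is labelled).
nvram_has_refind() {
    grep -E '^Boot[0-9A-Fa-f]{4}' | grep -qi 'refind'
}

# Example (as root, with the ESP mounted at /boot/EFI as in the commands above):
#   esp_has_refind /boot/EFI && echo "rEFInd files found on the ESP"
#   efibootmgr | nvram_has_refind && echo "rEFInd entry found in NVRAM"
```

Seeing the files on the ESP without a matching NVRAM entry would point towards the NVRAM write, rather than the disc writes, as the step that failed.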

@Martin: from the Live desktop after installation?

> I  Then I shrank linux partition and installed windows
Did you try booting the machine (presumably via rEFInd) after shrinking the Mageia partition, before installing Windows?

Did you try installing Windows first (ideally leaving enough space for Mageia subsequently; or using Windows Disk Manager itself to free up enough space for Mageia - reboot Windows afterwards), then install Mageia in the free space?

After installing Windows after Mageia, did the machine boot directly to Windows? 

I do not see why any hardware peculiarity would cause a disc-related problem. Installing rEFInd involves a few normal disc writes to the EFI System Partition, no reason for a block here. I do not understand your mentions of a 5-minute wait; a progress bar would be irrelevant.
OTOH Installing to a virgin unpartitioned disc would entail creating the ESP along with the OS partitions. If any work, they all would.

> Installing Grub2 gave the exact same result.
More likely, the EFI firmware might be difficult about writing to the NVRAM,  essential for rEFInd and Grub2 as well as Windows.
You did not say whether you had Secure Boot, which should of course be DISabled for Mageia. If you really wanted to keep it with Windows, I imagine you would have to boot rEFInd via Windows.
--------------------------------
This might be useful for you to investigate what is happening:
 https://wiki.mageia.org/en/About_EFI_UEFI

CC: (none) => lewyssmith

Comment 13 Martin Whitaker 2019-06-07 21:32:28 CEST
Some tests in your installed system, all done as root:

1. Run

  efibootmgr -v

What output do you get?

2. Run

  efibootmgr -c

Does that hang? If so, kill it.

3. Again run

  efibootmgr -v

Is there now a new entry similar to this:

  Boot0004* Linux

(the four digit number is likely to be different).

4. If yes, run

  efibootmgr -b 0004 -B

replacing 0004 with the four digit number you see in step 3.
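
To avoid reading the four-digit number off by eye, steps 1 to 4 could be sketched roughly as below. The helper names are hypothetical, and the sketch assumes that entry lines in the `efibootmgr -v` output start with `Boot` followed by four hex digits and then `*` or a space, as in the example in step 3.

```shell
# Hypothetical helper: print the four-hex-digit IDs of the Boot#### entries
# found in efibootmgr output read from stdin. BootCurrent/BootOrder lines
# do not match because they lack four hex digits after "Boot".
boot_ids() {
    sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\)[* ].*/\1/p'
}

# Hypothetical helper: print the IDs present in the second listing ($2)
# but not in the first ($1).
new_boot_ids() {
    before_ids=$(printf '%s\n' "$1" | boot_ids)
    printf '%s\n' "$2" | boot_ids | while read -r id; do
        case " $(echo $before_ids) " in
            *" $id "*) ;;          # entry already existed before the test
            *) echo "$id" ;;       # entry appeared after the test
        esac
    done
}

# Example (as root):
#   before=$(efibootmgr -v)                  # step 1
#   efibootmgr -c                            # step 2 (kill it if it hangs)
#   after=$(efibootmgr -v)                   # step 3
#   for id in $(new_boot_ids "$before" "$after"); do
#       efibootmgr -b "$id" -B               # step 4: delete the test entry
#   done
```

Comparing the listings before and after means only entries created by the test are deleted, which leaves existing boot entries untouched.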
Comment 14 Nikato Muirhead 2019-06-07 22:15:47 CEST
If the root of the bug is that the click to reboot button never shows up then just work on that then. I can tell you for sure that there are bugs that prevent the bootloader from writing. I suppose those may never be addressed.
Comment 15 Martin Whitaker 2019-06-07 23:06:33 CEST
No, the root of the bug is whatever is causing the hang. "bugs that prevent the bootloader from writing" is not very precise. I'm trying to establish exactly where the problem lies, so we can decide if there's a sensible way to work around it. I don't think changing the installer to just ignore errors is an option.
Martin Whitaker 2019-06-08 12:39:35 CEST

Summary: Installer never confirms that (REFIND) bootloader install completed and the installer hangs. => Installer hangs at end of bootloader install step on some Lenovo machines (BIOS bug when writing to EFI NVRAM?)

Comment 16 Martin Whitaker 2019-06-08 12:47:58 CEST
Created attachment 11074
Script to test writing the rEFInd PreviousBoot NVRAM variable

Please also test the attached script by running as root:

  perl set-last-boot.pl

and copy/pasting the output here. Does this also hang?
