Bug 22552 - UEFI install/dracut acer Aspire7 fail; -> GRUB rescue mode - if security is enabled
Status: RESOLVED OLD
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal
Severity: critical
Target Milestone: ---
Assignee: Mageia tools maintainers
QA Contact:
URL:
Whiteboard: MGA6TOO
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-10 14:20 CET by Morgan Leijström
Modified: 2020-08-16 23:37 CEST
CC List: 3 users

See Also:
Source RPM: dracut
CVE:
Status comment:


Attachments
report.bug grabbed before reboot, compressed (362.32 KB, application/x-xz)
2018-02-10 14:21 CET, Morgan Leijström
install log from cauldron netinstall before reboot, compressed (238.62 KB, application/x-xz)
2018-02-11 18:42 CET, Morgan Leijström
install log, mga7 with UEFI and /boot on a USB stick (236.79 KB, application/x-xz)
2018-02-11 23:48 CET, Morgan Leijström
Install log of mga7 putting all on SATA SSD (236.77 KB, application/x-xz)
2018-02-12 00:48 CET, Morgan Leijström
Log from reinstalling, see #c22 (159.02 KB, application/x-xz)
2018-03-16 09:10 CET, Morgan Leijström

Description Morgan Leijström 2018-02-10 14:20:37 CET
Description of problem:
Boot after install ends directly with message:

  Entering Rescue mode...
  grub rescue> _


__Version-Release number of selected component (if applicable):
Mageia 6 64 bit DVD install iso on USB stick

__How reproducible:
UEFI mode install.
( Install of Mageia 6 test OK using Legacy BIOS mode for install and use )

___Steps to Reproduce:
New laptop acer Aspire7 A717-71G-5269 (tested OK with preinstalled MSwin10)
 
Installing using UEFI mode fails;
First in "BIOS" select the installer on USB stick as legitimate, reboot
I boot it and Mageia installer correctly shows the selections for UEFI installs
Plain default installation (except i choose a light DE, irrelevant)
  chose to use whole disk, all default, automatic partitioning
  (before that i tried one round with LUKS + LVM, same grub message in the end)
  Adding and using online media during install 
  All seemed to go OK
  Put in another USB stick, Alt-F2 and issued "bug", but it failed.
    Mounted stick manually and grabbed log, attached
Reboot into "BIOS", told it installed grub is legitimate
Reboot -> grub error message, see above
Comment 1 Morgan Leijström 2018-02-10 14:21:41 CET
Created attachment 9974
report.bug grabbed before reboot, compressed
Comment 2 Morgan Leijström 2018-02-10 16:17:21 CET
I forgot to mention the drive is a NVMe PCIe SSD

I have never fiddled with grub2 before. After a quick web search I tried a few commands; below is what I typed and what the screen showed:

grub rescue> set
fw_path=(hd0,gpt1)//EFI/mageia
prefix=(hd0,gpt2)/boot/grub2
root=hd0,gpt2
grub rescue> ls (hd0,gpt1)//EFI/mageia
error: unknown filesystem
grub rescue> ls (hd0,gpt1)
(hd0,gpt1): Filesystem is unknown        <<<<<---- is this the problem?
grub rescue> ls (hd0,gpt2)/boot
./ ../ EFI/ dracut/ grub2/ vmlinuz initrd.img System.map-4.14.16-desktop-1.mga6 config-4.14.16-desktop-1.mga6 symvers-4.14.16-desktop-1.mga6.xz vmlinuz-4.14.16-desktop-1.mga6 initrd-4.14.16-desktop-1.mga6.img
grub rescue> ls
(hd0) (hd0,gpt1) (hd0,gpt2) (hd0,gpt3) (hd0,gpt4)
grub rescue> ls (hd0,gpt3)
(hd0,gpt3): Filesystem is unknown
grub rescue> ls (hd0,gpt4)
(hd0,gpt4): Filesystem is ext2
grub rescue> ls (hd0,gpt4)/
./ ../ lost+found/ ettan/          (this is apparently /home with user "ettan")

I can not find a partition containing the root file system!   <<<<<<<<<<-----!
I guess it should be the unreadable (hd0,gpt1)
And the other unreadable is probably swap.
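
For readers following the `set` output above, here is a minimal Python sketch (not part of GRUB) that splits values such as `prefix=(hd0,gpt2)/boot/grub2` into disk, partition and path. The helper names `parse_set_output` and `parse_grub_location` are hypothetical. As far as I understand it, `prefix` names the location GRUB loads its modules from, while `fw_path` is the directory the EFI image was started from, so the two can legitimately point at different partitions.

```python
import re

# Matches GRUB-style locations such as "(hd0,gpt2)/boot/grub2".
_LOCATION = re.compile(
    r"^\((?P<disk>[a-z]+\d+)(?:,(?P<part>[a-z]*\d+))?\)(?P<path>.*)$"
)

def parse_grub_location(value):
    """Split a GRUB location into (disk, partition, path); hypothetical helper."""
    match = _LOCATION.match(value.strip())
    if match is None:
        raise ValueError(f"not a GRUB device location: {value!r}")
    return match.group("disk"), match.group("part"), match.group("path")

def parse_set_output(text):
    """Turn `name=value` lines copied from the rescue prompt into a dict."""
    variables = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            variables[name.strip()] = value.strip()
    return variables

# Two of the values reported in this comment.
rescue_output = """fw_path=(hd0,gpt1)//EFI/mageia
prefix=(hd0,gpt2)/boot/grub2
"""
variables = parse_set_output(rescue_output)
print(parse_grub_location(variables["prefix"]))
```

Here the partition GRUB needs for its modules is gpt2, which is the one that later turns out to be readable.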

Summary: Mga6 UEFI install on Aspire7 fail; -> GRUB rescue mode => Mga6 UEFI install on Aspire7 fail; -> root fs unreadable by GRUB?!

Comment 3 Barry Jackson 2018-02-10 18:59:50 CET
Maybe NVMe related - there have been issues in the past with these drives.
e.g. Bug #17743

What does lsmod at the grub2 prompt show?
Does it show modules loaded for the file systems that are showing as unreadable above?

CC: (none) => zen25000

Comment 4 Morgan Leijström 2018-02-10 20:16:28 CET
Thank you for the quick reply

lsmod is unknown command - because we are in grub2 rescue mode...?

Would it be a good test to try using cauldron network install?

Another thing I could try is to plug in a spare standard SATA (not NVMe) SSD, since the laptop also has a SATA slot, and install to that to see whether it works then?

Summary: Mga6 UEFI install on Aspire7 fail; -> root fs unreadable by GRUB?! => Mga6 UEFI install on NVMe (Aspire7) fail; -> root fs unreadable by GRUB?!

Comment 5 Morgan Leijström 2018-02-10 20:27:45 CET
Ah, now i see (I think...)
(hd0,gpt1) is the UEFI boot partition i presume?

/root is part of / as i made a default install... and that is (hd0,gpt2)/

grub rescue> ls (hd0,gpt2)/
./ ../ lost+found/ boot/ home/ dev/ etc/ mnt/ run/ var/ root/ proc/ sys/ usr/ lib lib64 sbin bin initrd/ media/ opt/ srv/ .dbus/ .cache/

Strange that "lib lib64 sbin bin" do not have trailing "/" !
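
A likely explanation, stated as an assumption rather than something confirmed in this report: on a merged-/usr layout `bin`, `sbin`, `lib` and `lib64` are symbolic links into `/usr`, and GRUB's `ls` appends `/` only to real directories. The sketch below shows the same distinction from Python using `os.lstat`, which does not follow links. The helper name `classify_entries` is hypothetical and the temporary directory is only a stand-in for a root filesystem.

```python
import os
import stat
import tempfile

def classify_entries(directory):
    """Return {name: 'dir' | 'symlink' | 'other'} without following symlinks."""
    kinds = {}
    for name in sorted(os.listdir(directory)):
        mode = os.lstat(os.path.join(directory, name)).st_mode
        if stat.S_ISLNK(mode):
            kinds[name] = "symlink"
        elif stat.S_ISDIR(mode):
            kinds[name] = "dir"
        else:
            kinds[name] = "other"
    return kinds

# Tiny stand-in for a root filesystem: usr/bin is real, bin is a link to it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "usr", "bin"))
    os.symlink("usr/bin", os.path.join(root, "bin"))
    demo = classify_entries(root)
    print(demo)
```

Running `classify_entries("/")` on the installed system would show whether those four names really are links there.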
Comment 6 Morgan Leijström 2018-02-11 15:54:17 CET
I tried a few set commands and then tried to launch normal mode, but it fails:

grub rescue> insmod normal
grub rescue> normal
Unknown command `normal'

Before that I also set root=(hd0,gpt2), as I noted the parentheses were originally missing, but it made no difference.

I am out in the dark here.

Summary: Mga6 UEFI install on NVMe (Aspire7) fail; -> root fs unreadable by GRUB?! => Mga6 UEFI install on NVMe (Aspire7) fail; -> GRUB rescue mode

Comment 7 Barry Jackson 2018-02-11 17:37:17 CET
Grub2 is not managing to read the / partition fs at all so cannot load modules.
lsmod is also a grub2 command which should list all the grub2 modules loaded.
I think you will find that you will be able to install to another drive without issues.
Looks like Bug #17743 to me. cc-ing Thomas

CC: (none) => tmb

Marja Van Waes 2018-02-11 17:59:33 CET

Assignee: bugsquad => mageiatools
CC: (none) => marja11

Comment 8 Morgan Leijström 2018-02-11 18:18:17 CET
OK, thanks.

Isn't it strange that I got no error at the command "insmod normal"?

Currently running a Cauldron netinstall to see if it works better in this regard.

Next try is plugging in a normal SATA drive.

We need this laptop in production, so unless something else comes up within a few days that you want me to try, I will install it in Legacy mode instead (not UEFI), as that seemed to work OK.
Comment 9 Morgan Leijström 2018-02-11 18:40:33 CET
Same result on current cauldron netinstall 64 bit nonfree:

Reboot (after telling BIOS that EFI/mageia/grub64 is safe) ends up in grub rescue mode, and the command "set" returns exactly the same; I did not try more.

Whiteboard: (none) => MGA6TOO
Severity: normal => major
Version: 6 => Cauldron

Comment 10 Morgan Leijström 2018-02-11 18:42:22 CET
Created attachment 9983
install log from cauldron netinstall before reboot, compressed
Comment 11 Thomas Backlund 2018-02-11 18:57:01 CET
If grub2 can't read the files it needs, it's broken...

kernel/initrd is not involved at all at this point yet, so it can't do anything...

and according to the installer logs, kernel and installer have properly detected the nvme and its partitions...
Comment 12 Morgan Leijström 2018-02-11 19:22:15 CET
OK then... while i am at it:
 trying USB boot, then SATA boot, then i go back to Legacy...
This acer UEFI security handling is beta state, giving me the creeps already anyway

Now i put EFI boot partition on a mini USB memory plug -> same kind of fail
(Idea: possible usable workaround to have that plugged in always)
Result:
only difference is one set value (USB): fw_path=(hd0,gpt1)//EFI/mageia
Comment 13 Morgan Leijström 2018-02-11 23:48:55 CET
Created attachment 9984
install log, mga7 with UEFI and /boot on a USB stick

Another try: I thought it could boot from USB, because the installer does...
Configured two partitions on a USB stick: UEFI boot, and /boot:

/boot/EFI on USB sda1 (vfat)
/boot on sda2 (ext4)           <<<<---- to boot installed system using USB /boot
/, swap, /home on the NVMe

  Result:

grub rescue> set
fw_path=(hd0,msdos1)//EFI/mageia
prefix=(hd0,msdos5)/grub2
root=hd0,msdos5
grub rescue> ls (hd0,msdos5)/
./ ../ EFI/ dracut/ grub2/ vmlinuz initrd.img System.map-4.14.18-desktop-2.mga7 config-4.14.18-desktop-2.mga7 symvers-4.14.18-desktop-2.mga7.xz vmlinuz-4.14.18-desktop-2.mga7 initrd-4.14.16-desktop-2.mga7.img
Comment 14 Morgan Leijström 2018-02-12 00:48:34 CET
Created attachment 9985
Install log of mga7 putting all on SATA SSD

Exactly the same result using SATA SSD instead of USB stick for /boot

AND also when i use the NVMe for nothing, only a SATA SSD for everything, using the installer defaults, msdos partition table. (The NVMe still connected, no partitions)

Now an install is progressing on an old 80GB mechanical SATA drive; I will report tomorrow.
Comment 15 Barry Jackson 2018-02-12 00:54:36 CET
There have been problems with some buggy UEFI firmware that, even with secure boot turned OFF, does not allow access to system files until they have been marked as 'trusted' with secure boot ON. After that, secure boot may be turned OFF and the access continues to be allowed.
I added a small section to the Wiki, after I had a real struggle to install Mageia to a laptop belonging to a friend of mine.
https://wiki.mageia.org/en/EFI:_can_no_longer_boot_into_Mageia#Boot_displays_.22No_Bootable_Device.22_when_rebooting_after_a_fresh_install_of_Mageia
If you follow the link in that Wiki which goes to an Ubuntu site then you will see that someone has had success with another Acer laptop.
No idea if it's related to your problem.
Comment 16 Morgan Leijström 2018-02-12 02:12:18 CET
Thanks Barry, it could very well be something fishy with the otherwise unpolished laptop firmware: dialogs are not proofread, the cursor blinks frantically, editing methods are irregular, explanations are lacking, deleted boot alternatives are not really deleted until reboot, etc., and now this.  It does have an option to skip the security check, but I have always left the check on, and in my tests I instead always marked the new bootloaders as safe, rebooted, and they passed the check (otherwise I get a warning sign).


Sigh. Did two more tests:
Same with installing to mechanical SATA drive.

So somehow the installed grub2 does not work from any of them: NVMe with gpt table, SATA SSD with msdos table, SATA hard disk, or USB stick.

But grub on the install media (USB sticks) does work.

Reverted to install using Legacy boot mode, for further test of Cauldron initially.  Legacy boot mode test OK so far.

Hmmm... as a test - what if I just dd the install stick to the NVMe, just to see if the machine accepts booting from it there?  Another day.  I should maybe also test disabling security.   And do more web searching.  If I have time.  Else I just let it boot Legacy style.

Summary: Mga6 UEFI install on NVMe (Aspire7) fail; -> GRUB rescue mode => UEFI install acer Aspire7 fail regardless of disk type; -> GRUB rescue mode

Comment 17 Morgan Leijström 2018-02-13 15:05:40 CET
How come the mga6 live iso boots on NVMe in this laptop, but not a normal install?

1) I copied the mga6 64 bit live xfce to the NVMe
  ( booted on system-rescue-cd.org in BIOS Legacy mode and used dd )

2) Reboot into "BIOS" settings, i set UEFI mode, reboot, back in i select the hd0 > EFI > BOOT > bootx64.efi as safe, reboot:

Result: Boots perfectly OK in UEFI mode.

In the live mga6 system booted from the NVMe, gparted displays "/dev/nvme0n1" (which is a "drive", not a partition, IIRC) as iso9660 238 GiB (as if using all of the NVMe) - apparently gparted is wrong...

Diskdrake complains of broken table but correctly shows there is partition /dev/nvme0n1p1 1,8GB iso9660, and then /dev/nvme0n1p2 4MB EFI system partition.

So the laptop apparently boots using this EFI system partition, and grub then launches the system successfully from the iso9660 partition.  Interesting.



(Sidenote: I see the known acer bug: if i just disable security it stops with security alert anyway; i need to mark the .efi file safe, and for that i need to enable security and for that i need to define supervisor password.)
Comment 18 Morgan Leijström 2018-02-13 17:57:04 CET
Successfully installed Ubuntu 17.10 in UEFI mode
Booting OK in UEFI Secure boot mode - even though I did not need to mark that .efi as safe in BIOS... did that installer do that - and maybe more - by itself?

Scary.
Comment 19 Morgan Leijström 2018-02-13 22:13:33 CET
Another test: I shrunk that Ubuntu / partition and installed Cauldron (using the same EFI partition as Ubuntu, a new swap, and a new ext4 for everything else). During boot I can press F12 and select Mageia, but the screen just flickers for a fraction of a second, then the Ubuntu boot menu is shown.

Wiped it and also tested Fedora 27 Workstation: installs OK.

So the Fedora and Ubuntu installers do something which is needed on this machine, and our installer does not.
Comment 20 Morgan Leijström 2018-02-14 14:41:41 CET
** I disabled UEFI security, and now the Mageia installed grub2 boots Mageia **

(Still UEFI enabled, used for both installer and installed system.)

  Summing it up:

§  Installed Fedora, Ubuntu, and also Mageia 6 xfce live iso dumped to the NVMe works with security enabled, but not the installed Mageia.

§  If i then enable security i get a warning sign from "BIOS" when i try to boot it. I can then go into BIOS and select the mageia .efi file as secure so it tries to boot it, but then it ends up like in comment #0.

Summary: UEFI install acer Aspire7 fail regardless of disk type; -> GRUB rescue mode => UEFI install acer Aspire7 fail; -> GRUB rescue mode - if security is enabled

Comment 21 Morgan Leijström 2018-03-15 23:44:06 CET
Oh, hell...!
I updated to kernel 4.14.25-desktop-1.mga6, and rebooted
-> black screen with grub prompt again.

( and this is my wife's production machine. I did snapshot /, but forgot the EFI and /boot partitions... learned now that I should have... ;) )

Checking in BIOS settings i see EFI Security is ON !
- What?! i thought I left it OFF. not 100% sure, could be a test i forgot...
Maybe because my dual-booted W10 updated itself to the next major version, 1709, and decided it knew better than me about security versus reliability?
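
The firmware setting can also be checked from a running Linux system booted in UEFI mode, which avoids guessing what the setup screen was left at. This is a minimal sketch, assuming the kernel exposes the standard `SecureBoot` variable through efivarfs at the path below; the helper names are hypothetical. The file holds a 4-byte attribute header followed by one data byte, where 1 means enabled.

```python
from pathlib import Path

# Standard EFI global-variable GUID; the path assumes efivarfs is mounted.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)

def secure_boot_enabled(raw):
    """Decode efivarfs content: 4 attribute bytes, then the 1-byte value."""
    if len(raw) < 5:
        raise ValueError("SecureBoot variable is shorter than expected")
    return raw[4] == 1

def read_secure_boot_state(path=SECURE_BOOT_VAR):
    """Return True/False, or None when the variable file does not exist."""
    try:
        return secure_boot_enabled(path.read_bytes())
    except FileNotFoundError:
        return None

if __name__ == "__main__":
    print(read_secure_boot_state())
```

A result of `None` usually means the system was booted in Legacy mode, so the variable is not available at all.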

However, setting EFI security OFF now does not help.
Neither does setting it ON, marking the file as safe, and setting it OFF again.

-> raising to critical

Is this a dracut problem?

Tomorrow I will try to repair it by running the installer with EFI security OFF, keeping / et al. and just letting it write grub.

Severity: major => critical
Source RPM: (none) => dracut
Summary: UEFI install acer Aspire7 fail; -> GRUB rescue mode - if security is enabled => UEFI install/dracut acer Aspire7 fail; -> GRUB rescue mode - if security is enabled
Component: Installer => RPM Packages

Comment 22 Morgan Leijström 2018-03-16 05:35:27 CET
___Trying to track it down:

I found that the EFI partition mageia/grubx64.efi was still the old version from last month.  In my /boot partition I found the new version that seems to have been generated when installing the new kernel.
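
One way to confirm that two copies of `grubx64.efi` really differ is to compare checksums rather than dates. A minimal sketch follows; the helper names are hypothetical and the paths in the note after the code are assumptions about where the partitions are mounted, not taken from this report.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True when both files have identical SHA-256 digests."""
    return sha256_of(path_a) == sha256_of(path_b)
```

From a rescue system with both partitions mounted, a call such as `same_content("/mnt/esp/EFI/mageia/grubx64.efi", "/mnt/boot/EFI/mageia/grubx64.efi")` would tell whether the stale and the new file are byte-identical (both mount points are illustrative).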

One odd thing: when booting to the grub prompt, the first line says "unrecognised filesystem type" or similar.  (sorry, I am a bit stressed, forgot to take a photo...)
So we may have two bugs here...


Bug fixers can stop reading this post now;
I am noting my repair work below for my own documentation, and it may help others.
-------------------------------------------------------------------------------
First, I verified that BIOS is set to UEFI but with security off. Then:


___Repair attempt 1:

My first attempt at it was to copy that new grubx64.efi from /boot partition to the EFI partition.  But that did not change anything.


___Repair attempt 2:

Boot installer to repair boot
-> it can not handle LVM on LUKS and i have no idea on how to help it


___Repair attempt 3:

Boot installer to perform "installation" reusing all partitions without formatting anything, not installing, just let it set up grub2 in the end.
-> Bug 22782 - Installer/urpmi get stuck in loop evaluating already installed packages


___Repair attempt 4:

Fresh install reusing partitions !!careful about /home data!!
Momentarily mounted /boot/EFI and removed the EFI/mageia folder just in case.
(mounted using diskdrake, Ctrl-Alt-F2 to terminal, back and unmounted)
DO NOT format /home nor /boot/EFI nor Windows
Format / and /boot and swap
This time did not use any online media at beginning of install, but let it update in the end.
-> Works. Phew.  Just a bit configuration to do.
Comment 23 Morgan Leijström 2018-03-16 09:10:47 CET
Created attachment 10049
Log from reinstalling, see #c22

Here is the log from reinstalling, successful attempt in comment #c22

I searched it through trying to find problems around grub2:


§ At the beginning it tries to run grub2-editenv but cannot find it
 - is this missing from the installer?  Not needed?
 It is only tried once during install - not in the end when kernel is updated
 I see it is found when updating kernel later on running system.


§ update-grub2 logs: mesg: ttyname failed: Inappropriate ioctl for device
 I see same message in journal when i later update the kernel on running system.


§ sda1 is the installer USB stick, so i guess this message is OK:
 grub2-probe: error: cannot find a GRUB drive for /dev/sda1.  Check your device.map.


Excerpt:
---8<-----

* starting step `installPackages'
* running: grub2-editenv list with root /mnt
* program not found: grub2-editenv
* writing /mnt/etc/fstab
...
...
* running: update-grub2  with root /mnt
* update-grub2 logs: mesg: ttyname failed: Inappropriate ioctl for device
Generating grub configuration file ...
File descriptor 13 (/root/drakx/install.log) leaked on vgs invocation. Parent PID 25716: /usr/sbin/grub2-probe
...
File descriptor 52 (/root/drakx/install.log) leaked on vgs invocation. Parent PID 25844: /usr/sbin/grub2-probe
Found theme: /boot/grub2/themes/maggy/theme.txt
File descriptor 13 (/root/drakx/install.log) leaked on vgs invocation. Parent PID 26094: /usr/sbin/grub2-probe
...
File descriptor 52 (/root/drakx/install.log) leaked on vgs invocation. Parent PID 26094: /usr/sbin/grub2-probe
Found linux image: /boot/vmlinuz-4.9.35-desktop-1.mga6
Found initrd image: /boot/initrd-4.9.35-desktop-1.mga6.img
File descriptor 13 (/root/drakx/install.log) leaked on vgs invocation. Parent PID 26373: /usr/sbin/grub2-probe
...
File descriptor 52 (/root/drakx/install.log) leaked on vgs invocation. Parent PID 26373: /usr/sbin/grub2-probe
grub2-probe: error: cannot find a GRUB drive for /dev/sda1.  Check your device.map.
Found Windows Boot Manager on /dev/nvme0n1p1@/EFI/Microsoft/Boot/bootmgfw.efi
done

* running: grub2-set-default linux with root /mnt
* running: sh /boot/grub2/install.sh with root /mnt
* step "summary" took: 0:02:51
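
The manual search described in this comment can be repeated with a small filter. This is a sketch only: the two line prefixes it looks for are taken from the excerpt above, and `grub_related_issues` is a hypothetical helper, not part of the installer.

```python
def grub_related_issues(lines):
    """Yield (line_number, text) for log lines that hint at GRUB trouble."""
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        missing_tool = text.startswith("* program not found:") and "grub" in text
        probe_message = text.startswith("grub2-probe:")
        if missing_tool or probe_message:
            yield number, text

# Three lines from the excerpt, used as a small self-check.
sample = [
    "* running: grub2-editenv list with root /mnt",
    "* program not found: grub2-editenv",
    "* writing /mnt/etc/fstab",
]
print(list(grub_related_issues(sample)))
```

For the compressed attachment, the lines could be read with `lzma.open(path, "rt", errors="replace")` from the standard library and passed straight to the function.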
Comment 24 Morgan Leijström 2018-03-16 10:12:38 CET
(In reply to Morgan Leijström from comment #22)

> I found that the EFI partition mageia/grubx64.efi was still the old version
> from last month.  In my /boot partition i found the new version that seem to
> have been generated when installing the new kernel.

That was as seen when booted on another linux system on USB stick.

The EFI and /boot are two separate partitions.

When I mounted the partition that in Mageia is mounted as /boot, I saw it contained /EFI and in there the mageia/grubx64.efi

That alone shows something is wrong: the EFI directory on the /boot partition should be empty, because the EFI partition should be mounted on /boot/EFI

So it seems the reason the new grubx64.efi never got written to EFI partition was that it was not mounted when i updated kernel in comment #21.  Why was it not mounted??
(sorry i can not retrieve logs from that session as i ended up reinstalling over system / due to the urpmi bug and was in a hurry)
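
A guard like the following could be run before a kernel or grub2 update to confirm that the EFI system partition really is mounted. It is a minimal sketch that parses text in the `/proc/mounts` format; `/boot/EFI` is the mount point named in this report, the function names are hypothetical, and the device names in the sample are illustrative only.

```python
def mounted_fs_type(mounts_text, mount_point):
    """Return the filesystem type mounted at mount_point, or None."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # keep the last match: later mounts shadow earlier ones
    return fs_type

def esp_is_mounted(mounts_path="/proc/mounts", mount_point="/boot/EFI"):
    """True when something is mounted at the expected ESP mount point."""
    with open(mounts_path, encoding="utf-8") as handle:
        return mounted_fs_type(handle.read(), mount_point) is not None

# Illustrative sample in /proc/mounts format.
sample = """/dev/mapper/vg-root / ext4 rw,relatime 0 0
/dev/nvme0n1p4 /boot ext4 rw,relatime 0 0
/dev/nvme0n1p1 /boot/EFI vfat rw,relatime 0 0
"""
print(mounted_fs_type(sample, "/boot/EFI"))
```

If the check returns `None`, anything written to `/boot/EFI` lands on the `/boot` partition instead, which matches the stray `EFI/mageia/grubx64.efi` found there.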

---

But besides that i cannot see why it did not work to copy the new grubx64.efi to the correct partition.  And why the grub message about wrong filesystem type...?

---

Now i remember why i booted Windows a couple weeks ago:
It was to update BIOS firmware;
Acer only deliver it as a program for windows 10 :(  #%#@€¥{€$!!
That is the reason i dual boot the machine.
Windows updated itself, i let it reboot, and i had it update BIOS firmware.
Probably *that* got security re-enabled without me knowing.

I wonder if something of all that made Mageia unable to mount EFI partition.

But I guess Mageia should stop booting if it cannot mount EFI, right?

Maybe MS/acer changed filesystem type definition slightly?
I am wondering whether there exist similar, compatible but slightly different definitions, vfat / FAT32 ...?  I see the type stated in fstab now is vfat.
But i thought i created it as FAT32 once upon a time... or did i not?
(i installed W10 after my previous mageia install)
FAT32 is a kind of vfat...?  Both names can be used?
different codes?  Not my cup of tea.
Such could be the reason for the grub filesystem type message.
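
On the vfat / FAT32 question: `vfat` is the name of the Linux filesystem driver, which handles FAT12, FAT16 and FAT32 with long file names, while FAT32 is one of the on-disk formats. An fstab type of `vfat` for a FAT32 partition is therefore expected and does not by itself explain the grub message. The sketch below reads the informational type label from a boot sector. The offsets are those of the standard FAT boot-sector layout, the helper name is hypothetical, and the label is only a hint, not an authoritative format check.

```python
def fat_type_label(boot_sector):
    """Return the type label stored in a FAT boot sector, or None."""
    if len(boot_sector) < 512 or boot_sector[510:512] != b"\x55\xaa":
        return None  # no boot-sector signature
    fat32_label = boot_sector[82:90]   # FAT32 layout: label at offset 82
    fat16_label = boot_sector[54:62]   # FAT12/FAT16 layout: label at offset 54
    for label in (fat32_label, fat16_label):
        if label.startswith(b"FAT"):
            return label.decode("ascii", errors="replace").strip()
    return None

# Synthetic sector for demonstration; a real one would be read from the partition.
sector = bytearray(512)
sector[82:90] = b"FAT32   "
sector[510:512] = b"\x55\xaa"
print(fat_type_label(bytes(sector)))
```

Reading the first 512 bytes of the real EFI partition needs root access to the device node.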

---

I have now updated to kernel 4.14.25-desktop-1.mga6 from testing,
and rebooted, all OK.

I see EFI partition is mounted.

BTW, drive is a NVMe PCIe SSD, partitioned:
§ EFI (/boot/EFI)
§ Windows10
§ encrypted pv for LVM: /, /home, swap, snapshots
§ /boot
With lots of free space on all partitions.


Now on to Plasma5.12, applications, configurations,
and the machine have to enter production!
Comment 25 Morgan Leijström 2020-08-16 23:37:14 CEST
I forgot to update this bug.
The system has been running mga7 since mga7 was released.
I think I made a fresh install.
I knew it was a minefield, so I concentrated only on getting it running, reusing partitions.
That is my wife's production machine.

Closing as old for now.

Status: NEW => RESOLVED
Resolution: (none) => OLD

