Bug 5857

Summary: drakboot unable to attribute correct root entry in grub (menu.lst) when using LVM
Product: Mageia Reporter: Simple <simplew8>
Component: RPM Packages Assignee: Mageia Bug Squad <bugsquad>
Status: RESOLVED INVALID QA Contact:
Severity: critical    
Priority: Normal CC: alien, mageia, pterjan, thierry.vignaud
Version: Cauldron   
Target Milestone: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Source RPM: drakxtools-curses-14.20-1.mga2 CVE:
Status comment:

Description Simple 2012-05-11 18:45:03 CEST
Theme name: Adwaita
Kernel version = 3.3.5-desktop-1.mga2
Distribution=Mageia release 2 (Cauldron) for x86_64
CPU=Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz

drakboot fails to assign the correct root partition in the grub configuration.
I lost the grub configuration, so I ran "drakboot --boot" to recreate it, and the entries were added.
Then I rebooted and the boot got stuck at a dracut prompt. I rebooted again and this time looked at the grub entry, where I saw root=/dev where it should be root=/dev/mga/root (that's the LVM root volume).

I'm pasting here what drakboot first created in /etc/grub/menu.lst:


timeout 10
color black/cyan yellow/cyan
gfxmenu (hd0,4)/gfxmenu
default 0

title linux
kernel (hd0,4)/vmlinuz BOOT_IMAGE=linux root=/dev nokmsboot resume=/dev/mga/swap
initrd (hd0,4)/initrd.img

title desktop 3.3.4-1.mga2
kernel (hd0,4)/vmlinuz-3.3.4-desktop-1.mga2 BOOT_IMAGE=desktop_3.3.4-1.mga2 root=/dev nokmsboot resume=/dev/mga/swap
initrd (hd0,4)/initrd-3.3.4-desktop-1.mga2.img

title failsafe
kernel (hd0,4)/vmlinuz BOOT_IMAGE=failsafe root=/dev nokmsboot failsafe
initrd (hd0,4)/initrd.img

title windows
root (hd0,0)
makeactive
chainloader +1
Simple 2012-05-11 18:45:59 CEST

CC: (none) => thierry.vignaud

Comment 1 Simple 2012-05-11 19:35:31 CEST
In fact, the correct entry should be something like "root=/dev/mapper/mga-root" instead of "root=/dev/mga/root" as suggested in the previous comment.
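
For reference, both spellings name the same logical volume: device-mapper joins the volume group and logical volume names with a hyphen, and doubles any hyphen that is part of either name. A minimal Python sketch of that mapping follows; the function name `mapper_path` and the hyphenated example name are made up for illustration and are not part of drakxtools.

```python
def mapper_path(vg: str, lv: str) -> str:
    """Return the /dev/mapper path for a volume group / logical volume pair.

    device-mapper escapes a literal hyphen in either name by doubling it,
    then joins the two names with a single hyphen.
    """
    def escape(name: str) -> str:
        return name.replace("-", "--")

    return "/dev/mapper/" + escape(vg) + "-" + escape(lv)


print(mapper_path("mga", "root"))      # /dev/mapper/mga-root
print(mapper_path("my-vg", "root"))    # /dev/mapper/my--vg-root (hypothetical VG name)
```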
Manuel Hiebel 2012-05-14 16:21:34 CEST

CC: (none) => pterjan

Comment 2 Simple 2012-05-14 21:57:48 CEST
This is a critical problem, since installing a new kernel does not create a correct root entry.

Because the Mageia 2 installer does not allow using existing encrypted LVM partitions (a different problem that needs to be reported separately), I had to do a new install, and now installing kernel 3.3.5 created wrong entries in grub's menu.lst.

I'm pasting the menu.lst contents here so that what happened can be seen:


timeout 10
color black/cyan yellow/cyan
gfxmenu (hd0,4)/gfxmenu
default 0

title linux
kernel (hd0,4)/vmlinuz BOOT_IMAGE=linux root=/dev/ splash quiet resume=/dev/insys/swap vga=788
initrd (hd0,4)/initrd.img

title linux-nonfb
kernel (hd0,4)/vmlinuz BOOT_IMAGE=linux-nonfb root=/dev/insys/root resume=/dev/insys/swap
initrd (hd0,4)/initrd.img

title failsafe
kernel (hd0,4)/vmlinuz BOOT_IMAGE=failsafe root=/dev/insys/root failsafe
initrd (hd0,4)/initrd.img

title windows
root (hd0,0)
makeactive
chainloader +1

title desktop 3.3.4-1.mga2
kernel (hd0,4)/vmlinuz-3.3.4-desktop-1.mga2 BOOT_IMAGE=desktop_3.3.4-1.mga2 root=/dev/insys/root splash quiet resume=/dev/insys/swap vga=788
initrd (hd0,4)/initrd-3.3.4-desktop-1.mga2.img

title desktop 3.3.6-1.mga2
kernel (hd0,4)/vmlinuz-3.3.6-desktop-1.mga2 BOOT_IMAGE=desktop_3.3.6-1.mga2 root=/dev/ splash quiet resume=/dev/insys/swap vga=788
initrd (hd0,4)/initrd-3.3.6-desktop-1.mga2.img


The new entries "linux" and "desktop 3.3.6-1.mga2" point to /dev/; the other entries are correct because I fixed them before installing kernel 3.3.5.
So it is clear that Mageia will NOT BOOT unless the user picks one of the other entries.
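
A quick way to spot the affected entries is to scan menu.lst for kernel lines whose root= argument stops at /dev. The following is only a rough Python sketch of such a check; `broken_entries` is a hypothetical helper, not something drakboot provides, and it assumes the simple "title" / "kernel" layout shown above.

```python
import re
from typing import List

# Matches the root= argument on a GRUB legacy "kernel" line.
ROOT_ARG = re.compile(r"\broot=(\S*)")


def broken_entries(menu_lst: str) -> List[str]:
    """Return titles of entries whose kernel line has root=/dev or root=/dev/."""
    broken = []
    title = None
    for line in menu_lst.splitlines():
        words = line.split(None, 1)
        if not words:
            continue
        if words[0] == "title":
            title = words[1].strip() if len(words) > 1 else ""
        elif words[0] == "kernel" and title is not None:
            match = ROOT_ARG.search(line)
            if match and match.group(1).rstrip("/") == "/dev":
                broken.append(title)
    return broken


sample = """\
title linux
kernel (hd0,4)/vmlinuz BOOT_IMAGE=linux root=/dev/ splash quiet resume=/dev/insys/swap vga=788
initrd (hd0,4)/initrd.img

title failsafe
kernel (hd0,4)/vmlinuz BOOT_IMAGE=failsafe root=/dev/insys/root failsafe
initrd (hd0,4)/initrd.img
"""
print(broken_entries(sample))  # ['linux']
```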

Severity: normal => critical

Comment 3 AL13N 2012-05-15 09:14:10 CEST
likely, /dev/insys/root should be good, but during our irc session, it seemed that maybe due to encryption at PV level, the lvm wasn't initialized.

IIRC the steps he used to complete this at the rescue level were:

1. [decryption]
2. pvscan
3. vgscan
4. lvscan
5. vgchange -ay
6. vgmknodes
7. now the devices were available
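
The LVM part of the steps above could be scripted roughly as follows. This is only a sketch in Python: the `activate_lvm` helper is made up for illustration, it defaults to a dry run that just lists the commands, and the decryption step is deliberately left out because the exact command used is not recorded here.

```python
import subprocess
from typing import List

# LVM activation steps from the list above, in order. Step 1 (decryption) is
# omitted: the report does not say which command was used for it.
LVM_STEPS: List[List[str]] = [
    ["pvscan"],
    ["vgscan"],
    ["lvscan"],
    ["vgchange", "-ay"],
    ["vgmknodes"],
]


def activate_lvm(dry_run: bool = True) -> List[str]:
    """Run (or, by default, only list) the LVM steps; return the command lines."""
    issued = []
    for argv in LVM_STEPS:
        issued.append(" ".join(argv))
        if not dry_run:
            # Needs root and an already unlocked physical volume.
            subprocess.run(argv, check=True)
    return issued


for command in activate_lvm():
    print(command)
```

With `dry_run=False` the commands are actually executed and the first failure raises `subprocess.CalledProcessError`, so a half-activated volume group is not silently ignored.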

maybe dracut & installer need to be patched to allow for this kind of setup?

CC: (none) => alien

Simple 2012-05-15 15:17:58 CEST

Summary: drakboot unable to attribute correct root entry when using LVM => drakboot unable to attribute correct root entry in grub (menu.lst) when using LVM

Comment 4 Simple 2012-05-18 13:42:15 CEST
I forgot to mention that the LVM is encrypted, and that seems to be the root of the problem.
Comment 5 Colin Guthrie 2012-05-18 14:32:40 CEST
FWIW, dracut likely doesn't need any patching to solve this. I would imagine that dracut will still enable the drives (both crypt and lvm) and that all that would be required to boot would be to edit the command line from grub - no need to use a rescue system etc.

I could, of course, be wrong, but I suspect the bits that detect the root for the grub menu stuff and the way dracut works are quite different.

I should have a VM at home with a similar setup (encrypted lvm) so I should be able to take a look to see if I can reproduce at some point over the weekend.

CC: (none) => mageia

Comment 6 Colin Guthrie 2012-05-19 14:47:07 CEST
OK, I just installed a test system in a VM with a /boot on ext4 and an encrypted LVM for / and swap.

I manually downgraded to an old kernel so I could test upgrades.

I took a VM snapshot and removed my menu.lst and ran drakboot --boot. It showed that there were no entries (obviously) so I added a new one. It correctly guessed my root partition as being /dev/vg-mga/root so I just had to fill in the vmlinuz and initrd.img fields and all was well. It wrote a valid menu.lst and I rebooted to confirm it worked fine.

I then upgraded the kernel via urpmi --auto-update and it too worked without any problems.

This was running drakboot via a console, so I'm now going to test drakboot via an X11 session.
Comment 7 Colin Guthrie 2012-05-19 15:15:15 CEST
OK, tried it via a graphical boot too and no problems there.

So I cannot reproduce at all.

Can you please try to reproduce this problem in a VM and include all the steps needed to reproduce it, including exactly what you did to "lose the grub config" initially? Simply rm'ing it produces an empty config, to which you then have to manually add all the options, including the nonfb, failsafe and named kernel options, which I really suspect you did not do. Perhaps you copied a menu.lst from another machine and then ran drakboot? This might remove the invalid device from root=, and you didn't then go into each option to set it correctly? Any number of things could have gone wrong here.

I suspect that it was this step that was the problem case rather than anything else, and I'm not sure that we can develop tools that protect against users' own changes in this regard.

Anyway, as I've now spent several hours trying and failing to reproduce, if you can work out the *exact* steps to reproduce cleanly in a VM, please reopen, but until we can reproduce the problem, there is nothing we can do.

Status: NEW => RESOLVED
Resolution: (none) => INVALID

Comment 8 Simple 2012-05-19 17:15:59 CEST
I never said this was in a VM, so testing in a VM does not reproduce this bug.
Everything I described was done several times to make sure it was not an isolated case.

This clearly shows that there are important differences between running in a VM and an install directly on the HD.

Status: RESOLVED => REOPENED
Resolution: INVALID => (none)

Comment 9 Simple 2012-05-19 17:22:55 CEST
In the first comment I forgot an important detail, which I mentioned in comment #4: it's an *encrypted* LVM.
Did you encrypt the LVM and then create the groups in it?

But I will install in a VM and reproduce the bug.
Comment 10 Colin Guthrie 2012-05-20 12:07:11 CEST
Yeah I know it is encrypted - we did discuss this on IRC remember! As I stated in my comment when I closed this bug, I set up such an encrypted LVM system in order to try and replicate the problem, but no matter what I tried, I could not reproduce the problem you encountered.

Until there is a clear way for us to replicate the problem this bug is sadly not of any use. When you have a clear set of instructions to reproduce the problem from a fresh install, please reopen it. Until then please keep it closed as with the information and details it currently contains it's not really a valid bug report.

For absolute clarity, I don't doubt there is a problem somewhere, but it may be related to manual intervention rather than through the use of our tools. So the exact steps to reproduce are important here. The fact that I'm closing this bug is nothing personal - just normal bug management.

I hope you manage to find the exact set of steps to reproduce, and I look forward to helping find these and ultimately solving the problem.

Status: REOPENED => RESOLVED
Resolution: (none) => INVALID

Comment 11 Simple 2012-06-10 10:36:48 CEST
This is what I did:

I installed Mageia 2. During the installation, in the partitioning step, I created an LVM, created the VG and LVs, finished the install and rebooted.

I updated to cauldron, and when kernel-desktop-3.4.2-1.mga3-1-1.mga3 was installed, it created these invalid entries in /boot/grub/menu.lst:

title linux
kernel (hd0,4)/vmlinuz BOOT_IMAGE=linux root=/dev/ splash quiet resume=/dev/insys/swap vga=788
initrd (hd0,4)/initrd.img

title desktop 3.4.2-1.mga3
kernel (hd0,4)/vmlinuz-3.4.2-desktop-1.mga3 BOOT_IMAGE=desktop_3.4.2-1.mga3 root=/dev/ splash quiet resume=/dev/insys/swap vga=788
initrd (hd0,4)/initrd-3.4.2-desktop-1.mga3.img

As we can see, there is root=/dev/ instead of root=/dev/insys/root, and this makes it impossible to boot this kernel.
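
As a stopgap, the truncated argument can be put back by hand or with a small script. The Python sketch below rewrites only kernel lines whose root= stops at /dev or /dev/; `fix_root` is a hypothetical helper, the device to substitute has to be supplied by the user, and this is not how drakboot itself generates the entries.

```python
import re

# A root= argument that stops at /dev or /dev/ (followed by a space or end of line).
TRUNCATED_ROOT = re.compile(r"\broot=/dev/?(?=\s|$)")


def fix_root(menu_lst: str, root_device: str) -> str:
    """Replace truncated root= arguments on kernel lines with root_device."""
    fixed = []
    for line in menu_lst.splitlines():
        if line.split(None, 1)[:1] == ["kernel"]:
            # A function replacement keeps any backslashes in root_device literal.
            line = TRUNCATED_ROOT.sub(lambda _m: "root=" + root_device, line)
        fixed.append(line)
    trailing = "\n" if menu_lst.endswith("\n") else ""
    return "\n".join(fixed) + trailing


entry = (
    "title linux\n"
    "kernel (hd0,4)/vmlinuz BOOT_IMAGE=linux root=/dev/ splash quiet "
    "resume=/dev/insys/swap vga=788\n"
    "initrd (hd0,4)/initrd.img\n"
)
print(fix_root(entry, "/dev/insys/root"))
```

Entries that already carry a full device path are left untouched, so the script can be run over the whole file.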


As explained in bug #5780, diskdrake changed the entries in /etc/fstab from:

/dev/insys/root / btrfs noatime 1 1

to:

/dev/mapper/insys-root / btrfs noatime 1 1

and this happens if I change any option in ANY partition.
I don't know if this is related, but I thought it would be better to mention it.

Colin, could you now try following these regular steps?
If for some reason this doesn't happen (which I doubt), please keep the install so that we can do some tests.
Comment 12 Simple 2012-06-14 03:22:07 CEST
This bug can safely be closed as invalid.