The latest version of dracut uses information from the udev metadata to determine whether certain modules are needed. This includes detecting LVM, RAID and some btrfs setups too. This presents a chicken-and-egg scenario for upgrades. As our previous mkinitrd system started LVM and RAID during early boot, before udev was started, various bits of metadata never make it into the udev database. Because of this, if you take a Mageia 1 instance (booted with mkinitrd) and then generate a dracut initramfs, it will NOT include the needed LVM or RAID support. Previous versions of dracut did not use udevadm; they used more direct probing mechanisms and thus did not suffer from these problems.
Status: NEW => ASSIGNED
Priority: Normal => release_blocker
I encounter the same problem - no LVM on early boot - with a _fresh_ install of Mageia 2 beta 1 from DVD. The mapping of partitions is:
sda1  /boot
lvm   /
lvm   /home
lvm   swap
The symptom is that, at boot after a successful installation, dracut falls to the debug shell because it can't mount /. No lvm command is available from this debug shell. I wonder if the creation of the dracut initramfs during a fresh installation suffers from the same problem as the upgrade from mga1 to mga2 that you mention, Colin.
CC: (none) => farfouille64
When sorting out how to fix this bug, keep in mind that rescue CDs may be old and use devfs rather than udev. In my opinion, none of the various module.setup scripts should assume udev is running. My primary "rescue cd" is an old Knoppix 5 CD that uses devfs.
CC: (none) => davidwhodgins
OK, so the upstream recommendation is to generate a non-hostonly initramfs in these situations. The image will be far bigger (about 25 megs in my tests!) but it should be a safer option. I've added a hack that detects whether the current boot is from dracut (via the existence of the /run/initramfs folder - if it exists, we've booted with dracut) and, if it is not, disables hostonly mode even if it's specified in the config or on the command line. There is an env var to override this behaviour should it not be required, e.g. by the installer (as opposed to a "live" urpmi upgrade).
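For readers following along, the check described above can be sketched roughly as below. This is only an illustration, not the actual patch: the function name `hostonly_allowed` and the override variable `DRACUT_FORCE_HOSTONLY` are made-up names, and the only detail taken from this report is the use of /run/initramfs as the "booted with dracut" marker.

```shell
#!/bin/sh
# Sketch only: decide whether building a hostonly image is reasonable.
# $1: marker directory; defaults to /run/initramfs, which exists when the
#     running system was booted from a dracut initramfs.
# DRACUT_FORCE_HOSTONLY: hypothetical override variable (assumed name).
hostonly_allowed() {
    marker="${1:-/run/initramfs}"
    # An explicit override wins, e.g. when called from an installer.
    if [ -n "${DRACUT_FORCE_HOSTONLY:-}" ]; then
        return 0
    fi
    # Otherwise hostonly is only trusted after a dracut-based boot.
    [ -d "$marker" ]
}
```

A caller could then choose between dracut's `-H` (hostonly) and `-N` (no-hostonly) options depending on the return value of `hostonly_allowed`.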
@François, I would have thought that the installer should work without this issue, so perhaps there was another problem to blame (although I wouldn't rule out udev not having the necessary metadata about newly created LVMs). Can I ask, did you reuse an existing LVM in your install or did you repartition (I know it was a fresh install, but reusing+formatting existing drives could still be considered "fresh installs" I guess).
The LVM already existed before the fresh installation. During the installation, at the disk preference screen, I chose the last option (customized, I think), then I reused existing partitions:
sda1  /boot  format:yes
lvm   /      format:yes
lvm   /home  format:no
lvm   swap   format:NA
AFAIK I didn't build the LVM during the installation process (I don't even know if that is possible). There are also 2 NTFS partitions at the end of the HD (Windows 7) that I ignored during the mga2b1 installation. My intent is to install mga2 on this computer when it is released. Meanwhile, if you need me to test something, don't hesitate; I can reformat or reinstall it.
@François: I think there could be a subtle difference between two different paths during installation here. I think that if the LVM volumes were created at install time, things would work OK with a hostonly initrd (the data would get into the udev database), but if they already exist and are reused, it will not (due to the way we boot, the LVM volumes are activated before udev is run and thus the metadata isn't available via the udev database). As things stand right now (with cauldron), the installer should produce a non-hostonly initrd during install. It'll be larger, but it should be safer. After a proper boot, a more streamlined initrd could be generated (sudo dracut -f). When the kernel is next updated, a new hostonly (smaller) initrd will also be produced for it. This could be enough of a solution to close this bug, but I'll leave it open for testing etc.
@Colin: I've tested again with bare mga2b2 DVD and it is ok ; lvm partitions are detected at reboot (after install) by dracut. Thank you
So can we close this bug?
CC: (none) => sander.lepik
I'd like to keep it open for now. There is still a question as to whether or not the installer can generate a hostonly initrd. I need to do some testing on various setups first though. Hopefully I'll get round to that this weekend.
It still doesn't run the "lvm vgchange -a y" command before trying to mount /usr when using a non-hostonly initrd, which results in dropping to an unusable shell after pivot. The system has to be rebooted with rdbreak=pre-pivot and the command run manually.
CC: (none) => ennael1
Blocks: (none) => 4298
Just to update this bug with some progress:
1. Installer will now generate a hostonly initrd.
2. I have added a change to dracut and am about to commit a change to installer that ensures that any swap partitions on LVM etc are properly activated.
The issue of the lvm vgchange not being run on a non-hostonly image is still present, but I'll try and look at that next (as it will affect urpmi upgrades).
OK, I have just installed a test system.
sda1: / ext4
sda5: LVM (vg-mga)
vg-mga/swap: swap
vg-mga/usr: /usr ext4
The hostonly initrd generated in the installer worked fine and allowed smooth boot. (Note it was modified such that installer dropped a config file to activate the swap properly - patch submitted, but if you try and reproduce this test you may get a timeout waiting for the swap. You can ignore this problem or drop the resume= command line entry.) I regenerated a non-hostonly initrd and rebooted. It also worked fine. With all this in mind, have I missed any remaining problems or is this all finally handled?
Didn't work on my system. With / on a regular partition and /usr on an LVM logical volume, the non-hostonly initrd generated when I installed the latest update in a chroot from Mageia 1 still fails to activate the logical volume. Since the device specified for /usr in /etc/fstab is not found, dracut skips trying to mount /usr, and then locks up after the root pivot. I had to use rd.break=pre-pivot and run the "lvm vgchange -a y" command.
grep usr /etc/fstab
/dev/mapper/91-usr /usr ext4 defaults,relatime,user_xattr 1 2
That's interesting as I tested this exact setup. I'll take another look as I must have mis-tested :s
Oh sorry, I misread. I was thinking of the installer... yeah, this is still a bit of an issue on upgrade. Still looking at it.
OK, I just installed a VM of Mageia 1 with the following layout:
/     ext4
/usr  LVM + btrfs
swap  LVM + swap
I did a urpmi-based upgrade. I installed rpm-helper first and then just did a urpmi --auto-select --auto for the rest of the upgrade. Dracut generated a non-hostonly initrd as expected during this install and I rebooted and it all worked happily. I then regenerated the host-only initrd after booting and it also worked happily. I am about to reset the snapshot and try an installer based upgrade which should generate a hostonly initrd.
OK, I have now done an installer based upgrade. It generated a hostonly initrd as expected which worked well. I'm not sure what more I can do here as all my tests have passed.
I just installed all updates, ran dracut in a chroot, and confirmed it still fails to activate the LVM volume groups. There are two problems here. The first is not activating the volume groups, which in my opinion would best be fixed by testing whether lvm is in the initrd and, if so, running lvm vgchange -a y before trying to mount anything. The second problem is that if a needed filesystem such as /usr is in /etc/fstab and the device is not found, instead of ignoring it, dracut should drop to an emergency shell rather than trying to pivot to a root where it panics due to /usr not being mounted.
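To make the second point concrete, here is a minimal sketch of the kind of check meant, assuming a plain fstab whose first field is a /dev/... path. The function name `missing_fstab_devices` is invented for illustration and is not part of dracut; entries using UUID= or LABEL= are simply skipped in this sketch.

```shell
#!/bin/sh
# Sketch only: print the mount point of every fstab entry whose /dev/...
# source does not currently exist.
# $1: path to the fstab to inspect (defaults to /etc/fstab).
missing_fstab_devices() {
    fstab="${1:-/etc/fstab}"
    # The "|| [ -n "$dev" ]" part handles a last line without a newline.
    while read -r dev mnt _rest || [ -n "$dev" ]; do
        case "$dev" in
            # Comments, blank lines, UUID= and LABEL= entries do not match.
            /dev/*) [ -e "$dev" ] || printf '%s\n' "$mnt" ;;
        esac
    done < "$fstab"
}
```

An early-boot hook could run `lvm vgchange -a y` first (when the lvm binary is present in the image) and then refuse to pivot if `missing_fstab_devices` still lists /usr.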
Ah. Just noticed. My swap is not on lvm. If swap is on lvm, that may be why the volumes are activated for you, but not for me.
(In reply to comment #19)
> Ah. Just noticed. My swap is not on lvm. If swap is on lvm, that may
> be why the volumes are activated for you, but not for me.

That could indeed be the difference (although I'm pretty certain that I checked and there were no cmdline.d files in my non-hostonly initrd so that would rule that out). I'll go and do yet more tests.
OK, while I'm not 100% sure why, I was able to reproduce the problem when I removed the swap partition. I've added a patch that, in non-hostonly mode, ensures that the lvm_scan script is run which happens during the check_finished function at the start of the initqueue processing. http://svnweb.mageia.org/packages/cauldron/dracut/current/SOURCES/0511-lvm-Ensure-LVM-is-initialised-in-non-hostonly-mode.patch?revision=234111&view=markup I also agree that it should give you a shell if it cannot mount /usr, I'll see if I can do this.
Created attachment 2131
dmesg from non-hostonly initrd boot of cauldron

As shown by the attached dmesg, /usr is still not getting mounted prior to the root pivot; however, as bug 4372 has now been fixed, it is mounted OK by systemd after the root pivot. As far as I can see, this isn't causing any problems though.
Yeah, I messed up the previous patch :s I finally worked out why this is all so confusing and why I couldn't reproduce it in my tests. It's technically nothing to do with swap on LVM or anything; it's to do with the lack of a resume= argument when you boot. If you had a resume=/dev/foo (doesn't really matter what), then LVM would be activated and mounted. This is ultimately because the function wait_for_dev is called, which ensures that the whole initqueue is run etc. So my working fix http://svnweb.mageia.org/packages/cauldron/dracut/current/SOURCES/0511-lvm-Ensure-LVM-is-initialised-in-non-hostonly-mode.patch?revision=234159&view=markup simply waits for a fake device and then cancels the wait after a timeout. There may very well be neater ways to achieve the same result, but I'm tired and I just want to fix this bug now :p It worked on my test machine without any resume= argument, so I'm optimistically going to close this bug now! If you disagree, please reopen with any extra info you can give :D Thanks for your patience.
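The "wait, then give up after a timeout" idea can be illustrated in isolation as below. This is a generic sketch and not dracut's wait_for_dev or the contents of the patch; the function name `wait_for_path` and its default timeout are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: poll for a path once per second and stop after a timeout.
# $1: path to wait for
# $2: timeout in seconds (defaults to 5, an arbitrary value)
wait_for_path() {
    path="$1"
    timeout="${2:-5}"
    waited=0
    while [ ! -e "$path" ]; do
        # Cancel the wait once the timeout has been reached.
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Waiting on a path that never appears returns 1 after the timeout, which mirrors the cancelled wait for the fake device described above.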
Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED
Hi, this bug is not fixed. It can be reproduced easily. Create a virtual machine (my host is Fedora 16, kvm, x86_64) and do a fresh install with the Mageia 2 DVD. Partition the hard disk with the following schema:
/boot  ext4                500M
/      vg_system/lv_root   7000M
swap   vg_system/lv_swap   1024M
After the installation has finished successfully, reboot and you get this error:
~~~
dracut Warning: Cancelling resume operation. Device not found.
udev[79]: sender uid=-1, message ignored.
dracut Warning: Unable to process initqueue
dracut Warning: "/dev/vg_system/lv_root" does not exist
Dropping to debug shell.
sh: 0: can't access tty; job control turned off
dracut:/#
~~~
I've tried removing the resume=<path>, rhgb and quiet parameters and booting; it failed the same way. I then tried removing the above parameters and adding rd_LVM_LV=vg_system/lv_root rd_LVM_LV=vg_system/lv_swap; that also failed. Is LVM enabled in the kernel configuration?
BR, Charles
Status: RESOLVED => REOPENED
CC: (none) => phoenix.guo
Resolution: FIXED => (none)
@Charles: I'm pretty sure this is not actually this bug specifically, but rather another bug in the mageia-theme package which generates the initrd when it is installed (which is before several other essential packages are installed) and thus the initrd is generated without all the necessary stuff (i.e. the lvm command). The quick work around is when the install has finished and you're being prompted for the root password, switch to the tty, and do a "rm -f /mnt/boot/initrd-3*.img", then flip back to the graphical display and complete the installation. I've got an updated theme package that will hopefully be pushed as an update soon. Sadly this doesn't really help with the DVD installs unless you are connected to the 'net when installing and thus can install the "update" as the first package. Please let me know if this work around works for you.
Hi Colin,
Thanks for your explanation. I have confirmed that the problem is in the initramfs. Here is a workaround for those who do not want to reinstall their system.
Procedure:
----------
1. Boot with your installation DVD and enter rescue mode by selecting "Rescue System" in the boot menu.
2. In the 'choose action' screen, select the "Mount your partitions under /mnt" item, and select 'Ok'.
3. You will see information on the system mounting operation. When it prompts "<Press Enter to return to Rescue menu>", press Enter.
4. Select the "Go to console" item and select 'Ok', and you will be dropped to a shell.
5. Make /mnt the root directory:
   # chroot /mnt
6. Change directory to /boot:
   # cd /boot
7. Rename the old initrd file:
   # mv initrd-3.3.6-desktop-2.mga2.img initrd-3.3.6-desktop-2.mga2.img.bad
8. Rebuild the initramfs:
   # dracut initrd-$(uname -r).img $(uname -r)
9. Exit the chroot:
   # exit
10. Reboot the system and see if it works:
   # reboot
I also have some screenshots for this rescue guide, but I don't know how to host them as I'm a newbie here. Hope this can help others. BTW, does Mageia need testers? How do I apply?
BR, Charles
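One caveat on step 8: inside a chroot, `uname -r` reports the kernel of the rescue system that is currently running, which is not necessarily the kernel installed on disk. A hedged alternative, assuming the installed kernel's modules live under /lib/modules, is to read the version from there. The helper name `installed_kernels` is made up for this sketch.

```shell
#!/bin/sh
# Sketch only: list the kernel versions that have a modules directory.
# $1: root of the installed system (defaults to /, i.e. inside the chroot).
installed_kernels() {
    root="${1:-/}"
    for d in "$root"/lib/modules/*/; do
        # If nothing matches, the glob stays literal and is skipped here.
        [ -d "$d" ] && basename "$d"
    done
}
```

Inside the chroot, one of the printed versions could then be passed by hand, for example `dracut -f /boot/initrd-3.3.6-desktop-2.mga2.img 3.3.6-desktop-2.mga2` for the kernel named in step 7.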
That doesn't work. I've just carried out the operations described in comment 26 on a Mageia 2 i586 system:
sda1                  /boot
vgmageia/lvrootmga32  /
vgmageia/lvusrmga32   /usr
vgmageia/lvhomemga32  /home
I wished to rename the LVs by replacing "mga32" with "mga2-32". I chrooted from a 32-bit Fedora 16 and performed exactly the same operations as in comment 26, using dracut. I couldn't reboot my Mageia: the splash gave ".../dev/mageia/lvrootmga2-32 doesn't exist..." and so on.
Best regards
Alder
CC: (none) => alainderaedt
Any status about that bug?
As stated above, I'd *really* rather not use this bug. It's old and was for mga2 and whether this is the same or a similar problem or not, I'd much prefer to use a new bug for mga3 for tracking purposes. So closing this bug. If it is still an issue please reopen it and make sure it *doesn't* block the same bug as this one which was an MGA2 release tracking bug. Resolving as FIXED because it was fixed in mga2.
Resolution: (none) => FIXED
Status: REOPENED => RESOLVED