| Summary: | after a successful upgrade from mageia7 to mageia8, the system can't boot | ||
|---|---|---|---|
| Product: | Mageia | Reporter: | peter lawford <petlaw726> |
| Component: | Installer | Assignee: | Mageia Bug Squad <bugsquad> |
| Status: | RESOLVED OLD | QA Contact: | |
| Severity: | major | ||
| Priority: | Normal | CC: | davidwhodgins, fri, kernel, ouaurelien |
| Version: | 8 | ||
| Target Milestone: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Source RPM: | CVE: | ||
| Status comment: | |||
| Attachments: | screenshot; other screenshot; as required; return of lsinitrd /boot/initrd-5.10.20-desktop-2.mga8.img; return of lsinitrd /boot/initrd-5.10.27-server-1.mga8.img (dracut); lsinitrd /boot/initrd-5.10.27-desktop-1.mga8.img (not dracut) | ||

Created attachment 12459 [details]
screenshot
Created attachment 12460 [details]
other screenshot
We've done many tests that worked, and some that failed, as noted in https://bugs.mageia.org/showdependencytree.cgi?id=28393&hide_resolved=1; those failures are being worked on.

At that screen, please log in using the root password, then run "journalctl --no-hostname -b|grep -v 'audit:>journal.txt", and attach that journal.txt file to this bug report.

CC: (none) => davidwhodgins

Dave's keyboard neglected to type one '. Correct:

journalctl --no-hostname -b|grep -v 'audit:'>journal.txt

(i.e. the journal from the current boot, without some details)

CC: (none) => fri

Created attachment 12461 [details]
as required
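
The corrected pipeline above saves the current boot's journal without audit records. As a rough sketch of how the saved file could then be narrowed down to the lines that matter for a failed mount, the helper below drops audit lines and keeps systemd timeout and dependency failures. The name `journal_failures` and the chosen patterns are assumptions made for illustration; they are not part of journalctl or any Mageia tool.

```shell
#!/bin/sh
# Hypothetical helper: print timeout and dependency failures from a saved journal.
# $1 is a text file such as the journal.txt requested in this report.
journal_failures() {
    # Drop audit records first, then keep lines that report a timeout
    # or a failed dependency.
    grep -v 'audit:' "$1" | grep -E 'timed out|Timed out waiting|Dependency failed'
}
```

Possible usage: `journal_failures journal.txt`.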
(In reply to Morgan Leijström from comment #4)
> Daves keyboard neglected to type one '. Correct:
>
> journalctl --no-hostname -b|grep -v 'audit:'>journal.txt
>
> ( i.e journal from current boot without some details )

Here is the attached journal.txt.

Comment on attachment 12461 [details]
as required
mars 14 13:14:23 systemd[1]: dev-vgmageia-lvhomemga6\x2d64.device: Job dev-vgmageia-lvhomemga6\x2d64.device/start timed out.
mars 14 13:14:23 systemd[1]: Timed out waiting for device /dev/vgmageia/lvhomemga6-64.
mars 14 13:14:23 systemd[1]: Dependency failed for /home.
mars 14 13:14:23 systemd[1]: Dependency failed for Local File Systems.
mars 14 13:14:23 systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
mars 14 13:14:23 systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
Above are the relevant lines. Thanks for reporting this.
The system can't find your /home partition on /dev/vgmageia/lvhomemga6-64.
Also, there is this:
mars 14 13:13:02 kernel: EDAC sbridge: Seeking for: PCI ID 8086:6faf
mars 14 13:13:02 kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
mars 14 13:13:02 kernel: EDAC sbridge: Couldn't find mci handler
mars 14 13:13:02 kernel: EDAC sbridge: Failed to register device with error -19.
It seems the kernel can't find a particular piece of hardware.
Does a fully updated Mageia 7 system work well?

CC: (none) => ouaurelien

Adding the kernel team to the CC list due to the Error Detection and Correction (EDAC) errors.

CC: (none) => kernel

It may be interesting to try booting a Live USB. Make it with persistence, so the kernel etc. can be updated.

Note that rebooting a live ISO with persistence after updating the kernel will not use the updated kernel, as the persistence file system is not opened until after the kernel has started.

(In reply to Aurelien Oudelet from comment #7)
> Comment on attachment 12461 [details]
> as required
>
> mars 14 13:14:23 systemd[1]: dev-vgmageia-lvhomemga6\x2d64.device: Job
> dev-vgmageia-lvhomemga6\x2d64.device/start timed out.
> mars 14 13:14:23 systemd[1]: Timed out waiting for device
> /dev/vgmageia/lvhomemga6-64.
> mars 14 13:14:23 systemd[1]: Dependency failed for /home.
> mars 14 13:14:23 systemd[1]: Dependency failed for Local File Systems.
> mars 14 13:14:23 systemd[1]: local-fs.target: Job local-fs.target/start
> failed with result 'dependency'.
> mars 14 13:14:23 systemd[1]: local-fs.target: Triggering OnFailure=
> dependencies.
>
> Above are relevent lines. Thanks reporting this.
> The system can't find your /home partition on /dev/vgmageia/lvhomemga6-64.
>
> Also, there is this:
> mars 14 13:13:02 kernel: EDAC sbridge: Seeking for: PCI ID 8086:6faf
> mars 14 13:13:02 kernel: EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has
> DIMMs, but ECC is disabled
> mars 14 13:13:02 kernel: EDAC sbridge: Couldn't find mci handler
> mars 14 13:13:02 kernel: EDAC sbridge: Failed to register device with error
> -19.
>
> Seems kernel can't find a particular hardware.
>
> Does a fully updated Mageia 7 system worked well?

Yes, but it needs time (about 1 min 15 s) to find the /home partition. My systems ran for 15 years on old hardware (Gigabyte motherboard, Intel X58 chipset, LGA1366 socket, Core i7 960 CPU, NVIDIA 9800 GTX+ graphics). I recently replaced it with hardware that is more modern, but not up to date: Asus ROG motherboard, Intel X99 chipset, socket 2011-3, Core i7 6800, 64 GB DDR4 RAM, NVIDIA GTX 1060 graphics. It is since the hardware changed that the kernel needs time to find the /home partition.

On the old hardware I never saw a problem with EDAC sbridge at boot (honestly, I don't know what it is), but now it is displayed at each boot. Nevertheless, all my Mageia 7 systems (I have 3, all updated) boot successfully: it takes a while (2 to 3 min), but they go all the way.

Ignore the EDAC errors. They just mean the kernel detects hardware that technically should support ECC, but it's not really enabled (Intel market segmentation), and no ECC memory is installed.

The real bug is why LVM is not activating soon enough.

Maybe using the "rootdelay=" option can help slow the initial boot a bit, so LVM has time to initialize properly.

(In reply to Thomas Backlund from comment #12)
> ignore the EDAC errors...
> it just means kernel detects hw that technically should support it but it's
> not really enabled (Intel market segmentation)... and no ecc memory installed

Thanks for the explanation about EDAC.

> the real bug is why lvm is not properly activating soon enough...
>
> maybe using "rootdelay=" option can help slow initial boot a bit so lvm has
> time to properly init...

Is it possible to modify "rootdelay"? The point is that, once booted, my mga7 systems (and, I hope, soon mga8) work fine; I am willing to accept slow boots if I can migrate to mga8.

(In reply to Thomas Backlund from comment #12)
> ignore the EDAC errors...
> it just means kernel detects hw that technically should support it but it's
> not really enabled (Intel market segmentation)... and no ecc memory installed
>
> the real bug is why lvm is not properly activating soon enough...
>
> maybe using "rootdelay=" option can help slow initial boot a bit so lvm has
> time to properly init...

Do you think that running dracut with the option --add "lvm mdraid" could fix the problem?

The need should be autodetected. Running "dracut -f" should list all the bits it detects and adds.

So the only important lines from the journal are ...
mars 14 13:12:54
mars 14 13:14:23 systemd[1]: dev-vgmageia-lvhomemga6\x2d64.device: Job dev-vgmageia-lvhomemga6\x2d64.device/start timed out.
mars 14 13:14:23 systemd[1]: Timed out waiting for device /dev/vgmageia/lvhomemga6-64.
mars 14 13:14:23 systemd[1]: Dependency failed for /home.

$ grep DefaultTimeoutStartSec /etc/systemd/system.conf
#DefaultTimeoutStartSec=90s

I'm not sure whether changing DefaultTimeoutStartSec in the system.conf file is enough, but it is worth trying to increase it. I'd also try adding the option x-systemd.mount-timeout=infinity to the entry for /home in /etc/fstab.

I don't think rootdelay will help, as the root filesystem was mounted OK.
mars 14 13:12:54 dracut: Mounted root filesystem /dev/mapper/vgmageia-lvrootmga6--64
mars 14 13:12:54 dracut: Switching root

(In reply to Morgan Leijström from comment #9)
> May be interesting to try to boot a Live USB.
> Make it with persistence, so kernel etc can be updated.

The Live USB boots very quickly (when using USB 3.1), without any problem.

Solved! After running dracut from another system, mga6-64 boots (very quickly) on Mageia 8.

That sounds like we miss something during initrd creation.

Do you have more than one 5.10 series initrd in /boot?

If so, it would be nice to get an lsinitrd of both the slow-booting initrd and the one that boots nicely.

(In reply to Thomas Backlund from comment #20)
> that sounds like we miss something during initrd creation.
>
> do you have more than one 5.10 series initrd in /boot ?
>
> if so it would be nice to get an lsinitrd of both slow-booting initrd and
> the one that boots nicely

No, I have only initrd-5.10.20-<desktop,server>-2.mga8.img

(In reply to peter lawford from comment #21)
> (In reply to Thomas Backlund from comment #20)
> > that sounds like we miss something during initrd creation.
> >
> > do you have more than one 5.10 series initrd in /boot ?
> >
> > if so it would be nice to get an lsinitrd of both slow-booting initrd and
> > the one that boots nicely
>
> no, I have only initrd-5.10.20-<desktop,server>-2.mga8.img

Well, then you have 2 :)

If you only re-created one of them in comment 19, then we should be able to spot what's missing.

If that's the case, please run lsinitrd on both so we can compare them.

(In reply to Thomas Backlund from comment #22)
> (In reply to peter lawford from comment #21)
> > (In reply to Thomas Backlund from comment #20)
> > > that sounds like we miss something during initrd creation.
> > >
> > > do you have more than one 5.10 series initrd in /boot ?
> > >
> > > if so it would be nice to get an lsinitrd of both slow-booting initrd and
> > > the one that boots nicely
> >
> > no, I have only initrd-5.10.20-<desktop,server>-2.mga8.img
>
> well, then you have 2 :)
>
> if you only re-created one of them in comment 19, then we should be able to
> spot what's missing.
>
> if that's the case, please do lsinitrd on both so we can compare them

Unfortunately, too late! A few minutes ago I ran dracut from the system itself (not from another system using chroot), with the --mdadmconf option, on both kernels, because I had noticed in the output of "cat /proc/mdstat" that the numbers of my RAID volumes (/dev/mdxxx) were wrong:

[alain4@mga6-64 ~]$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md126 : active raid5 sdd7[4] sdc7[7] sda7[6] sdb7[5]
      1006239744 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 1/3 pages [4KB], 65536KB chunk

md128 : active raid5 sdg6[2] sdf6[1] sde6[0] sdh6[4]
      4026138624 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/10 pages [0KB], 65536KB chunk

md127 : active raid5 sdc6[2] sdd6[4] sda6[5] sdb6[1]
      157188096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

md122 : active raid5 sdg3[2] sdf3[1] sde3[0] sdh3[4]
      94322688 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

These are the right numbers; before running dracut and rebooting, they were /dev/md<124,125,126,127> (which seems more logical).

In case it is useful, I attach the output of lsinitrd /boot/initrd-5.10.20-desktop-2.mga8.img

Created attachment 12468 [details]
return of lsinitrd /boot/initrd-5.10.20-desktop-2.mga8.img
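
One suggestion earlier in this thread is to add `x-systemd.mount-timeout=infinity` to the `/home` entry in /etc/fstab. A minimal sketch of that edit follows, assuming a standard whitespace-separated fstab line with the options in the fourth field. The helper name `add_fstab_option` is hypothetical, and it prints the modified table to standard output instead of touching the real file, so the result can be reviewed first. As a side note, systemd.mount(5) also documents `x-systemd.device-timeout=`, which governs how long systemd waits for the device itself to appear; trying it here is an assumption of this sketch, not something proposed in the thread.

```shell
#!/bin/sh
# Hypothetical helper: append a mount option to one fstab entry.
# Usage: add_fstab_option FSTAB_FILE MOUNT_POINT OPTION
# Prints the modified table to stdout; the input file is left untouched.
add_fstab_option() {
    awk -v mp="$2" -v opt="$3" '
        # Pass comments and blank or short lines through unchanged.
        /^[ \t]*#/ || NF < 4 { print; next }
        # Add the option to the matching entry unless it is already present.
        # Note: assigning to $4 re-joins the fields with single spaces.
        $2 == mp && index("," $4 ",", "," opt ",") == 0 { $4 = $4 "," opt }
        { print }
    ' "$1"
}
```

Possible usage, working on a copy: `add_fstab_option /etc/fstab /home x-systemd.mount-timeout=infinity > /tmp/fstab.new`, then compare the two files before replacing anything.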
It's the current initrd; lsinitrd /boot/initrd-5.10.20-server-2.mga8.img returns a similar result.

(In reply to Thomas Backlund from comment #22)
> (In reply to peter lawford from comment #21)
> > (In reply to Thomas Backlund from comment #20)
> > > that sounds like we miss something during initrd creation.
> > >
> > > do you have more than one 5.10 series initrd in /boot ?
> > >
> > > if so it would be nice to get an lsinitrd of both slow-booting initrd and
> > > the one that boots nicely
> >
> > no, I have only initrd-5.10.20-<desktop,server>-2.mga8.img
>
> well, then you have 2 :)
>
> if you only re-created one of them in comment 19, then we should be able to
> spot what's missing.
>
> if that's the case, please do lsinitrd on both so we can compare them

If it would really help you: I have to migrate another system from Mageia 7 to 8, and I could run dracut on only one of the 2 kernels, then use lsinitrd to see the difference between the 2 initrds (server and desktop). But this will take a large amount of time, and I won't do it today; please wait a bit.

(In reply to Dave Hodgins from comment #10)
> Note that rebooting a live iso with persistence after updating the kernel
> will not use the updated kernel as the persistence file system is not opened
> until after the kernel has started.

As long as the persistence is not encrypted, the new kernel will get used :)

Since there are insufficient details provided in this report for us to investigate the issue further, and we have not received feedback to the information we have requested above, we will assume the problem was not reproducible, or has been fixed in one of the updates we have released for the reporter's distribution.

Users who have experienced this problem are encouraged to upgrade to the latest update of their distribution, and if this issue turns out to still be reproducible in the latest update, please reopen this bug with additional information.

Closing as OLD.

Resolution: (none) => OLD

(In reply to Thomas Backlund from comment #22)
> (In reply to peter lawford from comment #21)
> > (In reply to Thomas Backlund from comment #20)
> > > that sounds like we miss something during initrd creation.
> > >
> > > do you have more than one 5.10 series initrd in /boot ?
> > >
> > > if so it would be nice to get an lsinitrd of both slow-booting initrd and
> > > the one that boots nicely
> >
> > no, I have only initrd-5.10.20-<desktop,server>-2.mga8.img
>
> well, then you have 2 :)
>
> if you only re-created one of them in comment 19, then we should be able to
> spot what's missing.
>
> if that's the case, please do lsinitrd on both so we can compare them

Hi! I'm back on this bug. Yesterday I upgraded one of my mga7 systems to 8, and I ran dracut only on initrd-5.10.27-server-1.mga8.img, not on initrd-5.10.27-desktop-1.mga8.img. Attached are the 2 lsinitrd outputs, as you wished.

Created attachment 12605 [details]
return of lsinitrd /boot/initrd-5.10.27-server-1.mga8.img (dracut)
Created attachment 12606 [details]
lsinitrd /boot/initrd-5.10.27-desktop-1.mga8.img (not dracut)
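
The comparison requested above (one initrd regenerated with dracut, one not) can be sketched as a diff of the dracut module lists in two saved lsinitrd outputs. This is only an illustration under stated assumptions: it assumes each listing contains a `dracut modules:` heading followed by one module name per line, ending at a line made only of `=` characters, and the function names `dracut_modules` and `modules_only_in_first` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the sorted dracut module list found in a
# saved lsinitrd listing (assumed layout: "dracut modules:" heading, one
# module per line, terminated by a line consisting only of "=").
dracut_modules() {
    awk '
        /^dracut modules:/ { in_list = 1; next }
        in_list && /^=+$/  { in_list = 0 }
        in_list && NF      { print $1 }
    ' "$1" | sort
}

# Print modules present in the first listing but missing from the second.
modules_only_in_first() {
    _mods_a=$(mktemp)
    _mods_b=$(mktemp)
    dracut_modules "$1" > "$_mods_a"
    dracut_modules "$2" > "$_mods_b"
    # comm -23 keeps lines that appear only in the first sorted file.
    comm -23 "$_mods_a" "$_mods_b"
    rm -f "$_mods_a" "$_mods_b"
}
```

Possible usage, with the two attachments saved as text files (file names are placeholders): `modules_only_in_first server-dracut.txt desktop-not-dracut.txt`.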
Description of problem:
I have upgraded one of my Mageia 7 systems to Mageia 8 using the method provided by your migration guide. I successively ran:

1) rpm -qa --queryformat "%{NAME}-%{version}-%{RELEASE}-%{ARCH}\n" |grep i586 |grep devel
   and removed all the 32-bit devel libs found

2) urpmi.removemedia -a
   urpmi.addmedia --distrib --mirrorlist 'http://mirrors.mageia.org/api/mageia.8.$ARCH.list'
   (I omit the intermediate steps)

3) urpmi --auto-update --auto --force --download-all --test
   (/var/cache/urpmi/rpms was mounted on a huge partition > 17GB)
   which returned "installation is possible"

4) urpmi --auto-update --auto --force --download-all
   (the same as above without --test)
   and after a couple of hours, 4302 of the 4303 downloaded rpm packages were installed (only icedtea-web-1.8.2-2.mga8 was not)

I consider that the upgrade was successful, but the system couldn't reboot: everything ran OK up to the "research of peripherals" (device detection) step; after that, it invited me over and over to type Ctrl+D to continue. Attached are 2 screenshots which show what happened.

Hence, migrating from Mageia 7 to Mageia 8 seems not to be possible.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.
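
Step 1 of the description filters the installed package list for 32-bit devel packages. A small sketch of that filter as a reusable function is shown below. The function name `list_i586_devel` is hypothetical, and the anchored patterns (`-i586` at end of line, `-devel-` in the name) are a slightly stricter assumption than the plain `grep i586 | grep devel` used in the report.

```shell
#!/bin/sh
# Hypothetical helper: keep only 32-bit (i586) devel packages.
# Reads "name-version-release-arch" lines, one package per line, on stdin.
list_i586_devel() {
    # -e is used so that patterns starting with "-" are not read as options.
    grep -e '-i586$' | grep -e '-devel-'
}
```

Possible usage: `rpm -qa --queryformat "%{NAME}-%{version}-%{RELEASE}-%{ARCH}\n" | list_i586_devel`.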