| Summary: | 4TB unmounted unpartitioned drive makes installer freeze when bootloader is configured, but also X in installed cauldron when kernel is updated. | ||
|---|---|---|---|
| Product: | Mageia | Reporter: | George Mitchell <george> |
| Component: | RPM Packages | Assignee: | Base system maintainers <basesystem> |
| Status: | RESOLVED OLD | QA Contact: | |
| Severity: | critical | ||
| Priority: | Normal | CC: | davidwhodgins, kde, kernel, marja11, ouaurelien, thierry.vignaud, zen25000 |
| Version: | Cauldron | ||
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Source RPM: | CVE: | ||
| Status comment: | |||
| Attachments: |
lspci output
fdisk output
System log at time of freeze
Syslog output from partially failed kernel upgrade via MCC |
||
|
Description
George Mitchell
2016-12-15 20:56:02 CET
Depending on your tests the summary might need to be adjusted again.
I added the size of your HD, because twice in the last few weeks I heard about problems Mageia users had with 4TB disks, and I heard of no problems with smaller disks.
As last step of a kernel update, bootloader-config is run. That is similar to what happens when you configure your bootloader while installing Mageia.
Apart from Grub2, other tools are called (blkid, os-prober) to probe all available disks and partitions, so that all installed OSs will be found and added to the Grub2 boot menu. So the problem does not seem to be with the installer, but with one of those tools.
Can you please *attach* lspci.txt that is the result of:
lspcidrake -v > lspci.txt
and also attach fdisk.txt after running, as _root_:
fdisk -l /dev/sd? > fdisk.txt
(Both from when that 4TB drive is powered on.)
Also, if you could attach the logs from such a kernel upgrade that froze X, that would be great.
Run, again as root, (after adjusting date & time so that your kernel update is included):
journalctl -a --since="2016-12-11 09:00" --until="2016-12-11 10:00" > log.txt
and attach log.txt to this report.
CC: (none) => kernel, marja11, zen25000
Created attachment 8784 [details]
lspci output
Created attachment 8785 [details]
fdisk output
Created attachment 8786 [details]
System log at time of freeze
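A log like the attached one can be summarized per process before reading it line by line. The following is only a sketch: it assumes journalctl's default short output format (month, day, time, host, then process[pid]:), and count_by_process is a hypothetical helper name, not part of any Mageia tool.

```shell
# Sketch: count matching journal lines per process.
# Assumed line layout (journalctl short format):
#   Dec 14 21:17:38 localhost klauncher[6059]: message text
count_by_process() {
    # $1: extended regular expression to search for; log text comes from stdin
    awk -v pat="$1" '
        $0 ~ pat {
            name = $5                 # fifth field is "process[pid]:"
            sub(/\[.*/, "", name)     # strip the "[pid]:" suffix
            sub(/:$/, "", name)       # strip a bare trailing colon (e.g. "kernel:")
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' | sort -k1,1nr -k2
}
```

With the log.txt file produced by the journalctl command above, a call such as `count_by_process 'X11 connection broke|fatal IO error' < log.txt` would list which processes lost their X connection and how often.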
I am noticing from the log that there is an attempt to mount the partitionless drive as part of the process. That certainly could explain the drive activity light remaining on for an extended period. Mounting a 4TB volume takes a long time on this system as I am CPU constrained with an old dual core Pentium. But it doesn't usually freeze the system up, but who knows? Perhaps I have to try again and leave it to cook for a while and see what happens. Perhaps I just did not wait long enough. I don't see any critical errors on the log.

(In reply to George Mitchell from comment #5)
> I am noticing from the log that there is an attempt to mount the
> partitionless drive as part of the process. That certainly could explain
> the drive activity light remaining on for an extended period. Mounting a
> 4TB volume takes a long time on this system as I am CPU constrained with an
> old dual core Pentium. But it doesn't usually freeze the system up, but who
> knows? Perhaps I have to try again and leave it to cook for a while and see
> what happens. Perhaps I just did not wait long enough. I don't see any
> critical errors on the log.

Well there are 9 messages about
> The X11 connection broke (error 1). Did the X11 server die?

The first thing I see after the disk was mounted, are some plasmashell messages. I don't have the slightest idea whether they're related to X freezing. Even if they would be related, they _cannot_ be the cause because you hit this bug in traditional installer, too.

Dec 14 21:16:39 localhost ghmitch[30024]: 50mounted-tests: debug: btrfs volume 534d18b0-fc56-42e6-bfeb-c63b0f0bdc07 mounted
Dec 14 21:16:41 localhost plasmashell[6714]: QFileInfo::absolutePath: Constructed with empty filename
Dec 14 21:17:08 localhost kernel: BTRFS info (device sdi): disk space caching is enabled
Dec 14 21:17:29 localhost plasmashell[6714]: file:///usr/lib64/qt5/qml/QtQuick/Controls/Button.qml:96: TypeError: Cannot read property of null
Dec 14 21:17:30 localhost plasmashell[6714]: QXcbConnection: XCB error: 2 (BadValue), sequence: 12014, resource id: 81794524, major code: 141 (Unknown), minor code: 3

later there's (skipping some lines here and there):

Dec 14 21:17:38 localhost pulseaudio[5978]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Dec 14 21:17:38 localhost kdeinit5[6058]: kdeinit5: Fatal IO error: client killed
Dec 14 21:17:38 localhost org.a11y.atspi.Registry[6816]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Dec 14 21:17:38 localhost klauncher[6059]: The X11 connection broke (error 1). Did the X11 server die?
Dec 14 21:17:38 localhost pulseaudio[5978]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Dec 14 21:17:38 localhost kdeinit5[6058]: kdeinit5: Fatal IO error: client killed
Dec 14 21:17:38 localhost org.a11y.atspi.Registry[6816]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Dec 14 21:17:38 localhost klauncher[6059]: The X11 connection broke (error 1). Did the X11 server die?

Guessing the problem is at basesystem level, and assigning accordingly
Keywords: NEEDINFO => (none)

Oops, seems I pasted the last log lines twice, sorry about that!

I will admit that I am seeing a lot of instability with the plasma desktop and it is freezing up periodically and randomly. But when it happened with the installer there was no plasma desktop involved. So they are clearly two different problems. I am feeling that neither the installer nor the drakboot nor the package manager should be trying to mount volumes without the users explicit permission. The only reason the volume is being mounted is to try to discover if there are OS's on it. The user should have the option of skipping that step because it may be unnecessary and can lead to trouble in some cases. I will follow up with more comments as I continue to learn more about what is going on with this.

(In reply to George Mitchell from comment #8)
> But when it happened with
> the installer there was no plasma desktop involved. So they are clearly two
> different problems.

Well, in both cases X freezes when grub2 is configured, so I'm not yet convinced they're different issues.

Could you please redo the installation with that disk powered on and unmounted and try, after the freeze, whether you can still switch to tty2 with "Ctrl+Alt+F2" ?

If that succeeds, then please attach a USB key and type

bug

That'll write report.bug to the USB key

If you cannot switch to tty2, then fetch /root/drakx/ddebug.log from the root partition you were installing to, before reusing that partition

You'll probably need to compress report.bug or ddebug.log. Please do that with xz:

xz report.bug

or

xz ddebug.log

and attach report.bug.xz or ddebug.log.xz to this bug report.

(This bug report will be cloned if they are really two different bugs)
CC:
(none) => thierry.vignaud

(In reply to George Mitchell from comment #8)
> I am feeling that neither the installer nor the
> drakboot nor the package manager should be trying to mount volumes without
> the users explicit permission. The only reason the volume is being mounted
> is to try to discover if there are OS's on it. The user should have the
> option of skipping that step because it may be unnecessary and can lead to
> trouble in some cases.

You can file an enhancement request for that.

(In reply to Marja van Waes from comment #10)
> Could you please redo the installation with that disk powered on and
> unmounted and try, after the freeze, whether you can still switch to tty2
> with "Ctrl+Alt+F2" ?

Don't do a hard poweroff if that succeeds. When you're done writing report.bug to the USB key, you can use "Alt + Ctrl + Del" to reboot. Or use Alt + SysRq, keep those two keys pressed, and very slowly type the sequence: R S E I U O to poweroff cleanly.

Given that mkfs can be run on a device, rather than a partition, I'm curious, is the drive actually blank or has it been formatted previously, without a partition table? I.e. does
dd if=/dev/sdi bs=512 count=1|od -x
show anything other than zeroes for the content?
CC:
(none) => davidwhodgins

(In reply to Marja van Waes from comment #11)
> (In reply to George Mitchell from comment #8)
> > I am feeling that neither the installer nor the
> > drakboot nor the package manager should be trying to mount volumes without
> > the users explicit permission. The only reason the volume is being mounted
> > is to try to discover if there are OS's on it. The user should have the
> > option of skipping that step because it may be unnecessary and can lead to
> > trouble in some cases.
>
> You can file an enhancement request for that.

Such option was actually already added I believe
https://bugs.mageia.org/show_bug.cgi?id=18538

Whoa. That is a huge request. This installation took me three or four days to do and I really don't have the time to do it again. The network install is tedious because the Cauldron repository is constantly being updated and the installation breaks every time it hits an updated package and has to be manually restarted again with the existing installed packages being upgraded first. But what I WOULD like to do is install a very simple desktop like xfce4 and try the kernel upgrade from there and see what happens. When I do that, I will attach the syslog data just as I did previously with Plasma out of the picture.

(In reply to Marja van Waes from comment #10)
> (In reply to George Mitchell from comment #8)
> > But when it happened with
> > the installer there was no plasma desktop involved. So they are clearly two
> > different problems.
>
> Well, in both cases X freezes when grub2 is configured, so I'm not yet
> convinced they're different issues.
>
> Could you please redo the installation with that disk powered on and
> unmounted and try, after the freeze, whether you can still switch to tty2
> with "Ctrl+Alt+F2" ?
>
> If that succeeds, then please attach a USB key and type
>
> bug
>
> That'll write report.bug to the USB key
>
>
> If you cannot switch to tty2, then fetch /root/drakx/ddebug.log from the
> root partition you were installing to, before reusing that partition
>
> You'll probably need to compress report.bug or ddebug.log. Please do that
> with xz:
>
> xz report.bug
>
> or
>
> xz ddebug.log
>
>
> and attach report.bug.xz or ddebug.log.xz to this bug report.
>
> (This bug report will be cloned if they are really two different bugs)

(In reply to Dave Hodgins from comment #13)
> Given that mkfs can be run on a device, rather than a partition, I'm curious,
> is the drive actually blank or has it been formatted previously, without a
> partition table? I.e. does
> dd if=/dev/sdi bs=512 count=1|od -x
> show anything other than zeroes for the content?

The drive is formatted btrfs with multiple btrfs volumes. Mounting the drive takes forever, but mounting individual volumes on the drive is significantly faster. I rarely mount the whole drive.

(In reply to George Mitchell from comment #16)
> (In reply to Dave Hodgins from comment #13)
> > Given that mkfs can be run on a device, rather than a partition, I'm curious,
> > is the drive actually blank or has it been formatted previously, without a
> > partition table? I.e. does
> > dd if=/dev/sdi bs=512 count=1|od -x
> > show anything other than zeroes for the content?
>
> The drive is formatted btrfs with multiple btrfs volumes. Mounting the
> drive takes forever, but mounting individual volumes on the drive is
> significantly faster. I rarely mount the whole drive.

And blkid should be able to tell the installer that.

(In reply to Pascal Terjan from comment #14)
> (In reply to Marja van Waes from comment #11)
> > (In reply to George Mitchell from comment #8)
> > > I am feeling that neither the installer nor the
> > > drakboot nor the package manager should be trying to mount volumes without
> > > the users explicit permission. The only reason the volume is being mounted
> > > is to try to discover if there are OS's on it. The user should have the
> > > option of skipping that step because it may be unnecessary and can lead to
> > > trouble in some cases.
> >
> > You can file an enhancement request for that.
>
> Such option was actually already added I believe
> https://bugs.mageia.org/show_bug.cgi?id=18538

Is that the check box determining whether os-prober gets run or not? If so, I did not understand at the time that running os-prober would attempt to mount unmounted drives that were not partitioned, so my bad on that one if that is the case. Now I know. What I am really wondering and want to sort out at this point is whether this was really a malfunction or whether the system seemed frozen simply because the mount process was eating all my CPU time along with Plasma desktop. I know that mounting that 4TB volume in one chunk is extremely CPU intensive on my system.

Created attachment 8811 [details]
Syslog output from partially failed kernel upgrade via MCC
Today a new kernel upgrade came out for Cauldron. I did the install via MCC on XFCE4 rather than buggy Plasma. It finally completed without hanging but took forever. On XFCE4 it requested specific permission to mount volumes, which I provided. Everything went normally until it hit the unpartitioned 4TB drive. At that point everything seemed to stop, but I just waited and waited. The MCC installer finally came back to life and completed normally with a message to reboot for the new kernel. BUT os-prober continued to be hung on /dev/sdi, the unpartitioned drive, and just kept going until I finally killed it manually and shut the system down. The attached document contains the whole syslog output for the complete time period.
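On the earlier question of whether /dev/sdi is blank: btrfs does not keep its signature in the first 512-byte sector, so the dd|od check from comment #13 may show only zeroes even on a formatted whole-disk volume. The btrfs magic string "_BHRfS_M" sits at byte offset 65600 (the primary superblock at 64 KiB, plus 0x40). Below is a minimal sketch of a check that reads just those bytes without mounting anything; has_btrfs_magic is a hypothetical helper name, and reading a real device node needs root.

```shell
# Sketch: test for a btrfs signature on a device node or image file
# without mounting it. Only 8 bytes are read.
has_btrfs_magic() {
    # $1: path to a block device (e.g. /dev/sdi) or an image file
    # skip=65600 -> 64 KiB superblock offset + 0x40 magic offset
    # tr removes NUL bytes so the shell can hold the result in a variable
    magic=$(dd if="$1" bs=1 skip=65600 count=8 2>/dev/null | tr -d '\000')
    [ "$magic" = "_BHRfS_M" ]
}

# Example use (as root), kept as a comment so nothing touches a real disk:
#   if has_btrfs_magic /dev/sdi; then echo "btrfs signature present"; fi
```

This only shows that a filesystem signature is present. It says nothing about why mounting the volume or running os-prober on it takes so long.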
os-prober should not fail on an unpartitioned drive. However, it has been 4 years since the last comment. We are sorry, but we no longer maintain this version of Mageia. Please upgrade to the latest version and reopen this bug against that version if this bug exists there. As a result we are setting this bug to RESOLVED:OLD
Resolution: (none) => OLD |