Bug 29515 - Mageia Cauldron - X freezes randomly and frequently when heavy programs are running
Status: RESOLVED INVALID
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: x86_64 Linux
Priority: Normal, Severity: normal
Assignee: Kernel and Drivers maintainers
 
Reported: 2021-10-01 22:32 CEST by Régis Imbeault
Modified: 2021-10-10 19:01 CEST

Attachments
The content of Xorg.0.log as of today 2021-10-01 (50.88 KB, text/plain)
2021-10-01 22:32 CEST, Régis Imbeault
Information after the latest failure of X (693.31 KB, text/plain)
2021-10-07 19:02 CEST, Régis Imbeault
Screenshot of Thunderbird bugging right before X failure (139.99 KB, image/png)
2021-10-07 19:18 CEST, Régis Imbeault
Information following a new crash of X (50.57 KB, application/zip)
2021-10-08 00:08 CEST, Régis Imbeault

Description Régis Imbeault 2021-10-01 22:32:59 CEST
Created attachment 12934
The content of Xorg.0.log as of today 2021-10-01

Hello,

Since upgrading my office tower from Mageia 8 (x86-64) to Mageia Cauldron a few months ago, I have been experiencing frequent crashes of the graphical interface. Everything suddenly freezes and becomes unresponsive. Pressing Ctrl+Alt+Backspace twice, pressing Ctrl+Alt+Delete, or trying to switch to a text console does nothing, so I am forced to do a hard reboot.

Since this happens at random, I first suspected a failing RAM module. However, running Memtest86 overnight revealed no problem. The crashes also do not look like a RAM failure: in that case everything would freeze completely, whereas here the graphical interface freezes but any music that is playing (in Audacious or on YouTube) keeps playing.

The freezes seem more prevalent when several 'heavy' programs are running simultaneously: Firefox, Thunderbird, LibreOffice, VirtualBox. But one also happened when only Firefox was running. I should add that on Mageia 8, where I never experienced any such problem, I was using Cinnamon; with Cauldron I had to switch to Xfce because crashes with Cinnamon are just too frequent.

I unfortunately don't know of a way to reproduce the freezes. I just work and I know it will happen eventually. But I do seem to have noticed lately that some kernels appear to be more prone to the crashes than others. I am right now running kernel 5.14.7-desktop-1 for which I have yet to see a crash. But kernel 5.14.8-1 has been unusable for me, and I did experience a crash on kernel 5.14.9-1.

I don't know what information is relevant to you, but for a start, please find attached the content of Xorg.0.log. I will be happy to provide any further information that you might need.

Here is also the output of lspci -v:
00:00.0 Host bridge: Intel Corporation 4 Series Chipset DRAM Controller (rev 03)
	Subsystem: Dell Device 027f
	Flags: bus master, fast devsel, latency 0
	Capabilities: <access denied>

00:01.0 PCI bridge: Intel Corporation 4 Series Chipset PCI Express Root Port (rev 03) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 29
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff [size=4K]
	Memory behind bridge: fa000000-fdefffff [size=63M]
	Prefetchable memory behind bridge: 00000000d0000000-00000000dfffffff [size=256M]
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:03.0 Communication controller: Intel Corporation 4 Series Chipset HECI Controller (rev 03)
	Subsystem: Dell Device 027f
	Flags: bus master, fast devsel, latency 0, IRQ 32
	Memory at f0400800 (64-bit, non-prefetchable) [size=16]
	Capabilities: <access denied>
	Kernel driver in use: mei_me
	Kernel modules: mei_me

00:03.2 IDE interface: Intel Corporation 4 Series Chipset PT IDER Controller (rev 03) (prog-if 85 [PCI native mode-only controller, supports bus mastering])
	Subsystem: Dell Device 027f
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 18
	I/O ports at fe80 [size=8]
	I/O ports at fe90 [size=4]
	I/O ports at fea0 [size=8]
	I/O ports at feb0 [size=4]
	I/O ports at fef0 [size=16]
	Capabilities: <access denied>
	Kernel driver in use: ata_generic
	Kernel modules: pata_acpi, ata_generic

00:03.3 Serial controller: Intel Corporation 4 Series Chipset Serial KT Controller (rev 03) (prog-if 02 [16550])
	Subsystem: Dell Device 027f
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 17
	I/O ports at ec98 [size=8]
	Memory at fdfd8000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: serial

00:19.0 Ethernet controller: Intel Corporation 82567LM-3 Gigabit Network Connection (rev 02)
	Subsystem: Dell Device 027f
	Flags: bus master, fast devsel, latency 0, IRQ 33
	Memory at fdfe0000 (32-bit, non-prefetchable) [size=128K]
	Memory at fdfd9000 (32-bit, non-prefetchable) [size=4K]
	I/O ports at ecc0 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e

00:1a.0 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #4 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 16
	I/O ports at ff20 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci_hcd

00:1a.1 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #5 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 17
	I/O ports at ff00 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci_hcd

00:1a.2 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #6 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 22
	I/O ports at fc00 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci_hcd

00:1a.7 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB2 EHCI Controller #2 (rev 02) (prog-if 20 [EHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 22
	Memory at fdfda000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: <access denied>
	Kernel driver in use: ehci-pci
	Kernel modules: ehci_pci

00:1b.0 Audio device: Intel Corporation 82801JD/DO (ICH10 Family) HD Audio Controller (rev 02)
	Subsystem: Dell Device 027f
	Flags: bus master, fast devsel, latency 0, IRQ 34
	Memory at fdfdc000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

00:1c.0 PCI bridge: Intel Corporation 82801JD/DO (ICH10 Family) PCI Express Port 1 (rev 02) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	I/O behind bridge: 00001000-00001fff [size=4K]
	Memory behind bridge: f9f00000-f9ffffff [size=1M]
	Prefetchable memory behind bridge: 00000000f0000000-00000000f01fffff [size=2M]
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1c.1 PCI bridge: Intel Corporation 82801JD/DO (ICH10 Family) PCI Express Port 2 (rev 02) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: 00002000-00002fff [size=4K]
	Memory behind bridge: f9e00000-f9efffff [size=1M]
	Prefetchable memory behind bridge: 00000000f0200000-00000000f03fffff [size=2M]
	Capabilities: <access denied>
	Kernel driver in use: pcieport

00:1d.0 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 23
	I/O ports at ff80 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci_hcd

00:1d.1 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #2 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 17
	I/O ports at ff60 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci_hcd

00:1d.2 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #3 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 18
	I/O ports at ff40 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: uhci_hcd
	Kernel modules: uhci_hcd

00:1d.7 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB2 EHCI Controller #1 (rev 02) (prog-if 20 [EHCI])
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0, IRQ 23
	Memory at ff980000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: <access denied>
	Kernel driver in use: ehci-pci
	Kernel modules: ehci_pci

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a2) (prog-if 01 [Subtractive decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=32
	I/O behind bridge: [disabled]
	Memory behind bridge: [disabled]
	Prefetchable memory behind bridge: [disabled]
	Capabilities: <access denied>

00:1f.0 ISA bridge: Intel Corporation 82801JD (ICH10D) LPC Interface Controller (rev 02)
	Subsystem: Dell Device 027f
	Flags: bus master, medium devsel, latency 0
	Capabilities: <access denied>
	Kernel driver in use: lpc_ich
	Kernel modules: lpc_ich

00:1f.2 SATA controller: Intel Corporation 82801JD/DO (ICH10 Family) SATA AHCI Controller (rev 02) (prog-if 01 [AHCI 1.0])
	Subsystem: Dell Device 027f
	Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 30
	I/O ports at fe00 [size=8]
	I/O ports at fe10 [size=4]
	I/O ports at fe20 [size=8]
	I/O ports at fe30 [size=4]
	I/O ports at fec0 [size=32]
	Memory at f0400000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: <access denied>
	Kernel driver in use: ahci

00:1f.3 SMBus: Intel Corporation 82801JD/DO (ICH10 Family) SMBus Controller (rev 02)
	Subsystem: Dell Device 027f
	Flags: medium devsel, IRQ 18
	Memory at fdfdb000 (64-bit, non-prefetchable) [size=256]
	I/O ports at ece0 [size=32]
	Kernel driver in use: i801_smbus
	Kernel modules: i2c_i801

01:00.0 VGA compatible controller: NVIDIA Corporation G84 [GeForce 8600 GT] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device 8258
	Flags: bus master, fast devsel, latency 0, IRQ 31
	Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
	I/O ports at dc80 [size=128]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
	Kernel modules: nvidiafb, nouveau
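
For a graphics freeze, the driver bound to the video card is one detail worth pulling out of a listing like this. A minimal sketch, assuming `lspci -v` output in the format shown above; `gpu_driver` is a hypothetical helper name, not an existing tool:

```shell
# gpu_driver (illustrative name): reads `lspci -v` output on stdin and prints
# the kernel driver in use for each VGA compatible controller.
gpu_driver() {
    awk '
        # Unindented lines start a new device; remember whether it is a VGA controller.
        /^[^ \t]/ { in_vga = ($0 ~ /VGA compatible controller/) }
        # Inside a VGA device block, print only the driver name.
        in_vga && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use: /, "")
            print
        }
    '
}
```

It would be run as `lspci -v | gpu_driver`; for the listing above it prints `nouveau`.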

Thank you!

Comment 1 Marja Van Waes 2021-10-05 22:41:01 CEST
(In reply to Régis Imbeault from comment #0)
> I unfortunately don't know of a way to reproduce the freezes. I just work
> and I know it will happen eventually. But I do seem to have noticed lately
> that some kernels appear to be more prone to the crashes than others. I am
> right now running kernel 5.14.7-desktop-1 for which I have yet to see a
> crash. But kernel 5.14.8-1 has been unusable for me, and I did experience a
> crash on kernel 5.14.9-1.
> 


Assigning to the kernel and drivers maintainers. 

Régis, can you please boot the newest kernel that you have now? After it has crashed and you have booted your system again, run as root in a terminal:

   journalctl -b-1 2>&1 | tee crash.txt

Then attach crash.txt to this bug report.
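
The full journal of a boot can be large (one of the logs attached to this report is 693.31 KB). A narrower capture, sketched below, keeps only kernel messages of priority warning or worse from the previous boot. It assumes a systemd journal that is stored persistently, so that the previous boot is still available after a hard reset; `save_kernel_warnings` and the default file name are illustrative and were not requested in this report.

```shell
# save_kernel_warnings (illustrative name): write the previous boot's kernel
# messages of priority "warning" or more severe to a file.
# Assumption: the journal is persistent, so boot -1 still exists.
save_kernel_warnings() {
    out="${1:-kernel-warnings.txt}"
    # -k          kernel messages only
    # -b -1       the boot before the current one
    # -p warning  warning priority and anything more severe
    # --no-pager  plain output, suitable for redirecting to a file
    journalctl -k -b -1 -p warning --no-pager > "$out" 2>&1
}
```

Run as root, for example `save_kernel_warnings crash-kernel.txt`. The full `crash.txt` requested above remains the more complete record.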

Assignee: bugsquad => kernel
Keywords: (none) => NEEDINFO
CC: (none) => marja11

Comment 2 Régis Imbeault 2021-10-07 19:02:09 CEST
Created attachment 12939
Information after the latest failure of X

Hello Marja,

Yes, here is an example that occurred with kernel 5.14.9-desktop-3.mga9. I attached the crash.txt file, as per your instruction.

But I need to provide further details for this last "crash".
1) While I was writing an e-mail in Thunderbird (91.1.2-1.mga9), the lettering gradually became blurry and unreadable. Fortunately, I had time to take a screenshot, which I will attach next to show you what I mean.

2) After I took the screenshot, my session stopped responding. Strangely, I could still move the mouse around, but clicking anywhere did nothing, and the keyboard got no response either. The session was freezing.

3) I waited a little to see when the mouse would stop moving and the freeze would become total. Strangely, that did not happen, but just as I was about to do a hard reboot, X restarted. It turns out that this time it responded to one of the many "Ctrl/Alt/Backspace + Backspace" presses I had made several seconds earlier.

4) Once the new graphical session had started normally, I immediately created the crash.txt file attached here. It was not a reboot, but I hope this can still provide useful information.

Comment 3 Régis Imbeault 2021-10-07 19:18:01 CEST
Created attachment 12940
Screenshot of Thunderbird bugging right before X failure

Here is the screenshot I took just before my session became unresponsive.

This is not necessarily a behaviour I observe each time my session is about to freeze, but I did observe it a couple of times.

I will report again the next time I get a crash that forces me to do a hard reboot.

Many thanks!

Comment 4 Régis Imbeault 2021-10-08 00:08:51 CEST
Created attachment 12941
Information following a new crash of X

Hello,

I am attaching the journalctl output following a new (actual) crash of the graphical interface, still with kernel 5.14.9-desktop-3.mga9.

This time, everything just froze instantly and I was forced to do a hard reboot.
All that I had running when it happened was Firefox (91.2.0-1.mga9) and Thunderbird (91.1.2-1.mga9).
I did observe a new glitch in Thunderbird before the session froze. This time, the whole row of thumbnails (and the thumbnails themselves) became transparent, except for the text in the thumbnails, which stayed the same and was still readable, but on a transparent background (I could not take a screenshot this time).

Thank you!

Comment 5 Marja Van Waes 2021-10-08 19:48:15 CEST
(In reply to Régis Imbeault from comment #4)
> Created attachment 12941
> Information following a new crash of X

Thanks for the feedback!

I'm sorry to tell you that both crash logs show that your BIOS is broken.

Both also say that you should try upgrading to a newer BIOS (though I'm not sure that is possible at all if the BIOS is currently broken).

Oct 07 12:28:52 localhost.localdomain kernel: ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/64 (20210604/tbfadt-564)
Oct 07 12:28:52 localhost.localdomain kernel: DMAR: [Firmware Bug]: Your BIOS is broken; DMAR reported at address fedc1000 returns all ones!
                                              BIOS vendor: Dell Inc.; Ver: A05; Product Version: 
Oct 07 12:28:52 localhost.localdomain kernel: ACPI BIOS Warning (bug): Incorrect checksum in table [TCPA] - 0x00, should be 0x7F (20210604/tbprint-173)
Oct 07 12:28:52 localhost.localdomain kernel: ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
Oct 07 12:28:53 localhost.localdomain kernel: BIOS EDD facility v0.16 2004-Jun-25, 1 devices found
Oct 07 12:29:03 localhost.localdomain kernel: dell-smbios A80593CE-A997-11DA-B012-B622A1EF5492: WMI SMBIOS userspace interface not supported(0), try upgrading to a newer BIOS

So the broken BIOS is very likely the cause of the problems. I'm closing this report as invalid.
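
The firmware complaints quoted above can be located in a saved journal with a simple filter. This is only a sketch: `firmware_lines` is a hypothetical helper, and the pattern covers just the message wording visible in this report, so other firmware-related messages would be missed.

```shell
# firmware_lines (illustrative name): print the kernel lines of a saved
# journal that carry the firmware/BIOS wording quoted in this report.
firmware_lines() {
    # $1 is the saved journal file, for example crash.txt
    grep -E 'kernel: .*(Firmware Bug|BIOS Warning|BIOS is broken)' "$1"
}
```

For example, `firmware_lines crash.txt` lists the matching lines. It only finds them; interpreting them is left to the reader.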

Resolution: (none) => INVALID
Keywords: NEEDINFO => (none)
Status: NEW => RESOLVED

Comment 6 Régis Imbeault 2021-10-08 21:47:19 CEST
Thank you so much for your time and your diagnosis.

I apologize for this false alarm.
I can't help but feel foolish for not having found that out myself and for having bothered you with it.

Kind regards!

Comment 7 Marja Van Waes 2021-10-10 19:01:04 CEST
Don't worry, Régis, that could have happened to me, too ;-)
We're humans, not perfectly well-programmed computers :-)
