Bug 26490 - X-Window system hangs unexpectedly
Summary: X-Window system hangs unexpectedly
Status: RESOLVED OLD
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 7
Hardware: All Linux
Priority: Normal major
Target Milestone: ---
Assignee: Thomas Backlund
 
Reported: 2020-04-16 17:41 CEST by François PELLEGRINI
Modified: 2021-09-07 14:10 CEST


Attachments
System journal (44.12 KB, application/x-xz)
2020-04-18 20:51 CEST, François PELLEGRINI
Another system journal (57.79 KB, application/x-xz)
2020-04-21 22:58 CEST, François PELLEGRINI

Description François PELLEGRINI 2020-04-16 17:41:46 CEST
Description of problem:
Randomly, say 3-4 times a day, the X-window environment hangs (mouse and keyboard shortcuts do not work). Yet, the system seems to go on working (hard disk and network activity LEDs still active). I have to reboot the machine.

Version-Release number of selected component (if applicable):
This has been happening since my upgrade from Mageia 6 to Mageia 7. It never happened with Mageia 6.

How reproducible:
Randomly, say 3-4 times a day.

Steps to Reproduce:
Work as usual.   ;-)
François PELLEGRINI 2020-04-16 17:41:55 CEST

CC: (none) => pelegrin

Comment 1 Lewis Smith 2020-04-16 21:08:01 CEST
Sorry for the trouble you are having.

We need more information. It seems you have just upgraded to Mageia 7. Is the system fully up-to-date ?
Please describe briefly your system, in particular the graphics hardware. It is easy to do:
 $ lspci -v
then post here *just* the section starting "VGA compatible controller".
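
To make it easier to post *just* that section, here is a minimal sketch of a filter. It assumes that `lspci -v` separates device blocks with blank lines; the function name `vga_section` is hypothetical and only used for illustration.

```shell
# Print only the device block(s) whose text mentions a VGA controller.
# Reads `lspci -v` style output on stdin. In awk "paragraph mode"
# (RS set to the empty string) each blank-line-separated block is one record.
vga_section() {
    awk 'BEGIN { RS = "" } /VGA compatible controller/ { print; print "" }'
}

# Typical use on a live system (not run here):
#   lspci -v | vga_section
```

Running `lspci` as root instead of as a normal user should also replace the "Capabilities: <access denied>" line with the actual capability list.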

> mouse and keyboard shortcuts do not work
Please clarify this. Does the mouse simply stop working ? Does *all* keyboard input stop working (you say 'shortcuts') ? In particular, can you do Ctrl/Alt/Fn (2-6) to get to a virtual console ?

Please say what desktop you are using.

Please attach /var/log/Xorg.0.log to this bug. Ideally after a failure, when you will need to go to a virtual console if you can (Ctrl/Alt/F2-6), login as your normal user, then do for example:
 $ cp /var/log/Xorg.0.log ~
which will copy it to your home directory. Logout.
When you are back in action (see below), attach the file ~/Xorg.0.log to the bug.

When the problem happens, please try :
1. If you can, Ctrl/Alt/Fn (2-6) which should show a virtual console ;
then Ctrl/Alt/F1 which should get you back to the graphical screen.
Does this bring it back to life ?

2. "I have to reboot the machine"
If at least this works: Ctrl/Alt/Bksp/Bksp should re-start the X-server, and put you back quickly at the login screen. Not ideal, but much easier than a re-boot.

CC: (none) => lewyssmith

Comment 2 François PELLEGRINI 2020-04-18 10:51:02 CEST
Bonjour Lewis,
Thanks for getting back to me so quickly.
So, to tell you more :

My machine is a 32-bit Thinkpad R400 (oldies but goodies);

I have a daily updated Mageia 7. The bug started to happen from the day I switched to Mageia 6.

lspci -v yields :
[...]
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 20e4
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at f4400000 (64-bit, non-prefetchable) [size=4M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 1800 [size=8]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915
[...]

All interactions are frozen. Mouse, keyboard, etc. No screen switch possible to get back to the command line and kill the beast. Yet, as said, the system seems to be working (HD and network activity LEDs blinking as usual).

I will try the "Ctrl/Alt/Bksp/Bksp" trick and see if it works. I will send you the Xorg.0.log accordingly, if the system survives. :-)

Thanks a lot for your support,

f.p.
Comment 3 Lewis Smith 2020-04-18 20:22:12 CEST
Thanks for the extra info; but you still need to say what desktop.

See also bug 26495 which may be a duplicate of this one.

That reminds me of another bit of evidence you can add after you have re-booted from a freeze. As root:
 # journalctl -ab -1 --no-hostname > journal.txt
This outputs the entire journal from the previous session to file journal.txt
 $ xz journal.txt
compresses it to file journal.txt.xz
Please attach the last compressed journal file to this bug.
Comment 4 François PELLEGRINI 2020-04-18 20:51:27 CEST
Created attachment 11596 [details]
System journal

Oops, sorry. My desktop is Gnome.

%  rpm -qa | grep gnome
gnome-power-manager-3.32.0-2.mga7
libgnome-keyring0-3.12.0-11.mga7
gnome-shell-extensions-common-3.32.1-1.mga7
libgnome-desktop-gir3.0-3.32.2-1.mga7
gnome-settings-daemon-3.32.0-2.mga7
gnome-calculator-3.32.1-2.mga7
gnome-desktop-3.32.2-1.mga7
gnome-online-accounts-3.32.0-2.mga7
libgnome-menu3_0-3.32.0-1.mga7
gnome-video-effects-0.4.3-2.mga7
libgnome-bluetooth13-3.32.1-2.mga7
gnome-shell-extension-topicons-22-2.mga7
gnome-session-3.32.0-2.mga7
gnome-packagekit-common-3.32.0-2.mga7
libproxy-gnome-0.4.15-4.mga7
gnome-shell-extensions-launch-new-instance-3.32.1-1.mga7
libgnome-bluetooth-gir1.0-3.32.1-2.mga7
gnome-shell-extensions-user-theme-3.32.1-1.mga7
gnome-music-3.32.2-1.mga7
gnome-sound-recorder-3.32.0-2.mga7
gnome-clocks-3.32.0-2.mga7
gnome-shell-extensions-apps-menu-3.32.1-1.mga7
gnome-user-docs-3.32.2-1.mga7
gnome-nettool-3.8.1-8.mga7
gnome-bluetooth-3.32.1-2.mga7
libgnome-keyring-i18n-3.12.0-11.mga7
libgnomecanvas2_0-2.30.3-11.mga7
gnome-tweaks-3.32.0-3.mga7
gnome-classic-session-3.32.1-1.mga7
task-gnome-minimal-3.32.0-2.mga7
gnome-themes-extra-3.28-4.mga7
gnome-photos-3.32.0-2.mga7
gnome-screenshot-3.32.0-2.mga7
pinentry-gnome3-1.1.0-5.mga7
libgnome-keyring-gir1.0-3.12.0-11.mga7
libgnome-desktop3_12-3.24.2-1.mga6
libgnomekbd-common-3.26.1-1.mga7
gnome-shell-3.32.1-2.mga7
libgnome-autoar0_0-0.2.3-2.mga7
libgnomekbd8-3.26.1-1.mga7
libgnome-desktop3_17-3.32.2-1.mga7
libgnomecanvas-2.30.3-11.mga7
gnome-color-manager-3.32.0-3.mga7
gnome-keyring-3.31.91-2.mga7
chrome-gnome-shell-10.1-3.mga7
gnome-directory-thumbnailer-0.1.10-4.mga7
gnome-contacts-3.32.1-1.mga7
gnome-shell-extensions-places-menu-3.32.1-1.mga7
gnome-session-bin-3.32.0-2.mga7
gnome-terminal-3.32.2-1.mga7
gnome-epub-thumbnailer-1.5-4.mga7
gnome-control-center-3.32.1-2.mga7
gnome-doc-utils-0.20.10-12.mga7
gnome-weather-3.32.2-1.mga7
gnome-terminal-nautilus-3.32.2-1.mga7
gnome-shell-extensions-window-list-3.32.1-1.mga7
gnome-online-miners-3.30.0-1.mga7
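
The list above contains one package still carrying an `.mga6` release tag (`libgnome-desktop3_12-3.24.2-1.mga6`). As a minimal sketch, such leftovers from an earlier release can be picked out of `rpm -qa` output as follows. The function name `release_leftovers` is hypothetical, and whether this leftover has anything to do with the hangs is not established here.

```shell
# Print package names whose release tag ends in ".mga<N>".
# Reads a package list (one name-version-release per line) on stdin;
# $1 is the Mageia release number to look for, for example 6.
release_leftovers() {
    grep -E "\\.mga${1}\$"
}

# Typical use on a live system (not run here):
#   rpm -qa | release_leftovers 6
```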

Indeed, bug #26495 looks quite similar. His machine, unlike mine, is an x86_64. So the bug seems to be hardware-independent.

Also, my sentence in the previous message is of course wrong. It happened from the day I switched *from* Mageia 6.  :-)

I am sending you the current state of my journal, as it records yesterday's events, including hangs. It contains many offending Gnome error messages, mostly "out of order" events, which shows that something is wrong anyway. There are also some odd mutex messages from the i915 driver. I guess that may already be of use to you.

Regards,

f.p.
Comment 5 Lewis Smith 2020-04-18 21:29:50 CEST
Thank you for the journal.
> my journal, as it records events of yesterday, including hangs
It looks as if you booted at 14:52, and continued working (with a 'sleep' period in the middle) until 22:24. I cannot see any evidence of a re-boot during this period (which anyway would start a new journal). Did the system freeze right at the end?

22:23:58 kernel: INFO: task gnome-shell:23948 blocked for more than 122 seconds

might be significant.
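
Messages of this form come from the kernel's hung-task detector. A minimal sketch for pulling all of them out of a saved journal, assuming the line layout quoted above (text before " kernel: " is treated as the timestamp); the function name `hung_tasks` is hypothetical:

```shell
# List hung-task reports as: <timestamp> <task name> <pid> <seconds>.
# Reads journal text from the files given as arguments, or from stdin.
hung_tasks() {
    sed -n 's/^\(.*\) kernel: INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*$/\1 \2 \3 \4/p' "$@"
}

# Typical use on the compressed journal from comment 3 (not run here):
#   xz -dc journal.txt.xz | hung_tasks
```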
Comment 6 François PELLEGRINI 2020-04-21 22:58:30 CEST
Created attachment 11601 [details]
Another system journal

Hello,

This is another journal extract, from when a hang occurred on 21/4/20, at 22:22.

Several issues :
- CTRL/ALT/BKSP/BKSP does not work. Hence, the problem might be deeper. A loss of USB peripherals?
- Since the system was still functioning, I tried to trigger it by putting the computer on battery and closing the lid. This event was recorded by the system, but caused no change in display (the system did not go to suspend mode).
Regards,
                             f.p.
Comment 7 Lewis Smith 2020-04-22 13:16:32 CEST
Thank you for the new journal and the precision about the associated freeze. Cannot do better than that.

Unsure where to assign this: Gnome or everyone. Start with the latter.

Assignee: bugsquad => pkg-bugs
CC: lewyssmith => (none)

Comment 8 José Jorge 2020-04-22 14:26:45 CEST
(In reply to François PELLEGRINI from comment #6)
> - Since the system was still functioning, I tried to trigger it by putting
> the computer on battery and closing the lid. This event was recorded by the
> system, but caused no change in display (the system did not go to suspend
> mode).

I have the same problem with my daughter's laptop, which has the same graphics card: Intel Mobile 4 series. I can still ssh to the system, and ask it to shut down, but it keeps hanging while waiting for the graphics chip.

Looks like this 2009 hardware has gotten a regression with the graphics stack upgrade. I think this came when we switched to kernel 5.3.11 as it is the first crash I see in his log. 

Assigning to tmb because he may have a clue from the kernel ML. Here is the oldest crash:
 
nov 26 09:51:04 insys.local kernel: WARNING: CPU: 1 PID: 5987 at fs/ext4/inode.c:3941 ext4_set_page_dirty+0x3e/0x50
nov 26 09:51:04 insys.local kernel: Modules linked in: hid_generic usbhid hid uas usb_storage dm_mod loop ccm fuse bnep af_packet ib_core bluetooth ecdh_generic ecc msr sunrpc rtl8187 mac80211 cfg80211 eeprom_93cx6 rfkill libarc4 jo>
nov 26 09:51:04 insys.local kernel: CPU: 1 PID: 5987 Comm: kworker/u4:72 Not tainted 5.3.11-desktop-1.mga7 #1
nov 26 09:51:04 insys.local kernel: Hardware name: CLEVO CO.                        W760T/M740T/M760T               /W760T/M740T/M760T               , BIOS 1.00.05 09/17/2009
nov 26 09:51:04 insys.local kernel: Workqueue: i915 __i915_gem_free_work [i915]
nov 26 09:51:04 insys.local kernel: RIP: 0010:ext4_set_page_dirty+0x3e/0x50
nov 26 09:51:04 insys.local kernel: Code: 48 8b 00 a8 01 75 16 48 8b 57 08 48 8d 42 ff 83 e2 01 48 0f 44 c7 48 8b 00 a8 08 74 0d 48 8b 07 f6 c4 20 74 0f e9 a2 81 f8 ff <0f> 0b 48 8b 07 f6 c4 20 75 f1 0f 0b eb ed 0f 1f 40 00 66 66 66>
nov 26 09:51:04 insys.local kernel: RSP: 0018:ffffb2f202317dc0 EFLAGS: 00010246
nov 26 09:51:04 insys.local kernel: RAX: 020000000000a036 RBX: fffff729c4293c80 RCX: 0000000000000000
nov 26 09:51:04 insys.local kernel: RDX: 0000000000000000 RSI: 000000003a094000 RDI: fffff729c4293c80
nov 26 09:51:05 insys.local kernel: RBP: ffff8f3d66e7e000 R08: 0000000000000000 R09: ffffffffc0485600
nov 26 09:51:05 insys.local kernel: R10: ffff8f3cdb70d180 R11: 0000000000000000 R12: 000000000010a4f2
nov 26 09:51:05 insys.local kernel: R13: ffff8f3c86807300 R14: ffff8f3cf5bc3020 R15: 0000000000000000
nov 26 09:51:05 insys.local kernel: FS:  0000000000000000(0000) GS:ffff8f3d7bb00000(0000) knlGS:0000000000000000
nov 26 09:51:05 insys.local kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
nov 26 09:51:05 insys.local kernel: CR2: 00007f8c7b6739f0 CR3: 000000012c722000 CR4: 00000000000406e0
nov 26 09:51:05 insys.local kernel: Call Trace:
nov 26 09:51:05 insys.local kernel:  i915_gem_userptr_put_pages+0x13d/0x190 [i915]
nov 26 09:51:05 insys.local kernel:  __i915_gem_object_put_pages+0x63/0x90 [i915]
nov 26 09:51:05 insys.local kernel:  __i915_gem_free_objects+0x11a/0x210 [i915]
nov 26 09:51:05 insys.local kernel:  __i915_gem_free_work+0x53/0x80 [i915]
nov 26 09:51:05 insys.local kernel:  process_one_work+0x200/0x3c0
nov 26 09:51:05 insys.local kernel:  worker_thread+0x2d/0x3d0
nov 26 09:51:05 insys.local kernel:  ? process_one_work+0x3c0/0x3c0
nov 26 09:51:05 insys.local kernel:  kthread+0x112/0x130
nov 26 09:51:05 insys.local kernel:  ? kthread_create_on_node+0x60/0x60
nov 26 09:51:05 insys.local kernel:  ret_from_fork+0x35/0x40
nov 26 09:51:05 insys.local kernel: ---[ end trace db2080ec69e6f13f ]---
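
A minimal sketch for extracting blocks like the one above from a saved journal, assuming each report starts at a "kernel: WARNING:" line and ends at the matching "---[ end trace" line, as it does here; the function name `kernel_warnings` is hypothetical:

```shell
# Print every kernel WARNING report, from its "WARNING:" line through
# the closing "---[ end trace ... ]---" line, using an awk range pattern.
# Reads journal text from the files given as arguments, or from stdin.
kernel_warnings() {
    awk '/kernel: WARNING:/, /---\[ end trace/' "$@"
}

# Typical use on a saved journal file (not run here):
#   kernel_warnings journal.txt
```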

Assignee: pkg-bugs => tmb
CC: (none) => lists.jjorge

Comment 9 José Jorge 2020-04-22 14:47:55 CEST
In short: can you try kernel 5.3.7 to see whether it still locks up?
Morgan Leijström 2020-04-22 14:53:36 CEST

CC: (none) => fri

Comment 10 François PELLEGRINI 2020-04-22 17:46:48 CEST
Dear all,
OK, good to know you have a home reproducer with your daughter's laptop! ;-)
Thanks for the hint. I installed kernel 5.3.7, and will boot into it after the next hang.
"The absence of proof is not a proof of absence", but if nothing hangs in the next 5 days, I will get back to you with the info.
Kind regards,
f.p.
Comment 11 François PELLEGRINI 2020-05-05 13:47:56 CEST
Dear all,
I can confirm that reverting to kernel 5.3.7 seems to remove the crashes.
Indeed, that seems to be a regression.
Regards,
f.p.
Comment 12 Morgan Leijström 2020-05-05 17:26:09 CEST
Great, so far.
Before digging further, can you please test whether the latest versions now in the testing repo work:

Bug 26570 - Update request: kernel-5.6.8-1.mga7 
Bug 26574 - Update request: x11-driver-video-intel-2.99.917-59.mga7
Bug 26573 - Update request: x11-server-1.20.8-1.mga7
Bug 26571 - Update request: mesa-20.0.6-1.mga7
Comment 13 Morgan Leijström 2020-05-05 17:29:05 CEST
Edit: they are being moved to normal updates, so a normal update once they have reached your mirror will do :)
Comment 14 José Jorge 2020-05-06 09:06:27 CEST
(In reply to Morgan Leijström from comment #13)
> Edit: they are being moved to normal updates, so a normal update once they
> have reached your mirror will do :)

At least the 5.6.6 version still has the bug on my daughter's laptop. And I confirm 5.3.7 never locks. Let's see with 5.6.8...
Comment 15 François PELLEGRINI 2020-05-06 09:10:59 CEST
Dear all,

MGA Update performed to kernel 5.6.8-1 with updated X11 drivers.
I'm about to reboot. I will tell you in the days to come.  :-)
Regards,
f.p.
Comment 16 François PELLEGRINI 2020-05-06 11:15:13 CEST
Re all,

System just hung. Grnx.  :-(
So: regression was not fixed by kernel 5.6.8-1 & co.

Regards,
f.p.
Comment 17 Aurelien Oudelet 2021-07-06 13:14:30 CEST
Mageia 7 has been EOL since July 1st 2021.
There will not be any further bugfixes for this release.

You are encouraged to upgrade to Mageia 8 as soon as possible.

@reporter, if this bug still applies to Mageia 8, please let us know.

@packager, if you work on the Mageia 7 version of your package, please check whether the issue is also present in the Mageia 8 package. If so, please fix the Mageia 8 version instead.

This bug report will be closed as OLD if there is no further notice by September 1st 2021.
Comment 18 Marja Van Waes 2021-09-07 14:10:10 CEST
Hi bug reporter and hi assignee and others involved,

Please reopen this bug report if it is still valid for Mageia 8 or 9 (cauldron), and change "Version:" in the upper left of this report accordingly.

This report is being closed as OLD because it was filed against Mageia 7, for which support ended on June 30th 2021.

Thanks,
Marja

Status: NEW => RESOLVED
Resolution: (none) => OLD

