| Summary: | X crashes at random times | ||
|---|---|---|---|
| Product: | Mageia | Reporter: | AL13N <alien> |
| Component: | RPM Packages | Assignee: | QA Team <qa-bugs> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | Normal | CC: | davidwhodgins, fri, herman.viaene, ouaurelien, sysadmin-bugs |
| Version: | 7 | Keywords: | advisory, validated_update |
| Target Milestone: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | MGA7-32-OK MGA7-64-OK | ||
| Source RPM: | x11-server-1.20.9-1.mga7.src.rpm | CVE: | |
| Status comment: | |||
|
Description
AL13N
2020-09-24 17:42:07 CEST
Hi, thanks for taking the time to report this. In order to understand:
- do you run multiple X sessions at the same time?
- does each session belong to the same user or to totally different users?
- how is each session started? From "New user" within the Plasma UI, or from a startx command in a TTY?
Also, what kind of graphics card do you have?
CC:
(none) =>
ouaurelien
> This happened only after i did a bunch of update a couple of days ago.
> It may be related to the x11-server-1.20.9 update
This looks important. Pity Cauldron offers no package downgrading...
Was the 'bunch of update' too many to list them here? Doing:
$ rpm -qa --last | less
enables all the updates on a particular day/time to be seen and copied/pasted. Not worth it if there are very many; your guess of x11-server looks good, so putting it as the SRPM.
Please confirm the whole version number, which is currently
x11-server-1.20.9-7.mga8
CC:
(none) =>
lewyssmith
@Lewis: i should note that this ticket is on mga7 (not cauldron), and so, the source rpm is: x11-server-xorg-1.20.9-1.mga7.x86_64
all updates i did at the same time:
[root@localhost ~]# rpm -qa --last | grep '21 sep' | awk '{print $1}' | sort -n
ark-19.04.0-1.2.mga7.x86_64
ark-handbook-19.04.0-1.2.mga7.noarch
cpupower-5.7.19-1.mga7.x86_64
evolution-data-server-3.32.2-1.2.mga7.x86_64
evolution-data-server-tests-3.32.2-1.2.mga7.x86_64
firefox-68.12.0-2.mga7.x86_64
firefox-nl-68.12.0-1.mga7.noarch
iw-5.8-1.mga7.x86_64
kernel-desktop-5.7.19-1.mga7-1-1.mga7.x86_64
kernel-desktop-devel-5.7.19-1.mga7-1-1.mga7.x86_64
kernel-desktop-devel-latest-5.7.19-1.mga7.x86_64
kernel-desktop-latest-5.7.19-1.mga7.x86_64
kernel-userspace-headers-5.7.19-1.mga7.x86_64
lib64cairo2-1.16.0-2.1.mga7.x86_64
lib64camel1.2_62-3.32.2-1.2.mga7.x86_64
lib64ebackend1.2_10-3.32.2-1.2.mga7.x86_64
lib64ebook1.2_19-3.32.2-1.2.mga7.x86_64
lib64ebook-contacts1.2_2-3.32.2-1.2.mga7.x86_64
lib64ecal1.2_19-3.32.2-1.2.mga7.x86_64
lib64edata-book1.2_25-3.32.2-1.2.mga7.x86_64
lib64edata-cal1.2_29-3.32.2-1.2.mga7.x86_64
lib64edataserver1.2_24-3.32.2-1.2.mga7.x86_64
lib64edataserverui1.2_2-3.32.2-1.2.mga7.x86_64
lib64gpac7-0.7.1-6.1.mga7.tainted.x86_64
lib64kerfuffle19-19.04.0-1.2.mga7.x86_64
lib64llvm8.0-8.0.0-1.1.mga7.x86_64
lib64lua5.2-5.2.4-3.1.mga7.x86_64
lib64lua5.3-5.3.5-2.1.mga7.x86_64
lib64mlt++3-6.16.0-1.1.mga7.x86_64
lib64mlt6-6.16.0-1.1.mga7.x86_64
lib64nspr4-4.28-1.mga7.x86_64
lib64pq5-11.9-1.mga7.x86_64
lib64qt3support4-4.8.7-26.2.mga7.x86_64
lib64qt4-database-plugin-mysql-4.8.7-26.2.mga7.x86_64
lib64qt4-database-plugin-sqlite-4.8.7-26.2.mga7.x86_64
lib64qt5concurrent5-5.12.6-4.mga7.x86_64
lib64qt5core5-5.12.6-4.mga7.x86_64
lib64qt5-database-plugin-ibase-5.12.6-4.mga7.x86_64
lib64qt5-database-plugin-mysql-5.12.6-4.mga7.x86_64
lib64qt5-database-plugin-sqlite-5.12.6-4.mga7.x86_64
lib64qt5dbus5-5.12.6-4.mga7.x86_64
lib64qt5eglfsdeviceintegration5-5.12.6-4.mga7.x86_64
lib64qt5eglfskmssupport5-5.12.6-4.mga7.x86_64
lib64qt5gui5-5.12.6-4.mga7.x86_64
lib64qt5network5-5.12.6-4.mga7.x86_64
lib64qt5opengl5-5.12.6-4.mga7.x86_64
lib64qt5printsupport5-5.12.6-4.mga7.x86_64
lib64qt5sql5-5.12.6-4.mga7.x86_64
lib64qt5test5-5.12.6-4.mga7.x86_64
lib64qt5widgets5-5.12.6-4.mga7.x86_64
lib64qt5xcbqpa5-5.12.6-4.mga7.x86_64
lib64qt5xml5-5.12.6-4.mga7.x86_64
lib64qtclucene4-4.8.7-26.2.mga7.x86_64
lib64qtcore4-4.8.7-26.2.mga7.x86_64
lib64qtdbus4-4.8.7-26.2.mga7.x86_64
lib64qtdeclarative4-4.8.7-26.2.mga7.x86_64
lib64qtdesigner4-4.8.7-26.2.mga7.x86_64
lib64qtgui4-4.8.7-26.2.mga7.x86_64
lib64qthelp4-4.8.7-26.2.mga7.x86_64
lib64qtmultimedia4-4.8.7-26.2.mga7.x86_64
lib64qtnetwork4-4.8.7-26.2.mga7.x86_64
lib64qtopengl4-4.8.7-26.2.mga7.x86_64
lib64qtscript4-4.8.7-26.2.mga7.x86_64
lib64qtscripttools4-4.8.7-26.2.mga7.x86_64
lib64qtsql4-4.8.7-26.2.mga7.x86_64
lib64qtsvg4-4.8.7-26.2.mga7.x86_64
lib64qttest4-4.8.7-26.2.mga7.x86_64
lib64qtxml4-4.8.7-26.2.mga7.x86_64
lib64qtxmlpatterns4-4.8.7-26.2.mga7.x86_64
lib64raw19-0.19.2-1.1.mga7.x86_64
lib64raw_r19-0.19.2-1.1.mga7.x86_64
lib64sane1-1.0.28-1.1.mga7.x86_64
lib64x11_6-1.6.12-1.mga7.x86_64
lib64x11-xcb1-1.6.12-1.mga7.x86_64
lib64x264_155-0.155-0.20181228.stable.1.1.mga7.tainted.x86_64
lib64xfs1-5.8.0-1.mga7.x86_64
libx11_6-1.6.12-1.mga7.i586
libx11-common-1.6.12-1.mga7.x86_64
lua-5.2.4-3.1.mga7.x86_64
mlt-6.16.0-1.1.mga7.x86_64
mlt-kdenlive-6.16.0-1.1.mga7.x86_64
perf-5.7.19-1.mga7.x86_64
polkit-kde-agent-1-5.15.4-1.1.mga7.x86_64
qt4-common-4.8.7-26.2.mga7.x86_64
qt4-qtdbus-4.8.7-26.2.mga7.x86_64
qt4-xmlpatterns-4.8.7-26.2.mga7.x86_64
qtbase5-common-5.12.6-4.mga7.x86_64
sane-backends-1.0.28-1.1.mga7.x86_64
sane-backends-iscan-1.0.28-1.1.mga7.x86_64
saned-1.0.28-1.1.mga7.x86_64
x11-server-common-1.20.9-1.mga7.x86_64
x11-server-xorg-1.20.9-1.mga7.x86_64
x11-server-xwayland-1.20.9-1.mga7.x86_64
x264-0.155-0.20181228.stable.1.1.mga7.tainted.x86_64
xfsprogs-5.8.0-1.mga7.x86_64
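The rpm/grep/awk/sort pipeline that produced the list above can also be sketched in Python. This is a minimal sketch, assuming `rpm -qa --last` prints one package per line followed by its install date; the function name `packages_installed_on` and the sample dates and times are hypothetical, made up for illustration only.

```python
from typing import List


def packages_installed_on(rpm_last_output: str, date_fragment: str) -> List[str]:
    """Return sorted package names whose install date contains date_fragment."""
    wanted = date_fragment.lower()
    names = []
    for line in rpm_last_output.splitlines():
        # First whitespace-separated field is the package, the rest is the date.
        parts = line.split(None, 1)
        if len(parts) == 2 and wanted in parts[1].lower():
            names.append(parts[0])
    return sorted(names)


# Hypothetical sample: package names are from this report, timestamps are invented.
SAMPLE = """\
x11-server-xorg-1.20.9-1.mga7.x86_64          ma 21 sep 2020 20:15:02 CEST
firefox-68.12.0-2.mga7.x86_64                 ma 21 sep 2020 20:14:40 CEST
example-pkg-1.0-1.mga7.x86_64                 di 01 sep 2020 09:00:00 CEST
"""

if __name__ == "__main__":
    for name in packages_installed_on(SAMPLE, "21 sep"):
        print(name)
```

Unlike the grep in the pipeline, this matches only the date column, so a package whose name happens to contain the fragment is not picked up. The date text depends on the system locale, so the fragment has to be adjusted accordingly.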
this is an nvidia graphic card, but i don't think it's relevant here, because the nvidia was not updated (except maybe also the kernel), but the backtrace does not actually contain any nvidia code...
The way i think this could be reproducible is:
- after reboot, log into plasma with one user
- lock with the user
- on lock screen, start new session (i'm using sddm, but i donno if that is relevant)
- log into a new plasma session with another user
- do this until 4 users are logged in with each different plasma sessions
- maybe wait until the screen goes into standby?
- then wake up screen and you get just a blinking cursor on black screen, but the sddm seems to still be alive (or restarted?), even if all the other sessions are gone
the problem is that i don't know what all of the users having locked screen have open when locked and not always this happens, but:
for me:
- i did updates
- then i had some weird issues and i figured i forgot to reboot after updates, so i reboot
- i think the next day, my wife says, something's wrong, i get just black screen
- so i log in remotely, see loginctl showing sessions, but it seems the X servers are gone? i figure, i'll just reboot
- so i reboot
- next day or maybe day after, same thing happens, this time i am not remote, so i looked at it and started a new session from the sddm for me, and locked it, then i looked around and found the backtraces, i looked into the backtrace but could only find older issues, or something with glamor, which it seems we don't have on mageia? or at least i don't seem to have it and it doesn't appear in my backtraces...
I hope this helps, but i posted this, mostly to see/find if other people have these issues as i do...
AL13N
2020-09-25 09:03:12 CEST
Source RPM:
x11-server-1.20.9-7.mga8.src.rpm =>
x11-server-1.20.9-1.mga7.src.rpm
> i should note that this ticket is on mga7 (not cauldron), and so,
> the source rpm is: x11-server-xorg-1.20.9-1.mga7.x86_64
My mistake; thank you for correcting it.
Those updates included a lot of Qt stuff as well as X11.
Dare one ask whether you have another desktop available, to see whether the same thing happens with that? You can still use all the same KDE applications. It would be good to know whether this is Plasma or Xserver related, but do not install another one especially. (If you do anyway, LXDE or Xfce are about the lightest: task-lxde-minimal, task-xfce-minimal).
(In reply to AL13N from comment #3)
> this is an nvidia graphic card, but i don't think it's relevant here,
> because the nvidia was not updated (except maybe also the kernel), but the
> backtrace does not actually contain any nvidia code...
Sorry to say it, but nvidia CAN be the culprit here. If you use the nonfree drivers, they sometimes really dislike the user switching functionality, and X can't resume from a VT switch, especially when there is more than one X server. Sometimes, you get from resume in Plasma on M7. This has been fixed in Qt 5.14 and the nvidia 450 series. See also: http://blog.davidedmundson.co.uk/blog/plasma-rendering-handling-nvidia-context-loss/
As we are currently on Qt 5.12 in Mageia 7, we can't fix it.
So my question is: what is the driver you are using?

I guess it could be Qt or the non-free nvidia driver that i'm using.
I looked at similar but not quite the same backtraces from a while ago, and it seems that in most of those cases, the backtrace has an nvidia code point in it. Similarly, there are only specific X code points (maybe related to fonts) in my backtraces.
Further, I often have moments where fonts are (partially) disappearing, especially if my son is doing lotsa games (and i have seen mentions of video memory running out).
but this only happens since the last update, so, if it is the nvidia context loss thing, i guess i'll try to downgrade X first maybe with the Xlib, if it does not improve, and if that still doesn't work, try to downgrade all those qt libraries? maybe? and hope that mga8 is out hopefully fairly quick with Qt 5.14....?
Or i guess i could try nouveau, though that likely will have other issues and i'm not certain this would have any effect...
I figured (hoped) considering the backtrace was all fully X, that it would not be related... though it does happen after a funlockfile which may happen when contexts are different, and a loss of context might explain why something as simple as looking up some colors would fail...

At least with Mageia 7 you can downgrade packages. This is a sensible thing to do if a batch of updates introduces a problem, to try and pin it down - and avoid it.

(In reply to Lewis Smith from comment #7)
> At least with Mageia 7 you can downgrade packages. This is a sensible thing
> to do if a batch of updates introduces a problem, to try and pin it down -
> and avoid it.
haven't Mageia packages always been downgradable?
Downgraded x11-server-* , waiting for the bug to trigger...
Here were the earlier backtraces:
[440088.034] (EE) Backtrace:
[440088.034] (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x139) [0x58d0a9]
[440088.034] (EE) 1: /lib64/libpthread.so.0 (funlockfile+0x50) [0x7ff6c1afb5bf]
[440088.034] (EE) 2: /usr/libexec/Xorg (ConfineToShape+0x943) [0x445ad3]
[440088.035] (EE) 3: /usr/libexec/Xorg (MaybeDeliverEventsToClient+0x15b4) [0x44a544]
[440088.035] (EE) 4: /usr/libexec/Xorg (WindowsRestructured+0x3a) [0x44922a]
[440088.035] (EE) 5: /usr/libexec/Xorg (MapWindow+0x145) [0x466995]
[440088.035] (EE) 6: /usr/libexec/Xorg (CompositeRegisterImplicitRedirectionException+0x8b0) [0x4c3350]
[440088.035] (EE) 7: /usr/libexec/Xorg (xf86I2CGetScreenBuses+0xd80) [0x4c1f10]
[440088.035] (EE) 8: /usr/libexec/Xorg (SendErrorToClient+0x345) [0x43c645]
[440088.035] (EE) 9: /usr/libexec/Xorg (InitFonts+0x3a6) [0x4404b6]
[440088.035] (EE) 10: /lib64/libc.so.6 (__libc_start_main+0xeb) [0x7ff6c193eb0b]
[440088.036] (EE) 11: /usr/libexec/Xorg (_start+0x2a) [0x42b02a]
[440088.036] (EE)
[440088.036] (EE) Segmentation fault at address 0x18
[475307.388] (EE) Backtrace:
[475307.388] (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x139) [0x58d0a9]
[475307.388] (EE) 1: /lib64/libpthread.so.0 (funlockfile+0x50) [0x7f539a1335bf]
[475307.388] (EE) 2: /usr/libexec/Xorg (ConfineToShape+0x943) [0x445ad3]
[475307.388] (EE) 3: /usr/libexec/Xorg (WindowHasNewCursor+0x37) [0x446617]
[475307.388] (EE) 4: /usr/libexec/Xorg (ChangeWindowAttributes+0xd1d) [0x4688fd]
[475307.388] (EE) 5: /usr/libexec/Xorg (ProcBadRequest+0x1ec) [0x436bec]
[475307.389] (EE) 6: /usr/libexec/Xorg (SendErrorToClient+0x345) [0x43c645]
[475307.389] (EE) 7: /usr/libexec/Xorg (InitFonts+0x3a6) [0x4404b6]
[475307.389] (EE) 8: /lib64/libc.so.6 (__libc_start_main+0xeb) [0x7f5399f76b0b]
[475307.389] (EE) 9: /usr/libexec/Xorg (_start+0x2a) [0x42b02a]
[475307.389] (EE)
[475307.389] (EE) Segmentation fault at address 0x18
[506843.626] (EE) Backtrace:
[506843.626] (EE) 0: /usr/libexec/Xorg (?+0x0) [0x58cf70]
[506843.626] (EE) 1: /lib64/libpthread.so.0 (funlockfile+0x50) [0x7f91aa5135bf]
[506843.626] (EE) 2: /usr/libexec/Xorg (?+0x0) [0x445190]
[506843.626] (EE) 3: /usr/libexec/Xorg (?+0x0) [0x448f90]
[506843.627] (EE) 4: /usr/libexec/Xorg (?+0x0) [0x4491f0]
[506843.627] (EE) 5: /usr/libexec/Xorg (?+0x0) [0x582b70]
[506843.627] (EE) 6: /usr/libexec/Xorg (?+0x0) [0x4ca3f0]
[506843.627] (EE) 7: /usr/libexec/Xorg (?+0x0) [0x4ca560]
[506843.627] (EE) 8: /usr/libexec/Xorg (?+0x0) [0x4cac80]
[506843.627] (EE) 9: /usr/libexec/Xorg (?+0x0) [0x43c300]
[506843.627] (EE) 10: /usr/libexec/Xorg (?+0x0) [0x440110]
[506843.627] (EE) 11: /lib64/libc.so.6 (__libc_start_main+0xeb) [0x7f91aa356b0b]
[506843.627] (EE) 12: /usr/libexec/Xorg (?+0x0) [0x42b000]
[506843.628] (EE)
[506843.628] (EE) Segmentation fault at address 0x18
this last one is probably when the update was done, but the old one was still in memory.

(In reply to AL13N from comment #8)
> haven't Mageia packages always been downgradable?
Not if you are running Cauldron (in this case, Mageia 8 pre-release). At present, most bugs relate to that.

Ok, it seems rolling back to x11-server-1.20.8 actually fixed the problem.
nothing else was required, i didn't have any of the issues i had since upgrading, which was about a week ago...
I guess i should wait a bit longer, just in case, but this is really promising...
I guess some error got introduced? Who is the Xorg maintainer? maybe this should be reported upstream?
while it may be related to either the nvidia losing context or even running out of video memory; I think a change turned what earlier were maybe undefined graphic artifacts into a hard crash? WDYT?

after looking around in xorg server code and issues, i found a few regression fixes between 1.20.8 and 1.20.9 , which seem similar. There is already a fix present in the next 1.20 version, maybe we could add this patch to our current version in order to avoid waiting 6 more months for the next xorg-xserver version...?
see https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/504

I've put the patches tv put into cauldron into mga7 to fix this issue:

suggested advisory for bugfix:
This update fixes an occasional crash in x11-server that was fixed in upstream 1.20 stable branch.
SRPM in core/updates_testing: x11-server-1.20.9-1.1.mga7
needed rpm is just x11-server-xorg
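For anyone wanting to check whether their own Xorg log contains crashes of the same shape as the backtraces quoted in this report, a small parser may help. This is a minimal sketch, assuming the log uses the same `(EE) N: module (symbol+offset) [address]` frame layout seen in this thread; the names `extract_backtraces` and `mentions_module` are hypothetical helpers, not part of any Xorg tooling, and the sample is the first three frames of the first backtrace above.

```python
import re
from typing import List, Tuple

# One frame, e.g. "(EE) 0: /usr/libexec/Xorg (OsLookupColor+0x139) [0x58d0a9]"
FRAME_RE = re.compile(r"\(EE\)\s+(\d+):\s+(\S+)\s+\((.+?)\+0x[0-9a-fA-F]+\)")

Frame = Tuple[str, str]  # (module path, symbol name)


def extract_backtraces(log_text: str) -> List[List[Frame]]:
    """Split an Xorg log into backtraces, each a list of (module, symbol)."""
    traces: List[List[Frame]] = []
    current = None
    for line in log_text.splitlines():
        if "(EE) Backtrace:" in line:
            current = []  # a new backtrace starts here
            traces.append(current)
            continue
        match = FRAME_RE.search(line)
        if match and current is not None:
            current.append((match.group(2), match.group(3)))
    return traces


def mentions_module(trace: List[Frame], needle: str) -> bool:
    """True if any frame's module path contains needle (case-insensitive)."""
    needle = needle.lower()
    return any(needle in module.lower() for module, _ in trace)


SAMPLE = """\
[440088.034] (EE) Backtrace:
[440088.034] (EE) 0: /usr/libexec/Xorg (OsLookupColor+0x139) [0x58d0a9]
[440088.034] (EE) 1: /lib64/libpthread.so.0 (funlockfile+0x50) [0x7ff6c1afb5bf]
[440088.034] (EE) 2: /usr/libexec/Xorg (ConfineToShape+0x943) [0x445ad3]
[440088.036] (EE) Segmentation fault at address 0x18
"""

if __name__ == "__main__":
    for trace in extract_backtraces(SAMPLE):
        print([symbol for _, symbol in trace], mentions_module(trace, "nvidia"))
```

The symbol names are only what the server printed and may not reflect the real call chain, so a match on them is a hint that the crash is the same one, not proof.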
AL13N
2020-10-11 22:30:00 CEST
Assignee:
bugsquad =>
qa-bugs
Thank you for the update.
I have had two black screen lockups during the night in the last two months, but have not had enough incentive/time to check logs.
I hope this update helps. Anyway i let my main machine run it from now on :)

OK so far: 64 bit, Plasma, Nvidia with CUDA in heavy use (BOINC), VirtualBox running MSW7, Firefox, Libreoffice...
My machine "svarten": Mainboard: Sabertooth P67, CPU: i7-3770, RAM 16G, Nvidia GTX760 (GK104) using nvidia-current; GeForce 635 series and later, 4k display.
CC:
(none) =>
fri
If you have backtraces in /var/log/X.*.log that look like above, then it seems this might be the same issue I had.
Thanks for testing this...

Too long ago now, but i will look if it happens again

Just a word to congratulate & thank AL13N for not just identifying, but implementing the fix for this difficult problem. Thanks also to Morgan for trying it.
Note to QA: all details are in comments 12 & 13. Because the package is just x11-server-xorg, everybody can try it non-specifically.
CC:
lewyssmith =>
(none)
(In reply to Morgan Leijström from comment #16)
> Too long ago now, but i will look if it happens again
For me, i had logs for the last 5 xorg restarts, so, unless you turn off this machine regularly, it might still be there...
(In reply to Lewis Smith from comment #17)
> Just a word to congratulate & thank AL13N for not just identifying, but
> implementing the fix for this difficult problem.
No need for thanks, I'm from packaging team, so, this would be my job. If anything, I should be admonished because I've been inactive for quite a while now...
I will thank tv for replying to me on IRC and putting those patches on cauldron, so I only had to copy his work, and I would've likely only put the single patch in, instead of all upstream patches in the 1.20 branch.
But yes, anyone can test even if you don't have the same issue, if X still works just as well for you, it's also a successful test...
I have a question on QA policy though, does it still need to be tested for 32bit? I would guess that like 90% is 64bit these days; and this package is also for armv7hl and aarch64 ...?

We only require testing on x86_64, but normally do test on 32 bit as well, even if only under virtualbox.
CC:
(none) =>
davidwhodgins
MGA7-64 Plasma on Lenovo B50
No installation issues.
Rebooted after installation, exercised different file types (odt, ods, pdf, avi and more). No problems detected.
I will not object to an OK, but maybe other tests could be useful.
CC:
(none) =>
herman.viaene
$ urpmf --sourcerpm --media "Core Updates Testing (distrib5)" x11-server
x11-server-source:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-xorg:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-xnest:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-devel:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-xdmx:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-xwayland:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-xvfb:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-common:x11-server-1.20.9-1.1.mga7.src.rpm
x11-server-xephyr:x11-server-1.20.9-1.1.mga7.src.rpm
Installed the updates for the x11-server packages already installed in an i586 vb guest, and rebooted.
$ rpm -qa|grep x11-server
x11-server-xwayland-1.20.9-1.1.mga7
x11-server-common-1.20.9-1.1.mga7
x11-server-xorg-1.20.9-1.1.mga7
No regressions found. Also tested on x86_64 on the host, and in a x86_64 guest.
Validating the update.
Whiteboard:
(none) =>
MGA7-32-OK MGA7-64-OK
An update for this issue has been pushed to the Mageia Updates repository.
https://advisories.mageia.org/MGAA-2020-0210.html
Resolution:
(none) =>
FIXED |