Bug 28192

Summary: startplasma-wayland or kwin_wayland crash with SIGSEGV (signal 11)
Product: Mageia Reporter: peter winterflood <peter.winterflood>
Component: RPM Packages Assignee: KDE maintainers <kde>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: High CC: geiger.david68210, ouaurelien, peter.winterflood
Version: 8   
Target Milestone: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Source RPM: kdeclarative-5.76.0-2.mga8.src.rpm CVE:
Status comment: confirmed the issue is caused by the upgrade from lib64kf5quickaddons5-5.76.0-1.mga8
Attachments: straced issue
journal log
packages upgraded, many of course but one of these is root cause.
a packagelist from my amd wayland machine

Description peter winterflood 2021-01-21 23:49:55 CET
Description of problem:
The first and most obvious symptom: logging in to plasma (wayland) from sddm simply logs you straight back out.

 

Version-Release number of selected component (if applicable):

seems to be after an update that came in recently, so probably version 5.20.4-1

How reproducible:
easy

Steps to Reproduce:

init 2, and log in as the same user

startplasma-wayland

"/usr/bin/kwin_wayland" ("--xwayland", "--exit-with-session=/usr/libexec/startplasma-waylandsession") exited with code 11
startplasmacompositor: Shutting down...
startplasmacompositor: Done. 

dig a little deeper with strace -f kwin_wayland, and it exits with these last few lines:

3058  --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x60} ---
3058  prctl(PR_SET_DUMPABLE, SUID_DUMP_USER) = 0
3058  rt_sigaction(SIGSEGV, {sa_handler=SIG_IGN, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fc9fedc3580}, {sa_handler=0x416e00, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fc9fedc3580}, 8) = 0
3058  rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [USR1 SEGV USR2], 8) = 0
3058  getpid()                          = 3058
3058  gettid()                          = 3058
3058  tgkill(3058, 3058, SIGSEGV)       = 0
3058  rt_sigprocmask(SIG_SETMASK, [USR1 SEGV USR2], NULL, 8) = 0
3058  rt_sigreturn({mask=[USR1 USR2]})  = 140728451411552
3058  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=3058, si_uid=1000} ---
3058  --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x60} ---
3058  +++ killed by SIGSEGV (core dumped) +++

also, from the journal:

journalctl -b -r

Jan 21 21:24:52 localhost kernel: Code: 02 f3 ff 4c 89 ef e8 2f 5b f5 ff 48 89 ef e8 27 5b f5 ff 48 83 c4 10 4c 89 e0 5d 41 5c 41 5d c3 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 47 60 c3 90 66 2e 0f 1f 84 00 00 00 00 00 48 89 77 60 c3 90
Jan 21 21:24:52 localhost kernel: kwin_wayland[1825]: segfault at 60 ip 00007f2ec1dc3180 sp 00007ffe76bd3c88 error 4 in libkwin.so.5.20.4[7f2ec1cea000+1ae000]

hence why i think the issue is in  lib64kwin5-5.20.4-1.mga8, but it could just as easily be in kwin_wayland itself.
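
For anyone reading the kernel line above, here is a minimal sketch of how it can be decoded. It assumes the usual x86 "segfault at ... ip ... sp ... error ... in lib[base+size]" layout; the function name decode_segfault and the dictionary keys are made up for illustration and are not part of any tool.

```python
import re

# Kernel segfault line copied from the journal excerpt in this report.
LINE = ("kwin_wayland[1825]: segfault at 60 ip 00007f2ec1dc3180 "
        "sp 00007ffe76bd3c88 error 4 in libkwin.so.5.20.4[7f2ec1cea000+1ae000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its parts."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)  # x86 page-fault error code bits
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset_in_mapping": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),   # bit 0: 0 means page not present
        "write_access": bool(err & 2),   # bit 1: 0 means a read
        "user_mode": bool(err & 4),      # bit 2: 1 means fault in user mode
    }


info = decode_segfault(LINE)
print(hex(info["fault_address"]), hex(info["offset_in_mapping"]))
```

Run as written this prints 0x60 and 0xd9180. Error code 4 decodes to a user-mode read of a page that is not mapped, which fits a read through a pointer very close to zero. The offset is relative to the mapping the kernel reported, which is not necessarily the file offset inside the library.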

tested on two completely different systems: one a second-gen i7 with built-in graphics, and one a threadripper with an nvidia 1070ti.

i will ask for this to be a showstopper as a crashing technology eval is not a technology eval.
Comment 1 peter winterflood 2021-01-21 23:53:09 CET
Created attachment 12243 [details]
straced issue

strace
Comment 2 peter winterflood 2021-01-21 23:53:42 CET
Created attachment 12244 [details]
journal log

journallog
Comment 3 peter winterflood 2021-01-21 23:56:20 CET
Created attachment 12245 [details]
packages upgraded, many of course but one of these is root cause.
Comment 4 peter winterflood 2021-01-22 22:07:45 CET
updating to release blocker

Priority: Normal => release_blocker

peter winterflood 2021-01-22 22:08:01 CET

Version: Cauldron => 8

Comment 5 Thomas Backlund 2021-01-22 22:15:54 CET
This looks like:
https://bugs.kde.org/show_bug.cgi?id=420039
Comment 6 peter winterflood 2021-01-22 22:37:08 CET
had a look at that bug report, and while the signal 11 is the same, googling this problem to see if any other linux had it revealed many occurrences, all subsequently fixed by updates.
this is recent.
this is from updates that got pushed in last few days, not 8 months ago.
regards peter
Comment 7 peter winterflood 2021-01-23 10:31:39 CET
Created attachment 12249 [details]
a packagelist from my amd wayland machine

this is a packagelist ordered by --last from my amd graphics based kwin_wayland box, running on an amd phenom II 955,
which i have not updated.
it's still working with kwin_wayland, so it has not yet received the faulty package, or combination of packages.

so just to add, i've looked at that kde bug report again: that one only crashes when you do something, i.e. move a window; it establishes a working desktop first. at least that's my impression.
in this report's case it crashes immediately, never getting a desktop.
also worth noting what happens when i run startplasma-wayland remotely, i.e. via an ssh tunnel:

[peter@localhost ~]$ ssh ossi6 -l peter
Password: 
Warning: No xauth data; using fake authentication data for X11 forwarding.
Last login: Thu Jan 21 22:29:07 2021
[peter@localhost ~]$ startpla
startplasma-wayland  startplasma-x11      
[peter@localhost ~]$ startplasma-wayland 
"/usr/bin/kwin_wayland" ("--xwayland", "--exit-with-session=/usr/libexec/startplasma-waylandsession") exited with code 11
startplasmacompositor: Shutting down...
startplasmacompositor: Done.
[peter@localhost ~]$ 
[peter@localhost ~]$ echo $DISPLAY
localhost:10.0
[peter@localhost ~]$ 


it also crashes, so that rules out the device driver. i also tried booting off the previous kernel to see if that made a difference, but it did not.
happy to try anything else you want to help pursue this, but i think anyone investigating this will find it really easy to reproduce, it's such a hard error
regards peter

CC: (none) => peter.winterflood

Comment 8 peter winterflood 2021-01-23 11:28:58 CET
a bit more info: installed debuginfo for both kwin-wayland and lib64kwin and ran it under GDB

Starting program: /usr/bin/kwin_wayland 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff0236640 (LWP 11388)]

Thread 1 "kwin_wayland" received signal SIGSEGV, Segmentation fault.
KWin::Platform::sceneEglGlobalShareContext (this=0x0)
    at /usr/src/debug/kwin-5.20.4-1.mga8.x86_64/platform.cpp:573
573         return m_globalShareContext;
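
The gdb frame above (this=0x0) ties in with the "segfault at 60" in the journal: reading a member through a null object pointer faults at the member's offset. In the journal's Code: line the marked bytes 48 8b 47 60 decode to mov 0x60(%rdi),%rax, so the member is read at offset 0x60. A minimal sketch of that arithmetic follows; the structure layout is hypothetical (the real KWin::Platform layout is not shown in this report), only the 0x60 offset is taken from the logs.

```python
import ctypes


class HypotheticalPlatform(ctypes.Structure):
    """Stand-in layout, NOT the real KWin::Platform: 0x60 bytes of
    earlier members followed by the pointer that line 573 returns."""
    _fields_ = [
        ("_earlier_members", ctypes.c_char * 0x60),
        ("m_globalShareContext", ctypes.c_void_p),
    ]


this_pointer = 0x0  # value gdb printed for `this`
member_offset = HypotheticalPlatform.m_globalShareContext.offset
fault_address = this_pointer + member_offset
print(hex(fault_address))
```

With a null this the computed address equals the member offset, which is the address the kernel and strace (si_addr=0x60) reported. So the getter itself looks innocent; the question is why it was called on a Platform object that did not exist yet.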

hope that helps
regards peter
Comment 9 peter winterflood 2021-01-23 12:29:50 CET
ok ive narrowed down the package at fault

upgrading lib64kf5quickaddons5-5.76.0-1.mga8 to 

lib64kf5quickaddons5-5.76.0-2.mga8  introduces the crash
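
One way such a narrowing-down could be approached (not necessarily how it was done here) is to diff the package list of a working machine against that of a broken one, e.g. lists like the rpm -qa --last output attached in comment 7. A rough sketch, assuming each line starts with a name-version-release.arch token; the helper names and the two short sample lists are illustrative, built from package names mentioned in this report.

```python
def parse_rpm_list(text):
    """Map package name -> 'version-release.arch' from rpm -qa style lines.

    Only the first column is used, so the date column that
    `rpm -qa --last` appends is ignored.
    """
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].count("-") < 2:
            continue  # blank line or not a name-version-release token
        name, version, release = fields[0].rsplit("-", 2)
        packages[name] = f"{version}-{release}"
    return packages


def changed_packages(working_text, broken_text):
    """Return {name: (working_version, broken_version)} for differing packages."""
    old, new = parse_rpm_list(working_text), parse_rpm_list(broken_text)
    return {n: (old[n], new[n]) for n in old.keys() & new.keys() if old[n] != new[n]}


# Illustrative sample lists, not the real attachments.
WORKING = """lib64kf5quickaddons5-5.76.0-1.mga8.x86_64
lib64kwin5-5.20.4-1.mga8.x86_64"""
BROKEN = """lib64kf5quickaddons5-5.76.0-2.mga8.x86_64
lib64kwin5-5.20.4-1.mga8.x86_64"""

changed = changed_packages(WORKING, BROKEN)
for name, (old, new) in sorted(changed.items()):
    print(f"{name}: {old} => {new}")
```

Each package in the resulting list is then a candidate to downgrade on its own and retest, until the crash goes away.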

regards peter

Status comment: (none) => confirmed the issue is caused by the upgrade from lib64kf5quickaddons5-5.76.0-1.mga8
Source RPM: suspect lib64kwin5-5.20.4-1.mga8 => lib64kf5quickaddons5-5.76.0-2.mga8

Comment 10 David GEIGER 2021-01-23 13:04:54 CET
Should kwin be rebuilt against kdeclarative?

This commit is at fault: http://svnweb.mageia.org/packages?view=revision&revision=1671792

CC: (none) => geiger.david68210

Comment 11 Lewis Smith 2021-01-23 17:57:37 CET
(In reply to peter winterflood from comment #9)
> ok ive narrowed down the package at fault
> upgrading lib64kf5quickaddons5-5.76.0-1.mga8 to 
> lib64kf5quickaddons5-5.76.0-2.mga8  introduces the crash
This is very useful information. Thank you for the detective work.

Assigning to the KDE group; DavidG is already CC'd (hint).

Plasma on Wayland is, I believe, still experimental, not to be relied on.

Assignee: bugsquad => kde
Source RPM: lib64kf5quickaddons5-5.76.0-2.mga8 => kdeclarative-5.76.0-2.mga8.src.rpm

Comment 12 Lewis Smith 2021-01-23 18:05:06 CET
In the light of that last comment, I do not think this justifies Release blocker/critical, since Plasma on Wayland is more a novelty than a necessity. 
For ERRATA if necessary.
Plasma on X11 works conventionally.
@Peter : Feel free to bump it back up if you can justify that (like I have misunderstood things).

Severity: critical => major
Keywords: (none) => FOR_ERRATA8
Priority: release_blocker => High

Comment 13 peter winterflood 2021-01-23 18:19:23 CET
the actual statement used on the mageia wiki is that plasma on wayland is a technology evaluation, even if you call that experimental.
releasing it broken just seems, well, i don't have words for what it seems.

i agree plasma wayland on nvidia is very unreliable; on AMD and intel it's quite usable. in fact i'm typing this on my laptop now that i've rolled back the update.

i've been using my laptop exclusively with wayland for weeks now.

so no, i disagree with releasing something that crashes.

personally i'd roll back the patch that broke this, until it can be done in a way that does not break things.

but i never had the problem that patch was supposed to fix, except perhaps after updating packages, and that was easy to work around.

regards peter
Comment 14 David GEIGER 2021-01-24 09:29:09 CET
Can you test please with kwin-5.20.4-2.mga8 and lib64kf5quickaddons5-5.76.0-2.mga8?
Comment 15 peter winterflood 2021-01-24 11:28:55 CET
already had, last night, before you sent this.
updated to everything in updates-testing, tested login with plasma wayland,
kicked back out immediately.
restored the -1 rpm, works fine.


just rechecked the laptop to confirm it has the -2 kwin wayland rpm.
thanks for continuing to look at this.

regards peter winterflood
Comment 16 peter winterflood 2021-01-24 11:50:07 CET
can we find out if this patch is in kde neon 20? i'm going to download that today and see if kwin-wayland is crashing on that platform.
regards peter
Comment 17 Aurelien Oudelet 2021-01-24 12:10:51 CET
Yeah Peter,

The question is whether we should just upgrade all KFrameworks to the latest version (5.78), as we are still on 5.76, and Plasma to 5.20.5, as we are still on 5.20.4.

There are even fixes for Plasma Wayland in Plasma Workspace 5.21, which was released as a Beta two days ago.

But we are in freeze for the RC... Plasma has a strange roadmap, releasing fixes in newer versions instead of fixing existing bugs in existing releases. But the Linux kernel does the same.


That's why we wrote "technology preview" for Plasma Wayland in the release notes. It is not intended to be used in a production environment.

CC: (none) => ouaurelien

Comment 18 peter winterflood 2021-01-24 13:47:07 CET
this is a known issue at KDE and, as i suggested, it was caused by that quickaddons patch being a kludge

https://invent.kde.org/frameworks/kdeclarative/-/commit/81c14c916e70de61395eeb44d1d7fa6094eabe93

The new opengl correctness detection doesn't play nicely with
kwin_wayland as our QPA behaves in a special way and isn't actually
ready to create GLContexts till other events have processed.

We can fix this in kwin moving forwards. Worst case we can drop
QtQuickSettings, but we need to be sure we don't break existing
releases.

kde neon 20.5 does not have this issue.
Comment 19 David GEIGER 2021-01-24 14:50:44 CET
So I added this patch in upcoming kdeclarative-5.76.0-3.mga8!

Let's see if it fixes or not the wayland issue.
Comment 20 peter winterflood 2021-01-24 16:19:39 CET
looking forward to it David, ill watch the repos for it
Comment 21 peter winterflood 2021-01-24 17:57:39 CET
well done david

[peter@localhost ~]$ rpm -qa --last|more
lib64kf5calendarevents5-5.76.0-3.mga8.x86_64  Sun 24 Jan 2021 16:52:41 GMT
lib64kf5quickaddons5-5.76.0-3.mga8.x86_64     Sun 24 Jan 2021 16:52:40 GMT
lib64kf5declarative5-5.76.0-3.mga8.x86_64     Sun 24 Jan 2021 16:52:40 GMT
kdeclarative-5.76.0-3.mga8.x86_64             Sun 24 Jan 2021 16:52:40 GMT

[peter@localhost ~]$ ps -ef|grep kwin
peter       8979    8944  7 16:53 tty2     00:00:13 /usr/bin/kwin_wayland --xwayland --exit-with-session=/usr/libexec/startplasma-waylandsession

that fixed it 

writing this on working plasma nvidia wayland desktop.

will test my laptop later, but no crash

regards peter
Comment 22 peter winterflood 2021-01-24 18:13:15 CET
tested twice, works on the laptop as well, where i originally discovered it.
the amd graphics box is unfortunately running kde neon at the moment, but i'll
be rebuilding that with mageia rc asap.

regards peter winterflood

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 23 Aurelien Oudelet 2021-01-24 18:41:29 CET
Peter, you're right.

Before applying the above updates,
the Wayland session goes straight back to restarting sddm.

/home/aurelien/.local/share/sddm/wayland-session.log only displays:
"/usr/bin/kwin_wayland" ("--xwayland", "--exit-with-session=/usr/libexec/startplasma-waylandsession") exited with code 11
startplasmacompositor: Shutting down...
startplasmacompositor: Done.

Updating to kdeclarative-5.76.0-3.mga8 and KWin 5.20.4-2.mga8 resolves the issue.

Keywords: FOR_ERRATA8 => (none)