Bug 26942 - GDM and gnome-shell crash on startup due to segfault in libmutter (32-bit only)
Summary: GDM and gnome-shell crash on startup due to segfault in libmutter (32-bit only)
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: release_blocker critical
Target Milestone: ---
Assignee: GNOME maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-13 10:15 CEST by Martin Whitaker
Modified: 2020-07-25 11:29 CEST
CC List: 3 users

See Also:
Source RPM: gnome-shell-3.37.3-1.mga8.src.rpm, mutter-3.37.3-1.mga8.src.rpm
CVE:
Status comment:


Attachments
Backtrace from first segfault (starting Wayland session) (2.97 KB, text/plain)
2020-07-13 10:15 CEST, Martin Whitaker
Backtrace from second segfault (starting Xorg session) (2.39 KB, text/plain)
2020-07-13 10:16 CEST, Martin Whitaker
Patch to fix the fault (6.86 KB, text/plain)
2020-07-17 13:26 CEST, Martin Whitaker

Description Martin Whitaker 2020-07-13 10:15:52 CEST
Created attachment 11748
Backtrace from first segfault (starting Wayland session)

On an up-to-date Cauldron system using GDM, the journal shows the following three segfaults:

kernel: gnome-shell[1563]: segfault at 14 ip b6c294f4 sp bfcc2ee0 error 4 in libmutter-7.so.0.0.0[b6b15000+127000]

gnome-shell[1675]: segfault at fffffff0 ip b6b697a4 sp bfd0816c error 5 in libmutter-7.so.0.0.0[b6b41000+127000]

gnome-shell[1693]: segfault at fffffff0 ip b6bbf7a4 sp bfcd218c error 5 in libmutter-7.so.0.0.0[b6b97000+127000]

The first occurs when attempting to start a Wayland session; the second and third occur when attempting to start an Xorg session.

When using another DM (LightDM), you only get the two segfaults from attempting to start an Xorg session.
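
For reference, the kernel lines above can be decoded mechanically. The following is a minimal Python sketch and not part of the original report; the helper name decode_segfault and the dictionary keys are illustrative assumptions. It subtracts the mapping start (the number before the "+" in the brackets) from the instruction pointer and decodes the low three bits of the x86 page-fault error code. The resulting offset is relative to the mapping the kernel reports, which is not necessarily the same as the file offset inside the library.

```python
import re

# Pattern for x86 kernel "segfault at ..." log lines; all numbers are hex.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "library": m.group("lib"),
        "fault_address": int(m.group("addr"), 16),
        # Offset of the faulting instruction within the reported mapping.
        "offset_in_mapping": ip - base,
        # x86 page-fault error code bits 0..2.
        "protection_violation": bool(err & 1),  # 0 means page not present
        "write": bool(err & 2),                 # 0 means read access
        "user_mode": bool(err & 4),
    }


journal_lines = [
    "kernel: gnome-shell[1563]: segfault at 14 ip b6c294f4 sp bfcc2ee0 error 4 in libmutter-7.so.0.0.0[b6b15000+127000]",
    "gnome-shell[1675]: segfault at fffffff0 ip b6b697a4 sp bfd0816c error 5 in libmutter-7.so.0.0.0[b6b41000+127000]",
    "gnome-shell[1693]: segfault at fffffff0 ip b6bbf7a4 sp bfcd218c error 5 in libmutter-7.so.0.0.0[b6b97000+127000]",
]

for line in journal_lines:
    info = decode_segfault(line)
    print(info["library"], hex(info["offset_in_mapping"]), hex(info["fault_address"]))
```

Under these assumptions the second and third lines decode to the same offset (0x287a4), which is consistent with both being the same Xorg-session fault, while the first decodes to a different offset (0x1144f4). All three are user-mode reads.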

Downgrading gnome-shell and mutter to 3.37.2-1 fixes the problem.

Backtraces attached.
Comment 1 Martin Whitaker 2020-07-13 10:16:17 CEST
Created attachment 11749
Backtrace from second segfault (starting Xorg session)
Comment 2 Martin Whitaker 2020-07-13 14:19:42 CEST
git bisect identifies this as the mutter commit that introduces the fault:

commit 6bd382ad23ded1f18b925829c33a47d29fb5bddb (HEAD)
Author: Georges Basile Stavracas Neto <georges.stavracas@gmail.com>
Date:   Mon Jun 8 22:02:34 2020 -0300

    background-actor: Use MetaBackgroundContent
    
    MetaBackgroundActor is still necessary for culling purposes,
    but now the actual rendering of the background is delegated
    to MetaBackgroundContent, as well as the sizing information.
    
    https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1302
Comment 3 Pascal Terjan 2020-07-13 14:46:58 CEST
I forwarded this upstream at https://gitlab.gnome.org/GNOME/mutter/-/issues/1343

CC: (none) => pterjan

Olav Vitters 2020-07-14 10:29:18 CEST

CC: (none) => olav
Assignee: bugsquad => gnome

Olav Vitters 2020-07-15 11:52:25 CEST

See Also: (none) => https://gitlab.gnome.org/GNOME/mutter/-/issues/1343

Comment 4 Olav Vitters 2020-07-16 10:33:55 CEST
Given the lack of response from upstream, and since Fedora ships the same version, I looked at possible differences. Fedora only carries one harmless patch. I've added it but doubt it will help. It's odd that the bug is arch-specific.
Comment 5 Olav Vitters 2020-07-16 10:35:40 CEST
Martin: Could you describe your setup? Laptop/pc, how many monitors, GPUs, etc?
Comment 6 Martin Whitaker 2020-07-16 11:39:12 CEST
I reproduced this in VirtualBox and in virt-manager, both running on a 64-bit mga7 host. The VirtualBox guest settings were 4 CPUs, 4GB RAM, VBoxSVGA video controller with 3D acceleration, 32MB video memory, 1 monitor. The virt-manager settings were 4 CPUs, 4GB RAM, QXL video, 16MB video memory, 1 head.

The bug was originally reported by Ben on qa-discuss - at least I assumed it was the same bug. Ben, can you say what hardware you tested this on, and whether you were testing a 32-bit or 64-bit install.

CC: (none) => westel

Comment 7 Olav Vitters 2020-07-16 14:20:43 CEST
To confirm: it's a 32-bit install running under a 64-bit host? Are you also able to test the new mutter build (chance of success probably close to 0%)?
Comment 8 Martin Whitaker 2020-07-16 14:38:51 CEST
Correct. I don't have any real 32-bit hardware to test on. I have just tried a bare-metal install (again a 32-bit system on 64-bit hardware) and see the same error. As expected, the new mutter build hasn't helped.

I suspect the git bisect results are misleading. I've tried reverting just that commit while keeping all the other 3.37.3 changes, and it still fails. I'm starting to suspect an uninitialised variable or a use-after-free bug.
Comment 9 Olav Vitters 2020-07-16 15:07:14 CEST
bug 26958 mentions a lack of hardware acceleration. Not sure how to check if llvmpipe is used without looking at gnome-control-center.
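
One possible way to check this without gnome-control-center, assuming the glxinfo tool is installed, is to look at the "OpenGL renderer string" line it prints. The Python sketch below is only an illustration: the function names, the list of software renderer names, and the sample strings in the usage note are assumptions, not output from an affected machine.

```python
import subprocess

# Names that commonly indicate Mesa software rendering (assumed list).
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast", "software rasterizer")


def renderer_string(glxinfo_output):
    """Extract the value of the 'OpenGL renderer string' line, or None."""
    for line in glxinfo_output.splitlines():
        if line.strip().startswith("OpenGL renderer string:"):
            return line.split(":", 1)[1].strip()
    return None


def uses_software_renderer(glxinfo_output):
    """True if the renderer string names a known software renderer."""
    renderer = renderer_string(glxinfo_output)
    if renderer is None:
        return False
    return any(name in renderer.lower() for name in SOFTWARE_RENDERERS)


def query_glxinfo():
    """Run 'glxinfo -B' in the current session and return its text output."""
    result = subprocess.run(
        ["glxinfo", "-B"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, uses_software_renderer("OpenGL renderer string: llvmpipe (LLVM 10.0.0, 128 bits)") would report software rendering, and uses_software_renderer(query_glxinfo()) would check the running session.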
Comment 10 Olav Vitters 2020-07-16 16:53:29 CEST
Another thought: downgrade glib to a stable version. Seems XFCE had to make some changes due to the unstable glib.
Comment 11 Martin Whitaker 2020-07-16 18:41:50 CEST
(In reply to Olav Vitters from comment #10)
> Another thought: downgrade glib to a stable version. Seems XFCE had to make
> some changes due to the unstable glib.

I had that thought too, but sadly it doesn't help with this bug.
Comment 12 Ben McMonagle 2020-07-16 21:46:53 CEST
(In reply to Martin Whitaker from comment #6)
> I reproduced this in VirtualBox and in virt-manager, both running on a
> 64-bit mga7 host. The VirtualBox guest settings were 4 CPUs, 4GB RAM,
> VBoxSVGA video controller with 3D acceleration, 32MB video memory, 1
> monitor. The virt-manager settings were 4 CPUs, 4GB RAM, QXL video, 16MB
> video memory, 1 head.
> 
> The bug was originally reported by Ben on qa-discuss - at least I assumed it
> was the same bug. Ben, can you say what hardware you tested this on, and
> whether you were testing a 32-bit or 64-bit install.

install is 32bit all DE real hardware install.
Toshiba Portege R930 with  ssd.
when the issue  occurred, I changed DM to SDDM
Comment 13 Olav Vitters 2020-07-17 10:50:17 CEST
(In reply to ben mcmonagle from comment #12)
> install is 32bit all DE real hardware install.
> Toshiba Portege R930 with  ssd.
> when the issue  occurred, I changed DM to SDDM

Did you also switch from GNOME / gnome-shell to something else? Or does SDDM allow you to start gnome-shell?
Comment 14 Martin Whitaker 2020-07-17 13:26:06 CEST
Created attachment 11751
Patch to fix the fault

This fixes the fault, tested both in a VM and on real hardware. I've added it to the upstream bug report.
Comment 15 Olav Vitters 2020-07-17 13:32:05 CEST
Martin: Do you want the honour of building a new package, or would you prefer that I do it?

Many thanks for figuring out the cause and making a patch!
Comment 16 Martin Whitaker 2020-07-17 13:43:03 CEST
I've just built and tested a new package locally, so I'll push it now.
Comment 17 Ben McMonagle 2020-07-18 07:02:50 CEST
(In reply to Olav Vitters from comment #13)
> (In reply to ben mcmonagle from comment #12)
> > install is 32bit all DE real hardware install.
> > Toshiba Portege R930 with  ssd.
> > when the issue  occurred, I changed DM to SDDM
> 
> Did you also switch from GNOME / gnome-shell to something else? Or does SDDM
> allow you to start gnome-shell?

I was booting to Plasma using GDM.
after the update I was presented with:  :( something has gone wrong.....
used drakdm to change to SDDM, rebooted and SDDM greeter presented and allowed plasma session to run.

checking Gnome, Gnome Classic, Gnome on Xorg, & Gnome(wayland): all fail to load.
the first 3 present :( something has gone wrong....., the wayland session just re-presents the SDDM greeter.

will wait for the new package and check again
Comment 18 Ben McMonagle 2020-07-18 07:48:42 CEST
ok, 

applied the updates

the good news. 
with SDDM, Gnome, Gnome Classic and Gnome on Xorg all present desktop.

Gnome(wayland) still re-presents the SDDM greeter.

with GDM, both Gnome entries present desktop, as does Plasma
Comment 19 Martin Whitaker 2020-07-18 11:33:33 CEST
Good catch, Ben. I missed one necessary change when creating the patch. Fixed in libmutter7_0-3.37.3-4.mga8.i586.rpm
Comment 20 Ben McMonagle 2020-07-18 23:00:00 CEST
ok,

new update applied.

GDM : plasma, Gnome, Gnome classic and Gnome xorg all present desktop from greeter.

SDDM: plasma, Gnome, Gnome classic, Gnome xorg & Gnome wayland all present desktop from greeter.

works for me

thanks Martin
Comment 21 Olav Vitters 2020-07-22 16:10:10 CEST
Seems upstream is overwhelmed or something. Minutes after I pinged #gnome-shell on irc.gnome.org (via the Matrix/Element app), two people responded in the upstream issue. They've asked for a merge request, plus a small change.

I can ping again to get them to notice if wanted.
Comment 22 Martin Whitaker 2020-07-25 11:29:31 CEST
My patch has been merged upstream, so I think we can close this now.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

