Bug 28124 - Starting drakx11 in GNOME causes gnome-session to die and return to GDM login screen
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: release_blocker major
Target Milestone: ---
Assignee: Martin Whitaker
 
Reported: 2021-01-16 23:03 CET by Martin Whitaker
Modified: 2021-01-28 21:47 CET

See Also:
Source RPM: monitor-edid-3.3-1.mga8, libx86-1.1-37.mga8.src.rpm
CVE:
Status comment: fixed in git / svn


Description Martin Whitaker 2021-01-16 23:03:49 CET
This occurs both when running drakx11 from the MCC GUI and when running it directly from the command line, using either GNOME on Wayland or GNOME on Xorg.

The fault occurs when drakx11 runs monitor-edid to probe the attached displays. monitor-edid runs the monitor-get-edid-using-vbe sub-program. That sub-program first tries to read the VBE information whilst staying in the current VT (the one displaying the desktop). If that fails, it then switches to VT1 and tries again, before switching back to the desktop VT.

It appears that when you switch away from the VT displaying the GNOME desktop, gnome-session suspends itself, and doesn't reliably recover when you switch back. There must be a way to make that work, as the Ctrl-Alt-Fn keys work OK, but I don't know how.

It should be noted that monitor-get-edid-using-vbe is broken for other DMs, because it expects VT1 to be a text console, and we have long ago switched to using VT1 for the graphical display. monitor-get-edid-using-vbe does check that VT1 isn't the current VT, and fails gracefully if it is, but it still isn't doing what it was designed to do.

GDM is different in that VT1 is reserved for the GDM login screen, and user sessions are started on a different VT.
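
To see which case a given session falls into, the foreground VT can be read from sysfs. The following is only a sketch: the function names and the optional file argument are made up for illustration and are not taken from monitor-get-edid-using-vbe.

```shell
# Print the name of the VT currently in the foreground, e.g. "tty2".
# The optional argument (hypothetical, mainly for testing) names the
# sysfs file to read instead of the default.
active_vt() {
    cat "${1:-/sys/class/tty/tty0/active}"
}

# Succeed if switching to VT1 would take us away from the current VT,
# which is the situation that upsets gnome-session under GDM.
would_leave_current_vt() {
    [ "$(active_vt "$1")" != "tty1" ]
}
```

With GDM the user session normally sits on a VT other than VT1, so would_leave_current_vt should succeed there; with DMs that run the desktop on VT1 it should fail.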

In any case, monitor-get-edid-using-vbe doesn't currently work because it uses libx86. libx86 tries to map BIOS memory into the virtual address space with exec permission. Since at least Mageia 7, if not before, /dev is mounted without exec permission, so this always fails, leading to the error message:

  mmap /dev/zero: Operation not permitted
  VBE: could not initialize LRMI

(this problem also prevents vbetool from working).
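
A quick way to confirm that /dev is mounted without exec permission is to look at its mount options. This is a minimal sketch that assumes the usual /proc/self/mounts layout (device, mount point, type, options); the function names and the optional file argument are hypothetical.

```shell
# Print the mount options of /dev. The optional argument (hypothetical,
# mainly for testing) names a mounts-format file to read instead of
# /proc/self/mounts. If /dev appears more than once, the last entry wins.
dev_mount_options() {
    awk '$2 == "/dev" { opts = $4 } END { print opts }' "${1:-/proc/self/mounts}"
}

# Succeed if the comma-separated option list in $1 contains "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether an executable mapping of /dev/zero is likely to be refused.
report_dev_exec() {
    if has_noexec "$(dev_mount_options "$1")"; then
        echo "/dev is mounted noexec"
    else
        echo "/dev allows exec"
    fi
}
```

Running report_dev_exec with no argument checks the live system.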

You can get monitor-get-edid-using-vbe and vbetool to work, either by remounting /dev with exec permission or by modifying libx86 to map the memory without exec permission. However, at least on the hardware I tested it on, that leads to the keyboard stopping working some seconds after running those tools, and eventually to the machine freezing and needing a hard reboot. This is on a UEFI machine - my suspicion is that libx86 is not compatible with UEFI BIOSes. libx86 upstream has vanished, so there's nowhere to ask questions about that.
Comment 1 Lewis Smith 2021-01-17 10:47:55 CET
Thank you for this detailed report, Martin.

Neither SRPM has a regular maintainer, so I have to assign this globally.

Assignee: bugsquad => pkg-bugs
Source RPM: monitor-edid-3.3-1.mga8 => monitor-edid-3.3-1.mga8, libx86-1.1-37.mga8.src.rpm

Comment 2 Thomas Backlund 2021-01-17 12:23:00 CET
is this on real hw or virtualbox ?

interestingly I cant get it to crash here on wayland

I do get the:
monitor-get-edid-using-vbe triggered

  mmap /dev/zero: Operation not permitted

but it does not hit the   "VBE: could not initialize LRMI" path...

but then again monitor-edid actually gets edid output as it should...

do you get any edid output at all ? or does this only trigger on hw that does not output edid data ?
Comment 3 Martin Whitaker 2021-01-17 13:33:20 CET
(In reply to Thomas Backlund from comment #2)
> is this on real hw or virtualbox ?

Both

> interestingly I cant get it to crash here on wayland

Try

  monitor-get-edid-using-vbe -v

for the "VBE: could not initialize LRMI" then

  monitor-get-edid-using-vbe -v --try-in-console

to lose your gnome-session. If you want to try for the system freeze, do

  mount -o remount,exec /dev

first. For me it could take anywhere from a few seconds to a minute for the system to freeze (so I didn't immediately realise what was causing it).

> do you get any edid output at all ? or does this only trigger on hw that
> does not output edid data ?

Using monitor-edid it triggers if either

  - it doesn't find the edid info in /sys/class/drm/card*
  - you run it as root and don't specify the --first option
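
A rough way to see whether the first case applies on a given machine is to list the connectors that expose a non-empty EDID blob in sysfs. This is only a sketch: the function name and the optional directory argument are made up for illustration, and it assumes the kernel exposes the data as an edid file inside each connector directory under /sys/class/drm.

```shell
# Print the name of every DRM connector directory whose edid file
# contains data. The optional argument (hypothetical, mainly for
# testing) replaces /sys/class/drm.
connectors_with_edid() {
    root="${1:-/sys/class/drm}"
    for f in "$root"/card*/edid; do
        [ -e "$f" ] || continue
        # sysfs may report a size of 0 for this file, so read it
        # and count the bytes rather than trusting the file size.
        if [ "$(cat "$f" | wc -c)" -gt 0 ]; then
            basename "$(dirname "$f")"
        fi
    done
}
```

If connectors_with_edid prints nothing, monitor-edid would not find EDID info there and would fall through to the VBE probe.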
Comment 4 Thomas Backlund 2021-01-18 12:45:00 CET
(In reply to Martin Whitaker from comment #3)
> (In reply to Thomas Backlund from comment #2)
> > is this on real hw or virtualbox ?
> 
> Both
> 
> > interestingly I cant get it to crash here on wayland
> 
> Try
> 
>   monitor-get-edid-using-vbe -v
> 
> for the "VBE: could not initialize LRMI" then
> 
>   monitor-get-edid-using-vbe -v --try-in-console
> 
> to lose your gnome-session. If you want to try for the system freeze, do

I dont lose it.
I get switched to gdm login, but can switch  back to my open session with Ctrl-Alt-F2

> 
>   mount -o remount,exec /dev
>

Not here. it still works on real hw.


I only get:
#   monitor-get-edid-using-vbe -v --try-in-console
VBE: Error (0x4f00): 0x4f00
VBE: Error (0x4f00): 0x4f00


but nothing hangs/freezes/...
 
> first. For me it could take anywhere from a few seconds to a minute for the
> system to freeze (so I didn't immediately realise what was causing it).
> 
> > do you get any edid output at all ? or does this only trigger on hw that
> > does not output edid data ?
> 
> Using monitor-edid it triggers if either
> 
>   - it doesn't find the edid info in /sys/class/drm/card*
>   - you run it as root and don't specify the --first option

Not here on real hw

I will try in vbox next...
Comment 5 Martin Whitaker 2021-01-18 21:40:06 CET
Further testing shows that Ctrl-Alt-F2 will switch me back to a still-working session on one machine. On two others it takes me back to a blank screen which echoes key presses and has a movable mouse pointer (so a remnant of the original session with most of the processes dead). Ctrl-C will exit that, after which I'm able to start a new session from the GDM login screen.

The freeze only occurs on one machine. Oddly, that one does return sensible VBE information first. On a second machine the VBE requests time out, but have no observable side effects. On a third machine the VBE requests return sensible information and have no observable side effects.

So the freeze is probably due to a buggy BIOS.
Comment 6 Ben McMonagle 2021-01-22 20:31:32 CET
Asus u31s is an affected model.

dual graphics : intel 810 or later / nvidia390 drivers required

CC: (none) => westel

Comment 7 Ben McMonagle 2021-01-22 20:36:08 CET
failure was triggered from MCC "set up the graphical server" 

never from terminal
Comment 8 Lewis Smith 2021-01-23 20:26:49 CET
To confuse things further...
Running on real hardware:
 $ inxi -SGxx
System:
  Host: localhost Kernel: 5.10.8-desktop-1.mga8 x86_64 bits: 64 
  compiler: gcc v: 10.2.1 Desktop: GNOME 3.38.3 tk: GTK 3.24.24 
  wm: gnome-shell dm: GDM, LightDM, LXDM Distro: Mageia 8 mga8 
Graphics:
  Device-1: AMD Wrestler [Radeon HD 7310] vendor: Acer Incorporated ALI 
  driver: radeon v: kernel bus ID: 00:01.0 chip ID: 1002:9809 
  Display: wayland server: Mageia X.org 1.20.10 compositor: gnome-shell 
  driver: ati,radeon,v4l resolution: 1366x768~60Hz s-dpi: 96 
  OpenGL: 
  renderer: AMD PALM (DRM 2.50.0 / 5.10.8-desktop-1.mga8 LLVM 11.0.1) 
  v: 3.3 Mesa 20.3.3 compat-v: 3.1 direct render: Yes 

I had previously run Gnome (GNOME menu entry, implies Wayland) via LightDM, and nothing failed except the old:
 $ sudo monitor-get-edid-using-vbe
 mmap /dev/zero: Operation not permitted
Ran drakx11 both via MCC and command line, all OK 'Test' included. Then I discovered that it was running X11...

Changed to GDM, re-booted, chose GNOME session; confirmed this time it *is* Wayland:
 $ echo $XDG_SESSION_TYPE
 wayland
 $ echo $WAYLAND_DISPLAY
 wayland-0
drakx11 half worked, either from MCC or command line. It runs OK, but the 'Test' showed a black screen. Fortunately that returned to the GUI after its timeout, no breakages.

CC: (none) => lewyssmith

Comment 9 Martin Whitaker 2021-01-23 21:39:28 CET
@Lewis, when using GDM, what is the output from

  cat /sys/class/tty/tty0/active

and

  sudo monitor-get-edid-using-vbe -v --try-in-console
Comment 10 Lewis Smith 2021-01-23 22:03:31 CET
You just caught me before retiring! Your wish etc...
From Xfce-4 terminal under Gnome:

$ cat /sys/class/tty/tty0/active
tty2

$ sudo monitor-get-edid-using-vbe -v --try-in-console
[sudo] password for lewis: 
mmap /dev/zero: Operation not permitted
VBE: could not initialize LRMI
mmap /dev/zero: Operation not permitted
VBE: could not initialize LRMI

The screen went momentarily black (like the drakx11 test), then reverted to the GUI, then went all black. I did Ctrl/Alt/F1 and that gave me the GDM login screen. Giving the password, to my delight that restored the full session. More like a screenlock situation.
Comment 11 Martin Whitaker 2021-01-23 22:15:45 CET
Yes, on some machines the GNOME session goes into an inactive state when monitor-get-edid-using-vbe switches to the console and is recoverable, on other machines it is not. Sometimes, if the switch to and from the console happens quickly enough, the GNOME session doesn't notice it, and you don't hit the bug.
Comment 12 Martin Whitaker 2021-01-24 12:02:31 CET
Further tests on my machine with the keyboard/freeze problem show that it only occurs when booted in UEFI mode, not in legacy mode. The code that gets the EDID info via a BIOS call allocates a block of low memory for the returned data. My guess is that when booted in UEFI mode, it happens to hit a region of memory being used by the UEFI BIOS.

My suggestions for fixing the various issues exposed in this bug report are:

1. Don't use the --try-in-console option when calling monitor-edid from drakx tools. That fixes the issue with GDM+GNOME. Other DMs run the DE in vt1, so it is ignored anyway (plus monitor-get-edid-using-vbe hasn't been working for some years due to /dev being mounted without exec permissions).

2. Modify monitor-edid to not use monitor-get-edid-using-vbe when booted in UEFI mode (test for existence of /sys/firmware/efi).

3. Modify monitor-edid to look in /sys/class/drm/card* for EDID info before trying anything else.

4. Modify libx86 to not include exec permissions when mapping memory. I doubt either monitor-get-edid-using-vbe or vbetool really need exec permissions.
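
Suggestions 2 and 3 together amount to a simple decision order, roughly sketched below. This is not the code that went into monitor-edid; the function names and the optional path arguments are hypothetical, and the sketch assumes the EDID blobs appear as edid files inside the connector directories under /sys/class/drm.

```shell
# Succeed if the system was booted in UEFI mode.
# The optional argument (for testing) replaces /sys/firmware/efi.
booted_in_uefi() {
    [ -d "${1:-/sys/firmware/efi}" ]
}

# Succeed if any DRM connector exposes a non-empty EDID blob.
# The optional argument (for testing) replaces /sys/class/drm.
drm_has_edid() {
    for f in "${1:-/sys/class/drm}"/card*/edid; do
        [ -e "$f" ] || continue
        [ "$(cat "$f" | wc -c)" -gt 0 ] && return 0
    done
    return 1
}

# Print which EDID source to use: "drm", "vbe" or "none".
# $1 = DRM sysfs directory, $2 = EFI sysfs directory (both optional).
choose_edid_source() {
    if drm_has_edid "$1"; then
        echo drm      # suggestion 3: prefer sysfs before anything else
    elif booted_in_uefi "$2"; then
        echo none     # suggestion 2: skip the VBE probe under UEFI
    else
        echo vbe      # legacy boot: the VBE probe is still an option
    fi
}
```

Calling choose_edid_source with no arguments applies the same order to the running system.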
Comment 13 Lewis Smith 2021-01-24 20:18:28 CET
My box is EFI. Would not have dreamed it matters.

CC: lewyssmith => (none)

Martin Whitaker 2021-01-26 00:03:53 CET

Assignee: pkg-bugs => mageia
Status comment: (none) => fixed in git / svn

Comment 14 Dave Hodgins 2021-01-28 21:47:23 CET
Closing as fixed.

CC: (none) => davidwhodgins
Resolution: (none) => FIXED
Status: NEW => RESOLVED

