| Summary: | Repeated Xorg login failures with nvidia graphics hardware | | |
|---|---|---|---|
| Product: | Mageia | Reporter: | Len Lawrence <tarazed25> |
| Component: | Release (media or process) | Assignee: | Mageia Bug Squad <bugsquad> |
| Status: | NEEDINFO --- | QA Contact: | |
| Severity: | minor | | |
| Priority: | Normal | CC: | davidwhodgins, lewyssmith, marja11, sysadmin-bugs, tarazed25 |
| Version: | Cauldron | | |
| Target Milestone: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Source RPM: | | CVE: | |
| Status comment: | | | |
| Attachments: | Journal output from test run of Mageia-9-beta2 on x86_64 box with nvidia graphics | | |

Description
Len Lawrence
2023-05-08 20:22:56 CEST
Thanks for this curious report. Do you think it is related to the specific nVidia card, GeForce GTX 1080 Ti? Please post the output of:
$ inxi -G
to clarify the graphics.
You refer to "one specific machine". Do you have another with nVidia graphics and Mageia 9 in some form that does not show the problem? [which is almost the inverse of that whereby Gnome & its relatives would not log in, due to a fault in a specific version of glib].
Perhaps after a fresh boot, one successful login/logout to Gnome, and a bounced login to a different desktop, attach the journal:
$ journalctl -b --no-hostname | xz > journal.txt.xz
CC: (none) => lewyssmith
This is the only nvidia machine here so I cannot comment on how specific this might be to the graphics card. Thanks for the suggestions. The problem has been aired on QA Discuss but nobody else seems to have encountered the problem. Not many nvidia users left.
$ inxi -G
Graphics:
Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nouveau v: kernel
Device-2: PCTV Systems tripleStick 292e type: USB driver: em28xx
Display: wayland server: X.Org v: 22.1.9 with: Xwayland v: 22.1.9
compositor: gnome-shell v: 44.1 driver: X: loaded: nouveau,v4l dri: nouveau
gpu: nouveau resolution: 2560x1440~60Hz
API: OpenGL v: 4.3 Mesa 23.0.3 renderer: NV132
Booted the test system. Logged in to GNOME Classic; OK.
Logged out and attempted login to Cinnamon; failed.
Logged in to GNOME Classic.
$ journalctl -b --no-hostname | xz > journal.txt.xz
Attaching the journal output.
Created attachment 13821 [details]
Journal output from test run of Mageia-9-beta2 on x86_64 box with nvidia graphics
CC: (none) => tarazed25
Repeated earlier experiments with graphics drivers nouveau and modesetting.
DEs GNOME and GNOME Classic were accessible but none of the others (Cinnamon, Mate, Xfce, Plasma).
For GNOME login with modesetting:
$ inxi -G
Graphics:
Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nouveau v: kernel
Device-2: PCTV Systems tripleStick 292e type: USB driver: em28xx
Display: wayland server: X.Org v: 22.1.9 with: Xwayland v: 22.1.9
compositor: gnome-shell v: 44.1 driver: X: loaded: modesetting,v4l
dri: nouveau gpu: nouveau resolution: 2560x1440~60Hz
API: OpenGL v: 4.3 Mesa 23.0.3 renderer: NV132
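Since the thread keeps returning to which kernel driver is actually bound to the card, a small helper can pull that field out of `inxi -G` output. This is a minimal sketch, assuming the `Device-N: ... driver: NAME` layout seen in the outputs quoted in this report; the function name `gpu_kernel_driver` is hypothetical.

```shell
# gpu_kernel_driver (hypothetical helper): read `inxi -G` output on stdin and
# print the kernel driver named on the first "Device-N:" line mentioning NVIDIA.
# Assumes the "driver: NAME" field layout shown in this report.
gpu_kernel_driver() {
    awk '
        /Device-[0-9]+:/ && /NVIDIA/ {
            for (i = 1; i < NF; i++)
                if ($i == "driver:") { print $(i + 1); exit }
        }
    '
}
```

Typical use would be `inxi -G | gpu_kernel_driver`, which prints nothing if no NVIDIA device line is present.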
Looking at comment 2 I have to wonder where is the proprietary driver?
Tried reinstalling it via drakx11 and ended up with a system which refused ALL logins on the test partition. I had not seen the usual message about rebuilding the nvidia driver after reboot.
Next removed dkms-nvidia-current and reinstalled it and rebooted. Ended up with a blank screen with a blinking cursor. Raised a console and ran startx - nothing doing - there was a problem with xauth.
At this point a reinstallation or upgrade is in order but need to capture the journal.
In reply to Dave Hodgins, comment 5:
No, the permissions are 0600.
The Xorg.0.log file says module nvidia cannot be found. Removed /etc/X11/xorg.conf and tried again. This time the login to GNOME Classic succeeded. User's .Xauthority contains the MIT-MAGIC-COOKIE.
# updatedb
# locate nvidia |grep -v lib|grep -v /home|grep -v share|grep -v src| less
/etc/dracut.conf.d/99-nvidia.conf
/usr/bin/nvidia-modprobe
/usr/bin/nvidia-ngx-updater
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-smi
$ inxi -G
Graphics:
Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nouveau v: kernel
Device-2: PCTV Systems tripleStick 292e type: USB driver: em28xx
Display: wayland server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9
compositor: gnome-shell v: 44.1 driver: X: loaded: v4l gpu: nouveau
resolution: 2560x1440
API: OpenGL Message: No GL data found on this system.
Continuing from comment 7. No xorg.conf.
Logged in to GNOME (Wayland) and GNOME Classic. Failed for IceWM and Cinnamon Software Rendering. Back to GNOME Classic.
# nvidia-settings
ERROR: NVIDIA driver is not loaded
Segmentation fault (core dumped)
Ran drakx11. Enabled Translucency, RENDER acceleration, Force display mode of DVI.
Forcing DVI enforces the use of the frame buffer (potential slowdown?). Looks like a bad idea; drakx11 stalls waiting for the udev queue to empty.
Went back and undid the framebuffer connection.
Tried to complete and hit "Fatal server error - try to change some parameters".
This is the output from the stalled drakx11:
Too late to run INIT block at /usr/lib64/perl5/vendor_perl/Glib/Object/Introspection.pm line 257.
Ignore the following Glib::Object::Introspection & Gtk3 warnings
Subroutine Gtk3::main redefined at /usr/share/perl5/vendor_perl/Gtk3.pm line 539.
Timed out for waiting the udev queue being empty.
Crashed out at that point. nvidia-settings crashes with a core dump.
Repeated drakx11 and accepted default options.
xorg.conf contains
Section "Device"
Identifier "device1"
VendorName "NVIDIA Corporation"
BoardName "NVIDIA GeForce 745 series and later"
Driver "nvidia"
Option "DPMS"
Option "DynamicTwinView" "false"
Option "AddARGBGLXVisuals"
EndSection
Reboot.
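Because logins in this report succeed or fail depending on whether xorg.conf names a driver that can actually be loaded, it may help to list the configured drivers before rebooting. The following is a minimal sketch, assuming the usual `Driver "name"` syntax as in the section quoted above; the function name `xorg_drivers` is hypothetical.

```shell
# xorg_drivers (hypothetical helper): print the value of every Driver line
# found in xorg.conf-style text read from stdin. Comment lines are skipped.
xorg_drivers() {
    awk '
        /^[[:space:]]*#/ { next }
        tolower($1) == "driver" {
            gsub(/"/, "", $2)   # strip the surrounding double quotes
            print $2
        }
    '
}
```

It could be run as `xorg_drivers < /etc/X11/xorg.conf`, or over the files in /etc/X11/xorg.conf.d/ where those exist.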
Continuing from comment 9. Could not log in at all. Removed xorg.conf and performed a cold boot. Back to GNOME and GNOME Classic. During the reboot there were several error messages including the string "maybe USB cable is bad". No xorg.conf and no rebuilding of the nvidia kmod during boot.
Installed dkms-nvidia-current and rebooted. No USB cable error messages this time and no sign of the nvidia kmod being built. `inxi -G` reports that the Xorg nouveau driver is being used. No xorg.conf file in /etc/X11.
Switched to another partition with a GNOME system already installed (from a Live session). That appeared to be running nouveau so I tried upgrading using the classic iso and installing the nvidia driver. No errors but it did not appear to take because this was the result:
$ inxi -G
Graphics:
Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nouveau v: kernel
Device-2: PCTV Systems tripleStick 292e type: USB driver: em28xx
Display: x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X:
loaded: modesetting,v4l dri: nouveau gpu: nouveau resolution: 2560x1440~60Hz
API: OpenGL v: 4.3 Mesa 23.0.3 renderer: NV132
GNOME was functioning OK and it was possible to install Mate (task-mate-minimal and task-mate) and then to login to Mate.
About to test successive logins....
So far so good:
Gnome Classic on Wayland
IceWM session
GNOME on Xorg
xorg.conf identifies the driver as modesetting with option DPMS.
Going back to Mate. That worked.
One major issue that emerges from all this is that the nvidia-current kernel module does not get built.
$ echo $XDG_SESSION_TYPE
x11
$ echo $XAUTHORITY
/home/lcl/.Xauthority
$ locate nvidia |grep -v lib|grep -v /home|grep -v share|grep -v src|grep -v /data
/etc/nvidia-current
/etc/OpenCL/vendors/nvidia.icd
/etc/X11/xorg.conf.d/10-nvidia.conf
/etc/dracut.conf.d/99-nvidia.conf
/etc/nvidia-current/ld.so.conf
/etc/nvidia-current/modprobe.conf
/etc/nvidia-current/nvidia-settings.xinit
/etc/vulkan/icd.d/nvidia_icd.json
/etc/vulkan/implicit_layer.d/nvidia_layers.json
/usr/bin/nvidia-bug-report.sh
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-modprobe
/usr/bin/nvidia-ngx-updater
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-powerd
/usr/bin/nvidia-settings
/usr/bin/nvidia-sleep.sh
/usr/bin/nvidia-smi
/usr/bin/nvidia-xconfig
$ ls /etc/nvidia-current
ld.so.conf modprobe.conf nvidia-settings.xinit
What's the output of "dkms status"?
# dkms status
nvidia-current, 525.116.03-1.mga9.nonfree, 6.3.1-desktop-1.mga9, x86_64: installed
That was after having tried to reinstall nvidia:
$ drakx11
Too late to run INIT block at /usr/lib64/perl5/vendor_perl/Glib/Object/Introspection.pm line 257.
Ignore the following Glib::Object::Introspection & Gtk3 warnings
Subroutine Gtk3::main redefined at /usr/share/perl5/vendor_perl/Gtk3.pm line 539.
Timed out for waiting the udev queue being empty.
Timed out for waiting the udev queue being empty.
Timed out for waiting the udev queue being empty.
Timed out for waiting the udev queue being empty.
^C
As root:
# drakx11
Too late to run INIT block at /usr/lib64/perl5/vendor_perl/Glib/Object/Introspection.pm line 257.
Ignore the following Glib::Object::Introspection & Gtk3 warnings
Subroutine Gtk3::main redefined at /usr/share/perl5/vendor_perl/Gtk3.pm line 539.
(drakx11:447041): IBUS-WARNING **: 08:09:19.080: The owner of /home/lcl/.config/ibus/bus is not root!
Timed out for waiting the udev queue being empty.
^C
The proprietary driver was chosen to cover the range "GeForce 745 and later" to match GeForce GTX 1080 Ti.
Tried root again but switching to root in the proper fashion (su -) and the IBUS warning disappeared. Otherwise the same result needing ^C to exit.
The situation seems to be going from bad to worse. Logout failed. In the end had to crash out with CtrlAltDel and log in to a Mageia 8 partition, which took longer than expected. Going to hose the bad mga9 partition, update the bootloader and reinstall on the blank partition.
Cleaned the partition and formatted it looking for bad blocks. Tried to install all desktops except GNOME. Seemed to work but the installation eventually appeared to be hung. Hammered the cancel key until something happened. The installation restarted and was soon finished with a free graphics driver installed. It rebooted with nouveau.
Selected Cinnamon, Xfce, Mate, Plasma (X11), IceWM, LXDE. All logins successful. In LXDE firefox produced "Invalid menu entry" and aborted but the browser could be launched from the cli. Back to Mate. One oddity is that the splash screen is stuck with one of the standard Xfce backgrounds but Mageia 9 Plymouth appears intermittently.
Next step is to install the GNOME DE, if possible. On second thoughts - try the nvidia driver first.
Lots of dracut modules could not be installed, such as squash, memstrack, nvmf, iscsi... because associated commands could not be found and others like
"dracut: dracut module 'ifcfg' depends on 'network', which can't be installed"
# dkms status
nvidia-current, 525.116.03-1.mga9.nonfree, 6.3.1-desktop-1.mga9, x86_64: installed
After reboot, in Mate
$ inxi -G
Graphics:
Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nvidia v: 525.116.03
Device-2: PCTV Systems tripleStick 292e type: USB driver: em28xx
Display: x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X:
loaded: nvidia,v4l gpu: nvidia resolution: 2560x1440~60Hz
API: OpenGL v: 4.6.0 NVIDIA 525.116.03 renderer: NVIDIA GeForce GTX 1080
Ti/PCIe/SSE2
So, it can be done but we have no explanation about what went wrong before.
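One thing the earlier runs show is that `dkms status` can report the module as installed while the card is still driven by nouveau, so the two facts are worth checking separately. Below is a minimal sketch, assuming the comma-separated `dkms status` format quoted in this report (other dkms versions may print a different layout); the function names are hypothetical.

```shell
# dkms_built_for MODULE KERNEL (hypothetical helper)
# Reads `dkms status` output on stdin, in the form
#   name, version, kernel, arch: installed
# and succeeds only if MODULE is listed as installed for KERNEL.
dkms_built_for() {
    awk -F', *' -v mod="$1" -v kver="$2" '
        $1 == mod && $3 == kver && $4 ~ /: installed[[:space:]]*$/ { found = 1 }
        END { exit !found }
    '
}

# module_loaded NAME (hypothetical helper)
# Reads /proc/modules-style text on stdin and succeeds if NAME is the
# first field of any line, i.e. the module is currently loaded.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

For example, `dkms status | dkms_built_for nvidia-current "$(uname -r)"` followed by `module_loaded nvidia < /proc/modules`. The loaded module name `nvidia` is an assumption based on the `driver: nvidia` field that inxi reports above.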
Switching to Plasma - working fine.
Cinnamon software rendering works also.
Installed GNOME.
Logged in to GNOME Classic on Xorg : OK
Plain GNOME : OK
Was expecting a Wayland session but XDG_SESSION_TYPE=x11
GNOME on Xorg : OK
There must be something else to be installed to get Wayland on board.
Logged in to Mate.
So the login process works fine in an all Xorg environment.
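The XDG_SESSION_TYPE checks above were made by hand; a small function can make the same check repeatable after each login. This is a minimal sketch, assuming the standard `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY` and `DISPLAY` environment variables; the function name `session_kind` is hypothetical.

```shell
# session_kind (hypothetical helper): report whether the current session
# looks like Wayland or X11, judging only by environment variables.
session_kind() {
    if [ -n "${XDG_SESSION_TYPE:-}" ]; then
        # Set by the login manager for the session.
        printf '%s\n' "$XDG_SESSION_TYPE"
    elif [ -n "${WAYLAND_DISPLAY:-}" ]; then
        # Checked before DISPLAY because Xwayland sets DISPLAY too.
        printf 'wayland\n'
    elif [ -n "${DISPLAY:-}" ]; then
        printf 'x11\n'
    else
        printf 'unknown\n'
    fi
}
```

Running `session_kind` in a terminal after each login gives one word to note down alongside the desktop and display manager in use.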
Thanks for all your extra tests, Len. There was/is one fundamental question which got overlooked:
- What display manager?
- And if you changed it to another?
Chasing display drivers may have been barking up the wrong tree. We know that if a multi-desktop installation includes Gnome, GDM is the default display manager, which should only use Wayland for Gnome itself and X11 for the others. [I have found that sometimes other display managers use X11 even for Gnome...]
I find an easy way to check, once in a session, is:
$ env | grep -Ei 'wayland|x11'
There were three that I can remember but it tended to be GDM for systems containing GNOME. When GNOME was left out Plasma attracted SDDM. There was a third one at some point - in the latest round of tests I used whatever was offered. The third is the simple one with a pale blue background - is that XDM? I checked XDG_SESSION at random, not in a regular fashion, and did not note the results.
Looking back at comment 22 treating the system where nvidia was successfully installed, as far as I can remember there was no Wayland version offered although the Xwayland driver was loaded. That probably makes sense because GNOME was added after the first reboot, i.e. on an Xorg system. Plasma Wayland was also missing - same thing.
From what other users say this bug does not reproduce on other nvidia hardware of similar age. It will be EOL as far as nvidia is concerned fairly soon so maybe the incompatibility with GNOME Wayland should simply be accepted and the bug closed as WONTFIX.
Is this bug still present in Cauldron? It is over a year later and most of Cauldron has changed.
Status: NEW => NEEDINFO