Bug 31994 - Virtual tty terminals are black when using nvidia proprietary driver
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All, OS: Linux
Priority: Normal, Severity: major
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
Keywords: IN_ERRATA8, IN_ERRATA9
Depends on: 32623 32628
 
Reported: 2023-06-08 00:01 CEST by Morgan Leijström
Modified: 2023-12-26 00:30 CET
Source RPM: kernel

Description Morgan Leijström 2023-06-08 00:01:03 CEST
Description of problem:
 Virtual terminals are virtually unusable: they work but nothing is displayed.
 The screen is just black.
 The monitor stays on, so there is some signal.

Version & how reproducible:
 I noticed it already on mga8 with nvidia470, but did not care to report it then
 Now I see it with Mageia 9 and nvidia-current (R525)
 Graphics card is GTX750, screen is 4K

 When I instead use Xorg modesetting driver, the tty works as intended.

Steps to Reproduce:
 Use nvidia proprietary driver
 Boot to desktop
 Ctrl-Alt-F3 -> Screen is black
 Type your username <enter> password anyway (still just black)
 Ctrl-Alt-F1 (F2 if you use Gnome) -> back to desktop
 Check the journal, and you will see that you did log in on that terminal
Comment 1 Morgan Leijström 2023-06-08 00:39:18 CEST
BTW, DE=Plasma, DM=LightDM

From QA-list, Martin W:
> If you have a vga= option on your boot command line, try changing it to vga=text or removing it altogether. Does that help?

If I edit it to vga=text and try to start, grub informs me
(the messages were in Swedish; translated here)
-----------------
  Start a command list
Reading in Linux 6.3.6-desktop-1.mga9...
error: ../../grub-core/kern/misc.c:500:unknown number.
Reading in initial ramdisk...
error: ../../grub-core/loader/i386/pc/linux.c:422:you need to read in the kernel first
-----------------
It then returns to edit mode


If I remove the vga entry it boots normally

I tried a bit more:
This time I tried several terminals (tty3, tty4, tty6), switching back to the desktop and back to a tty. tty4 and tty6 worked once, but mostly they did not.
Worst of all, the desktop suddenly went all black too, except for a frozen mouse cursor. It did not respond to Ctrl-Alt-Backspace, Ctrl-Alt-Backspace-Backspace, Ctrl-Alt-Del-Del, or even REISUB! I had to press reset.
Not fun. This is my production system...
Comment 2 Giuseppe Ghibò 2023-06-08 01:09:01 CEST
This behaviour apparently also occurs on other distros and on other OSes (such as macOS, at least for as long as it supported NVIDIA cards).

Suggestions range from adding 'nomodeset' to the boot command line, to using vga=<res> (where <res> is the code of a VGA resolution that is the SAME as the final desktop resolution), to adding GRUB_GFXMODE=<w>x<h> in /etc/default/grub.

There are also reports relating this to kernel regressions (e.g.: https://forums.developer.nvidia.com/t/framebuffer-output-stops-since-linux-kernel-6/244467), but I am not sure whether they are obsolete. A simple test would be to try a kernel older than 6.x, e.g. an mga8 kernel (but that requires a couple of extra packages from mga8).

CC: (none) => ghibomgx

Comment 3 Brian Rockwell 2023-06-08 01:37:16 CEST
Morgan, is this the <ctl><alt>f2 type of terminal, or are you talking about Konsole/terminal from the menu?

I did see this a bit using the terminal from GNOME.  I had an easy fix: I changed the colors and that resolved it.

If it is <c><a>f2/f3 etc., it isn't an issue on my Ryzen/NVIDIA 1050 machine.

$ nvidia-smi
Wed Jun  7 18:35:50 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+

What resolution is your monitor?

CC: (none) => brtians1

Comment 4 Brian Rockwell 2023-06-08 01:44:39 CEST
730GT running nouveau

I'm not seeing the issue using nouveau (as you noted, you are running proprietary nvidia and I'm not).

On this box, I do see an occasional freeze, but I chalk that up to certain special effects and Nouveau and the box being quite aged (hardware).
Comment 5 Morgan Leijström 2023-06-08 08:26:42 CEST
Ctrl-Alt-F2 etc

Thank you Giuseppe, I will try later when I have figured out what to try:

Ahhh this is so convoluted...

I noticed the problem in mga8 with nvidia470, but I believe it worked in the initial mga8.  Maybe I should try an mga8 live just to see... That would be nvidia460 and an older kernel.

As it is confirmed on other distros too, the best approach is to collect known workarounds.
I find it strange that mine is the first bug report about this on Mageia.

How do I find the value for vga=<res>?
My screen is "4K UHD" 3840x2160

----

This is a legacy BIOS system, so OK.
Note for others who test: on a UEFI-booting system, the variable gfxpayload should be set to the desired resolution in grub.cfg.

https://stackoverflow.com/questions/47442204/how-does-linux-kernel-parse-vga-parameter

https://www.gnu.org/software/grub/manual/grub/grub.html#gfxmode

----

Or should I use video= ?
At https://unix.stackexchange.com/questions/92817/what-happened-to-vga-ask-in-newer-kernels I see:

KMS-enabled kernels overrule any vga= setting before init completes, when modesetting is initiated, functionally making whether vga=ask works or not moot.

Instead, use video= with the specific mode desired on the vttys. With video=, you're not limited to VESA modes - any mode supported by the display can be used.
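
For anyone who wants to try this, a minimal Python sketch of how such a parameter could be put together. It covers only the `<xres>x<yres>[@<refresh>]` subset of the video= syntax, optionally prefixed by a connector name. The function name `video_param`, the 1920x1080 and 1280x1024 modes, and the connector name `DP-1` are illustrative assumptions, not values from this report, and whether the proprietary nvidia driver honours video= at all is not established here.

```python
from typing import Optional


def video_param(width: int, height: int,
                refresh: Optional[int] = None,
                connector: Optional[str] = None) -> str:
    """Build a video= kernel parameter for one fixed mode.

    Only width x height, an optional refresh rate and an optional
    connector prefix are handled; the other modifiers of the real
    syntax are left out of this sketch.
    """
    mode = f"{width}x{height}"
    if refresh is not None:
        mode += f"@{refresh}"
    if connector is not None:
        mode = f"{connector}:{mode}"
    return f"video={mode}"


# Hypothetical values, for illustration only.
print(video_param(1920, 1080, refresh=60))        # video=1920x1080@60
print(video_param(1280, 1024, connector="DP-1"))  # video=DP-1:1280x1024
```

The resulting string would be appended to the kernel command line in place of the vga= entry.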

----

(In reply to Brian Rockwell from comment #4)
> 730GT running nouveau

Could you try with nvidia470?

> I'm not seeing the issue using nouveau (as you noted, running proprietary
> nvidia and I'm not).
> 
> on this box, I do see an occasional freeze, but I chock that up to certain
> special effects and Nouveau and the box being quite aged (hardware).

For this problem, have you tried whether Xorg modesetting works better?

----

For errata: Make a subchapter in https://wiki.mageia.org/en/Setup_the_graphical_server#Nvidia_proprietary_drivers describing the problem and workarounds, and point there from errata.

Keywords: (none) => FOR_ERRATA9

Comment 6 Morgan Leijström 2023-06-08 08:34:40 CEST
"Unknown command line parameters"??

$ sudo journalctl -b|grep 'command line'

jun 08 00:19:12 svarten.tribun kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-6.3.6-desktop-1.mga9 root=/dev/mapper/vg--mga-lv_root ro noiswmd nokmsboot resume=/dev/vg-mga/lv_swap audit=0 vga=797

jun 08 00:19:12 svarten.tribun kernel: Unknown kernel command line parameters "noiswmd nokmsboot BOOT_IMAGE=/vmlinuz-6.3.6-desktop-1.mga9 vga=797", will be passed to user space.
Comment 7 Dave Hodgins 2023-06-08 18:42:08 CEST
(In reply to Morgan Leijström from comment #6)
> "Unknown command line parameters"??

The unknown (to the kernel) parameters message just means those parameters
were not recognized by the kernel for its own use. They are still available
to be processed by systemd, its units, by initd startup scripts in
/etc/rc.d/init.d/, or any program that parses /proc/cmdline.
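
To illustrate the last point, a minimal Python sketch of how a user-space program might split the kernel command line into parameters. The function name `parse_cmdline` is hypothetical, and the sample string is the command line quoted in comment 6; on a running system the same text can be read from /proc/cmdline.

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}.

    Bare flags such as "ro" map to None. If a name occurs more than
    once, the last occurrence wins in this sketch.
    """
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Command line copied from the journal output in comment 6.
sample = (
    "BOOT_IMAGE=/vmlinuz-6.3.6-desktop-1.mga9 "
    "root=/dev/mapper/vg--mga-lv_root ro noiswmd nokmsboot "
    "resume=/dev/vg-mga/lv_swap audit=0 vga=797"
)

params = parse_cmdline(sample)
print(params["vga"])          # 797
print("nokmsboot" in params)  # True
```

Parameters the kernel does not recognize still appear in this text, so any program reading it can pick them up.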

Install the kernel-doc package if it's not already installed.
See the output of
grep -r -e 'vga=' -e 'video=' /usr/share/doc/kernel-doc/*
for which files have information on those parameters.

3840x2160 is for graphics mode. You'll likely want a lower resolution for
the text mode terminals.

CC: (none) => davidwhodgins

Comment 8 Giuseppe Ghibò 2023-06-08 22:42:52 CEST
(In reply to Morgan Leijström from comment #5)

> Ctrl-Alt-F2 etc
> 
> Thank you Guiseppe, will try later whan I figured out what:
> 
> Ahhh this is so convoluted...
> 
> I noticed the problem appeared in mga8 with nvidia470, but i believe it
> worked in initial mga8.  Maybe i should try a mga8 live just to see... It is
> then nvidia460 and elder kernel.
> 
> As it is confirmed also on other distros, best is to scrape known
> workarounds.
> I find it strange I am the first bug about this on Mageia.
> 
> How do I find the value for vga=<res>?
> My screen is "4K UHD" 3840x2160
> 

vga=<code> where <code> corresponds to resolution according to some table. E.g. here: https://en.wikipedia.org/wiki/VESA_BIOS_Extensions

E.g. vga=791 corresponds to 1024x768. I don't see a vga code for 3840x2160.
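
The Linux vga= numbers are the VESA mode numbers plus 0x200. A minimal Python sketch of that lookup follows, covering only the sixteen commonly tabulated graphics modes (640x480 through 1280x1024 at 8, 15, 16 and 24 bits). The names `VESA_MODES` and `decode_vga` are illustrative, and a value missing from this table is merely unknown to the sketch; that alone does not show that a given BIOS or driver rejects it.

```python
from typing import Optional, Tuple

VESA_OFFSET = 0x200  # Linux vga= number = VESA mode number + 0x200

# VESA mode number -> (width, height, bits per pixel).
# Only the standard 4x4 table of graphics modes is listed here.
VESA_MODES = {
    0x101: (640, 480, 8),    0x103: (800, 600, 8),
    0x105: (1024, 768, 8),   0x107: (1280, 1024, 8),
    0x110: (640, 480, 15),   0x113: (800, 600, 15),
    0x116: (1024, 768, 15),  0x119: (1280, 1024, 15),
    0x111: (640, 480, 16),   0x114: (800, 600, 16),
    0x117: (1024, 768, 16),  0x11A: (1280, 1024, 16),
    0x112: (640, 480, 24),   0x115: (800, 600, 24),
    0x118: (1024, 768, 24),  0x11B: (1280, 1024, 24),
}


def decode_vga(value: int) -> Optional[Tuple[int, int, int]]:
    """Return (width, height, bpp) for a decimal vga= value, or None."""
    return VESA_MODES.get(value - VESA_OFFSET)


print(decode_vga(791))  # (1024, 768, 16)
print(decode_vga(797))  # None: not in this table
```

The value 797 from the command line in comment 6 falls outside this table.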

Maybe the problem is just the newer HiDPI resolution. Maybe at some point you switched to 3840x2160 and the problems started then. What happens if you test everything in Full HD instead of 4K?


> ----
> 
> This is a legacy BIOS system, so OK.
> I note that for others to test, if it is a UEFI booting system, the variable
> gfxpayload should be set to desired resolution in grub.cfg 
> 
> https://stackoverflow.com/questions/47442204/how-does-linux-kernel-parse-vga-parameter
> 
> https://www.gnu.org/software/grub/manual/grub/grub.html#gfxmode
> 

Did it change anything?
Comment 9 Morgan Leijström 2023-06-09 16:02:47 CEST
No time to play...

For now, entered in https://wiki.mageia.org/en/Mageia_9_Errata#Nvidia

Keywords: FOR_ERRATA9 => IN_ERRATA9

Comment 10 Morgan Leijström 2023-06-09 22:16:57 CEST
Just checked: Booting Mageia 8 Live Plasma from USB, choosing nvidia proprietary drivers, I see the same problem there

https://wiki.mageia.org/en/Mageia_8_Errata#Nvidia

Keywords: (none) => IN_ERRATA8

Comment 11 Morgan Leijström 2023-06-11 14:28:04 CEST
Strange, on same system I checked a fresh USB Live mga8 Plasma:
 the tty terminals work OK.

kernel: 5.10.16-desktop-1.mga8
nvidia-current: 460.39

I could not test other kernels due to
Bug 32006 - Mga8 Live kernel selection is broken

For errata 8: state which versions break it
(and working workarounds when we find them -> errata 8 & 9)

Keywords: IN_ERRATA8 => FOR_ERRATA8

Morgan Leijström 2023-06-18 18:06:01 CEST

Status comment: (none) => Todo: In errata suggest workarounds

Comment 12 Morgan Leijström 2023-06-19 11:49:21 CEST
Testing with new *nvidia R535* now, back on my workstation, GTX750, 
kernel 6.3.8-desktop-1, Plasma.
No change. As with earlier versions, it is a bit random. A few details:

*Sometimes* when switching to a tty, the correct login prompt is shown for a couple of seconds, and then the screen goes black again. There is a signal, because the monitor does not show the lost-signal popup like it does when suspending.

But most often it never shows text, just black.

It seems to work better after suspend-resume; I could now log in and issue one command before it went black.

When switching back to the desktop (Ctrl-Alt-F1) works, there is a brief popup saying desktop effects needed to restart because of a graphics reset. *)
Sometimes it just hangs with a non-movable mouse pointer on a black screen, and it does not even respond to REISUB; I have to cut power.

Pity that it hangs hard, with nothing in the log before the hang.

*) For the graphics reset:
plasmashell[232431]: Crash Annotation GraphicsCriticalError: |[0][GFX1-]: GFX: RenderThread detected a device reset in PostUpdate (t=58099) [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate


Setting back to IN_ERRATA; the note is there, linking here.

If we have good workarounds, please note them in Errata 8 & 9.

Keywords: FOR_ERRATA8 => IN_ERRATA8

Comment 13 Morgan Leijström 2023-06-19 14:06:01 CEST
FOUND A WORKAROUND

I noticed that our installer had set vga=797, which according to the table at
 https://en.wikipedia.org/wiki/VESA_BIOS_Extensions#Linux_video_mode_numbers
is not a fully supported mode.

I changed it to 794, which is what our tool sets when we select
 MCC > boot > Set up boot system > Next > Advanced > Video mode 1280x1024 16bpp

And now it works!

It is also clear that the driver did not understand the previous value, as the text is now smaller than before when using the Nvidia driver (nouveau instead seemed to default to high resolution).

So I guess the problem is that our tool sets a vga= number that is not fully supported.  It happened to work with earlier drivers, but not with the current ones.
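
The change itself amounts to swapping one token on the kernel command line. A minimal Python sketch of that substitution is below; it works on a plain string only and does not touch the real bootloader configuration, which is edited through MCC as described above. The function name `replace_vga` is hypothetical and the sample string is shortened from the command line in comment 6.

```python
def replace_vga(cmdline: str, new_mode: int) -> str:
    """Return cmdline with every vga= token replaced by vga=<new_mode>.

    A command line without a vga= token is returned with its tokens
    unchanged (whitespace is normalised to single spaces).
    """
    tokens = [
        f"vga={new_mode}" if tok.startswith("vga=") else tok
        for tok in cmdline.split()
    ]
    return " ".join(tokens)


# Shortened from the command line shown in comment 6.
old = "ro noiswmd nokmsboot audit=0 vga=797"
new = replace_vga(old, 794)
print(new)  # ro noiswmd nokmsboot audit=0 vga=794
```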
Comment 14 Morgan Leijström 2023-06-19 22:49:35 CEST
Setting severity to critical as the user may lose work, because:

shifting back and forth between tty and desktop may lead to
*  hard system lockup! *

I have added a note about the lockup to the erratas, with a link to the workaround from comment 13, which I entered in https://wiki.mageia.org/en/Setup_the_graphical_server#Known_Nvidia_issues

Severity: major => critical
Status comment: Todo: In errata suggest workarounds => Workaround in wiki, not known if it always works

Comment 15 Thomas Backlund 2023-06-19 22:51:47 CEST
nothing critical about it... 
the only ones that can fix nvidia crap are nvidia, and they don't really care...

Severity: critical => major

Comment 16 Morgan Leijström 2023-06-19 23:03:52 CEST
Users losing work is usually considered critical, right?

Neither that user nor I care who fouled it up.

We can surely set this as wontfix because we probably cannot fix it, but the issue itself definitely is critical.


To adapt to the reality that we actively choose to support nvidia, maybe the installer should be adjusted to reflect this choice of ours, so I opened

Bug 32028 - Kernel command line vga= setting to be reconsidered.
Comment 17 Morgan Leijström 2023-12-14 11:53:03 CET
Getting fixed by kernel 6.5.13

Noting so in Errata 9, leaving Errata 8 untouched.

Depends on: (none) => 32623
Status comment: Workaround in wiki, not known if it always works => (none)
Source RPM: nvidia-current => kernel

Morgan Leijström 2023-12-15 16:59:48 CET

Depends on: (none) => 32628

Comment 18 Morgan Leijström 2023-12-26 00:30:11 CET
Kernels with the fix are now released.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

