Bug 26825

Summary: Kernel hangs as soon as nvidia module touched
Product: Mageia
Reporter: Juan Magallón <waldergeist>
Component: RPM Packages
Assignee: Kernel and Drivers maintainers <kernel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: Normal
CC: ghibomgx, mageia
Version: Cauldron
Target Milestone: ---
Hardware: All
OS: Linux
Whiteboard:
Source RPM: nvidia390-390.132-8.mga8.nonfree.src.rpm
CVE:
Status comment:

Description Juan Magallón 2020-06-19 12:48:54 CEST
The current nvidia390 package hangs the kernel as soon as anything tries to use it.
Booted into text mode with the module loaded, just running 'nvidia-smi' causes the kernel to hang.
It looks like some hunks of this patch:

https://aur.archlinux.org/cgit/aur.git/tree/kernel-5.7.patch?h=nvidia-390xx

are still missing from the Cauldron package, namely all the pgtable-related stuff.

I modified the sources by hand (the patch does not apply cleanly, some parts seem
to be in already), and now the system boots and runs.
Could you please take a look?

TIA
Comment 1 Lewis Smith 2020-06-19 21:41:02 CEST
Thank you for this helpful report.
I am surprised that the M8 pre-release ISO testing has not shown this; perhaps no-one uses this exact driver.
Can you post just the "VGA compatible controller" section of:
 $ lspci -v
output, to provide some context.
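
One possible way to pull out just that section is sketched below. It assumes lspci (from pciutils) separates devices with blank lines in its -v output, which lets awk's paragraph mode treat each device as one record. The function name vga_section is only illustrative.

```shell
# Print only the lspci stanzas that describe a VGA compatible controller.
# Setting RS to the empty string puts awk in paragraph mode, so each
# blank-line-separated device block is matched and printed as a whole.
vga_section() {
    awk -v RS= '/VGA compatible controller/'
}

# Typical use (assumes lspci is installed):
#   lspci -v | vga_section
```

The filter reads standard input, so it can also be applied to a saved copy of the lspci output.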

Assigning to the kernel/drivers team.

Assignee: bugsquad => kernel

Comment 2 Martin Whitaker 2020-06-19 23:17:35 CEST
I've added the referenced patch. We had an earlier version, which just fixed the driver build problems.

Please test nvidia390-390.132-9.mga8 when it reaches the mirrors. I've checked that the kernel module builds and loads, but don't have the necessary hardware to test whether it actually works.

CC: (none) => mageia

Comment 3 Juan Magallón 2020-06-20 01:24:01 CEST
It is an old GTX 470:

02:00.0 VGA compatible controller: NVIDIA Corporation GF100 [GeForce GTX 470] (rev a3)

leda:~/bin# nvidia-smi
Sat Jun 20 01:23:04 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.132                Driver Version: 390.132                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 470     Off  | 00000000:02:00.0 N/A |                  N/A |
| 40%   64C   P12    N/A /  N/A |     41MiB /  1218MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

I can confirm that this build works fine.
It even works in GLVND mode after removing all the redundant stuff
and the 32-bit binaries (??):

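# Driver library directory; the commented-out line is the alternative location.
#D=/usr/lib64/nvidia-current
D=/usr/lib64/nvidia390

# Remove the redundant libraries bundled in the driver directory.
rm -f "$D"/libGLdispatch.so.0
rm -f "$D"/libGL.* "$D"/libOpenGL.*
rm -f "$D"/libEGL.*
rm -f "$D"/libGLESv1_CM.* "$D"/libGLESv2.*
rm -f "$D"/libOpenCL.*

# Remove the 32-bit libraries.
rm -rf /usr/lib/nvidia390

# Rebuild the linker cache, ignoring the auxiliary cache file.
ldconfig -i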
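
Before deleting anything, it may be worth seeing which files the patterns above would match. The following is a minimal dry-run sketch, not part of the packaging. The function name list_bundled_libs is hypothetical, and the patterns simply mirror the rm commands above.

```shell
# Dry run: list the bundled libraries under a driver directory that the
# cleanup patterns would match, without deleting anything.
list_bundled_libs() {
    dir=$1
    for pattern in 'libGLdispatch.so.0' 'libGL.*' 'libOpenGL.*' 'libEGL.*' \
                   'libGLESv1_CM.*' 'libGLESv2.*' 'libOpenCL.*'; do
        # $pattern is left unquoted on purpose so the shell expands the glob.
        for f in "$dir"/$pattern; do
            # An unmatched glob stays literal, so check the file exists.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
    return 0
}

# Typical use:
#   list_bundled_libs /usr/lib64/nvidia390
```

If the listing looks right, the same directory can then be cleaned with the rm commands.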
Comment 4 Martin Whitaker 2020-06-20 09:20:08 CEST
The 32-bit libraries are presumably there to allow 32-bit applications to work.

Thanks for confirming it works.

Resolution: (none) => FIXED
Status: NEW => RESOLVED

Comment 5 Giuseppe Ghibò 2020-06-20 18:26:42 CEST
I completed the 5.6 patchset for the timeval nv testing (backported from newer nvidia drivers), so that it also works with older kernels.

CC: (none) => ghibomgx