Bug 26825 - Kernel hangs as soon as nvidia module touched
Summary: Kernel hangs as soon as nvidia module touched
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
 
Reported: 2020-06-19 12:48 CEST by Juan Magallón
Modified: 2020-06-20 18:26 CEST
Source RPM: nvidia390-390.132-8.mga8.nonfree.src.rpm

Description Juan Magallón 2020-06-19 12:48:54 CEST
The current nvidia390 package hangs the kernel as soon as anything tries to use the driver.
After booting into text mode with the module loaded, simply running 'nvidia-smi' causes the kernel to hang.
It looks like some hunks of this patch:

https://aur.archlinux.org/cgit/aur.git/tree/kernel-5.7.patch?h=nvidia-390xx

are still missing from the Cauldron package, namely all the pgtable-related stuff.

I modified the sources by hand (the patch does not apply cleanly; some parts seem
to be in already), and now the system boots and runs.
Can you please take a look?

TIA
Comment 1 Lewis Smith 2020-06-19 21:41:02 CEST
Thank you for this helpful report.
I am surprised that the M8 pre-release ISO testing has not shown this; perhaps no-one uses this exact driver.
Can you post just the "VGA compatible controller" section of the output of:
 $ lspci -v
to provide some context?
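
If it helps, here is a rough sketch of one way to pull out just that section. It assumes that `lspci -v` separates devices with blank lines, and the helper name `vga_section` is made up for illustration:

```shell
# Print only the stanzas that mention a VGA compatible controller.
# Assumption: devices in `lspci -v` output are separated by blank lines,
# so awk's paragraph mode (RS set to the empty string) treats each
# device as a single record.
vga_section() {
    awk -v RS= '/VGA compatible controller/'
}

# Usage (needs pciutils installed):
#   lspci -v | vga_section
```

Pasting the whole `lspci -v` output would work just as well; the filter only saves some scrolling.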

Assigning to the kernel/drivers team.

Assignee: bugsquad => kernel

Comment 2 Martin Whitaker 2020-06-19 23:17:35 CEST
I've added the referenced patch. We had an earlier version, which just fixed the driver build problems.

Please test nvidia390-390.132-9.mga8 when it reaches the mirrors. I've checked that the kernel module builds and loads, but don't have the necessary hardware to test whether it actually works.

CC: (none) => mageia

Comment 3 Juan Magallón 2020-06-20 01:24:01 CEST
It is an old GTX 470:

02:00.0 VGA compatible controller: NVIDIA Corporation GF100 [GeForce GTX 470] (rev a3)

leda:~/bin# nvidia-smi
Sat Jun 20 01:23:04 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.132                Driver Version: 390.132                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 470     Off  | 00000000:02:00.0 N/A |                  N/A |
| 40%   64C   P12    N/A /  N/A |     41MiB /  1218MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

I can confirm that this build works fine.
It even works in GLVND mode after removing all the redundant stuff
and the 32-bit binaries (??):

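# Driver library directory (alternative location left commented out)
#D=/usr/lib64/nvidia-current
D=/usr/lib64/nvidia390

# Remove the redundant libraries from the driver directory
rm -f "$D"/libGLdispatch.so.0
rm -f "$D"/libGL.* "$D"/libOpenGL.*
rm -f "$D"/libEGL.*
rm -f "$D"/libGLESv1_CM.* "$D"/libGLESv2.*
rm -f "$D"/libOpenCL.*

# Remove the 32-bit binaries
rm -rf /usr/lib/nvidia390

# Rebuild the linker cache, ignoring the auxiliary cache file
ldconfig -i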
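
For anyone wanting to check first what would go, a minimal sketch of a dry run follows. It only lists the files matched by the same patterns as the rm commands above and deletes nothing; the function name `list_redundant_libs` is made up for illustration, and the directory is passed as an argument:

```shell
# List the files the cleanup would remove from a driver library
# directory, without deleting anything. The patterns are copied from
# the rm commands above; the function name is hypothetical.
list_redundant_libs() {
    dir=$1
    for pattern in 'libGLdispatch.so.0' 'libGL.*' 'libOpenGL.*' \
                   'libEGL.*' 'libGLESv1_CM.*' 'libGLESv2.*' 'libOpenCL.*'; do
        # $pattern is left unquoted on purpose so the shell expands the glob.
        for f in "$dir"/$pattern; do
            # An unmatched glob stays literal, so print only real files/symlinks.
            if [ -e "$f" ] || [ -L "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
    return 0
}

# Usage:
#   list_redundant_libs /usr/lib64/nvidia390
```

Listing first makes it easy to see whether anything unexpected matches before running the real removal.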
Comment 4 Martin Whitaker 2020-06-20 09:20:08 CEST
The 32-bit libraries are presumably there to allow 32-bit applications to work.

Thanks for confirming it works.

Resolution: (none) => FIXED
Status: NEW => RESOLVED

Comment 5 Giuseppe Ghibò 2020-06-20 18:26:42 CEST
I completed the 5.6 patchset for the timeval nv testing (backported from newer nvidia drivers), so that it also works with older kernels.

CC: (none) => ghibomgx

