Bug 23477

Summary: in the package x11-driver-video-nvidia-current-390.77-1.1.mga6.nonfree files /usr/bin/nvidia* are missing
Product: Mageia Reporter: peter lawford <petlaw726>
Component: RPM Packages Assignee: Mageia Bug Squad <bugsquad>
Status: RESOLVED INVALID QA Contact:
Severity: major    
Priority: Normal CC: tmb
Version: 6   
Target Milestone: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Source RPM: x11-driver-video-nvidia-current-390.77-1.1.mga6.nonfree CVE:
Status comment:
Attachments: return of ls -al /usr/lib64/nvidia-current/

Description peter lawford 2018-08-22 16:23:46 CEST
Description of problem:
the package x11-driver-video-nvidia-current-390.77-1.1.mga6.nonfree should contain the following files:
 [root@magaux alain4]# rpm -ql x11-driver-video-nvidia-current-390.77-1.1.mga6.nonfree |grep /usr/bin/nvidia
/usr/bin/nvidia-bug-report.sh
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-modprobe
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-settings
/usr/bin/nvidia-smi
/usr/bin/nvidia-xconfig

but all these files are missing:
[root@magaux alain4]# ls -al /usr/bin/ |grep nvidia
[root@magaux alain4]#



Comment 1 Thomas Backlund 2018-08-22 16:29:44 CEST
They are there with the alternatives system

$ ll /usr/bin/nvidia-smi
lrwxrwxrwx 1 root root 28 aug 21 23:40 /usr/bin/nvidia-smi -> /etc/alternatives/nvidia_smi*

$ ll /etc/alternatives/nvidia_smi
lrwxrwxrwx 1 root root 40 aug 21 23:40 /etc/alternatives/nvidia_smi -> /usr/lib64/nvidia-current/bin/nvidia-smi*

If they are not there, something has screwed up your install
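
A minimal sketch for walking such a link chain by hand, assuming a POSIX shell with readlink available. The function name trace_link is made up for illustration and is not part of any Mageia tool:

```shell
# Print every hop of a symlink chain, one path per line, ending with the
# final target. Returns non-zero if the chain ends at a path that does not exist.
trace_link() (
    path=$1
    hops=0
    while [ -L "$path" ]; do
        printf '%s\n' "$path"
        hops=$((hops + 1))
        if [ "$hops" -gt 16 ]; then
            printf 'too many hops, possible loop at: %s\n' "$path" >&2
            return 1
        fi
        target=$(readlink "$path")
        case $target in
            /*) path=$target ;;                      # absolute target
            *)  path=$(dirname "$path")/$target ;;   # relative to the link's directory
        esac
    done
    if [ -e "$path" ]; then
        printf '%s\n' "$path"
    else
        printf 'missing: %s\n' "$path" >&2
        return 1
    fi
)
```

On a healthy install, trace_link /usr/bin/nvidia-smi would be expected to print the three paths shown above; a "missing:" line shows where the chain breaks.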

Resolution: (none) => INVALID
CC: (none) => tmb
Status: NEW => RESOLVED

Comment 2 peter lawford 2018-08-22 17:08:38 CEST
(In reply to Thomas Backlund from comment #1)
> They are there with the alternatives system
> 
> $ ll /usr/bin/nvidia-smi
> lrwxrwxrwx 1 root root 28 aug 21 23:40 /usr/bin/nvidia-smi ->
> /etc/alternatives/nvidia_smi*
> 
> $ ll /etc/alternatives/nvidia_smi
> lrwxrwxrwx 1 root root 40 aug 21 23:40 /etc/alternatives/nvidia_smi ->
> /usr/lib64/nvidia-current/bin/nvidia-smi*
> 
> If they are not there, something has screwed up your install

The nvidia-* files are present in /usr/lib64/nvidia-current/bin/, but the links in /etc/alternatives and /usr/bin are missing, and I don't know why.
It is always possible to create them manually.
I think they were lost during the upgrade from Mageia 5 to Mageia 6, one year ago.
I have two old Mageia 5 systems; it may be worth upgrading at least one of them to mga6.
Furthermore, I have just noticed that many targets of the links from /etc/alternatives into /usr/<bin,sbin,share,lib> are missing from those directories.
I don't know how to recover from that.
If you have an idea, I would be grateful for it.
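
One way to get an overview of the damage is to list the links in a directory whose targets no longer exist. This is only a sketch, assuming a POSIX shell; list_dangling is a hypothetical helper name:

```shell
# List symlinks directly inside a directory whose target does not exist,
# printed as "link -> target".
list_dangling() (
    dir=$1
    for entry in "$dir"/*; do
        # -L is true for any symlink; -e follows the link and fails if it dangles
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
        fi
    done
)
```

For example, list_dangling /etc/alternatives would show the entries there that point to files that are gone. It only reports; it does not repair anything.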
Comment 3 peter lawford 2018-08-22 17:20:40 CEST
(In reply to Thomas Backlund from comment #1)
> They are there with the alternatives system
> 
> $ ll /usr/bin/nvidia-smi
> lrwxrwxrwx 1 root root 28 aug 21 23:40 /usr/bin/nvidia-smi ->
> /etc/alternatives/nvidia_smi*
> 
> $ ll /etc/alternatives/nvidia_smi
> lrwxrwxrwx 1 root root 40 aug 21 23:40 /etc/alternatives/nvidia_smi ->
> /usr/lib64/nvidia-current/bin/nvidia-smi*
> 
> If they are not there, something has screwed up your install

They are missing from /etc/alternatives on my old Mageia 5 too!
The best option is to reinstall Mageia 6 from scratch; I am impatiently waiting for the release of Mageia 6.1.
Comment 4 peter lawford 2018-08-22 19:16:50 CEST
Sorry, but I still maintain that the package is corrupted: I have manually created the links /usr/bin/nvidia* -> /etc/alternatives/nvidia$ -> /usr/lib64/nvidia-current/bin/nvidia*

The output of the nvidia-smi command is now:
[root@magaux alain4]# nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

The same happens even if I run the target file directly:
[root@magaux alain4]# cd /usr/lib64/nvidia-current/bin/
[root@magaux bin]# ls -al
total 1404
drwxr-xr-x 2 root root   4096 août  22 16:26 ./
drwxr-xr-x 6 root root   4096 août  22 16:26 ../
-rwxr-xr-x 1 root root  26875 juil. 17 12:44 nvidia-bug-report.sh*
-rwxr-xr-x 1 root root  67896 juil. 17 12:44 nvidia-cuda-mps-control*
-rwxr-xr-x 1 root root  47072 juil. 17 12:44 nvidia-cuda-mps-server*
-rwxr-xr-x 1 root root 227824 juil. 17 12:44 nvidia-debugdump*
-rwsr-xr-x 1 root root  31280 juil. 17 12:44 nvidia-modprobe*
-rwxr-xr-x 1 root root  43784 juil. 17 12:44 nvidia-persistenced*
-rwxr-xr-x 1 root root 277888 juil. 17 12:44 nvidia-settings*
-rwxr-xr-x 1 root root 514080 juil. 17 12:44 nvidia-smi*
-rwxr-xr-x 1 root root 178856 juil. 17 12:44 nvidia-xconfig*
[root@magaux bin]# ./nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.
This happens although the file libnvidia-ml.so exists: see the attached output of ls -al /usr/lib64/nvidia-current/
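
Despite the wording of the error message, shared libraries are located through the dynamic linker configuration rather than PATH, so a file can exist on disk and still not be found. A rough check, assuming the library directory is meant to be listed in a file under /etc/ld.so.conf.d (which file that is on Mageia is not confirmed here; dir_in_ld_conf is a hypothetical helper):

```shell
# Print the first *.conf file in a linker config directory that lists the
# given library directory on a line of its own. Returns non-zero if none does.
dir_in_ld_conf() (
    libdir=$1
    confdir=${2:-/etc/ld.so.conf.d}
    for conf in "$confdir"/*.conf; do
        # Skip unreadable files, including config symlinks that dangle
        [ -r "$conf" ] || continue
        if grep -qxF "$libdir" "$conf"; then
            printf '%s\n' "$conf"
            return 0
        fi
    done
    return 1
)
```

For example, dir_in_ld_conf /usr/lib64/nvidia-current prints nothing and fails if no config file names that directory. ldconfig -p | grep libnvidia-ml shows whether the linker cache knows the library at all.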
Comment 5 peter lawford 2018-08-22 19:19:24 CEST
Created attachment 10332 [details]
return of ls -al /usr/lib64/nvidia-current/
Comment 6 Thomas Backlund 2018-08-22 22:25:09 CEST
(In reply to peter lawford from comment #4)
> sorry, but I persist to claim that the package is corrupted: I have manually
> created the links /usr/bin/nvidia* -> /etc/alternatives/nvidia$ ->
> /usr/lib64/nvidia-current/bin/nvidia*
>


The package is ok, I know... I'm using it :)

It's just that more stuff is broken on your system... the GL path and other things need to be set up too...


The easiest way for you to sort this out is to use mcc to configure your hw to use the nouveau (or vesa) driver, reboot, and then remove all nvidia-current packages.

After that, reconfigure your hw with mcc again to use the nvidia driver, and it should fix it all up for you; then reboot to use it.
Comment 7 peter lawford 2018-09-07 14:35:12 CEST
(In reply to Thomas Backlund from comment #6)
> (In reply to peter lawford from comment #4)
> > sorry, but I persist to claim that the package is corrupted: I have manually
> > created the links /usr/bin/nvidia* -> /etc/alternatives/nvidia$ ->
> > /usr/lib64/nvidia-current/bin/nvidia*
> >
> 
> 
> The package is ok, I know... I'm using it :)
> 
> its just more stuff broken for you... GL path and other stuff need to be set
> up too...
> 
> 
> The easiest way for you to sort this out is to use mcc to configure your hw
> to use the nouveau (or vesa driver), reboot and then remove all
> nvidia-current packages.
> 
> after that, reconfigure your your hw with mcc again to use the nvidia
> driver, and it should fix it all up for you, then reboot to use it

Sorry I took so long to reply to your comment.
I think there are inconsistencies in the graphical server of Mageia, and have been for a long time (since Mageia 1 and even in the Mandriva era).
My graphics card (an old one) is an NVIDIA GeForce 9800 GTX+, which was released in 2008:

[root@magaux alain4]# lspci |grep NVIDIA
03:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 9800 GTX / 9800 GTX+] (rev a2)

The official driver for this GPU is nvidia340.
I scrupulously followed your instructions: I went to mcc > hw > gs (graphical server), clicked on "nouveau" and "OK", and rebooted.
After the system rebooted I removed the packages:

dkms-nvidia-current-390.87-1.mga6.nonfree
x11-driver-video-nvidia-current-390.87-1.mga6.nonfree
nvidia-current-doc-html-390.87-1.mga6.nonfree

Then I went back to mcc > hw > gs and clicked on the driver corresponding to my card, namely:
Geforce 8100 to Geforce 415 (this entry was already highlighted, which seems to mean this driver is the right one)
after which the packages:

dkms-nvidia340-340.106-1.mga6.nonfree.x86_64
nvidia340-doc-html-340.106-1.mga6.nonfree.x86_64
x11-driver-video-nvidia340-340.106-1.mga6.nonfree.x86_64

were installed automatically.
When I then attempted to reboot, it was impossible: the splash screen looped indefinitely without reaching runlevel 5.
What is stranger is what dmesg shows (I copy only the relevant lines):

[   13.833750] nvidia-nvlink: Nvlink Core is being initialized, major device number 243
[   13.834227] NVRM: The NVIDIA GeForce 9800 GTX/9800 GTX+ GPU installed in this system is
               NVRM:  supported through the NVIDIA 340.xx Legacy drivers. Please
               NVRM:  visit http://www.nvidia.com/object/unix.html for more
               NVRM:  information.  The 390.87 NVIDIA driver will ignore
               NVRM:  this GPU.  Continuing probe...
[   13.836876] NVRM: No NVIDIA graphics adapter found!
[   13.837522] nvidia-nvlink: Unregistered the Nvlink Core, major device number 243

It says that the 390.87 NVIDIA driver, which was already installed, ignores my GPU, and that the GPU is supported by NVIDIA 340.xx; but the latter, once installed, doesn't work,
whereas the 390.87 driver does.
Where is the error?
Comment 8 Thomas Backlund 2018-09-07 14:46:31 CEST
(In reply to peter lawford from comment #7)

> sorry I was so long to reply to your comment
> I think there are inconsistencies in the graphical server of mageia, and
> that for a long time (from mageia1 and even in mandriva's era
> ma graphic card (an old one) is a nvidia GTX9800+ which is isuued in 2008:
> 
> [root@magaux alain4]# lspci |grep NVIDIA
> 03:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 9800 GTX
> / 9800 GTX+] (rev a2)
> 

Ok, so you have an older GPU...

> the official driver for this GPU is nvidia340
> I scrupulously followed your indications: I was in mcc > hw > gs(graphical
> server) and clicked on "nouveau" an "OK" and I've rebooted
> after the system rebooted I have removed the packages: 
> 
> dkms-nvidia-current-390.87-1.mga6.nonfree
> x11-driver-video-nvidia-current-390.87-1.mga6.nonfree
> nvidia-current-doc-html-390.87-1.mga6.nonfree
> 
> and went again to mcc > hw > gs and clicked on the drivers corresponding to
> my card, say:
> Geforce 8100 to Geforce 415 (this tab was already highlighted, which seems
> to mean this driver is the right one)
> and after the packages:
> 
> dkms-nvidia340-340.106-1.mga6.nonfree.x86_64
> nvidia340-doc-html-340.106-1.mga6.nonfree.x86_64
> x11-driver-video-nvidia340-340.106-1.mga6.nonfree.x86_64
> 
> had been automatically installed


And drakx installed the correct driver for you...

> and when I attempted to reboot, it was impossible: the splash turned in loop
> indefinitely without reaching the runlevel5

Ok, so this might be a bug that some people hit, where "nokmsboot" is not added to the kernel command line, so you will have to do that yourself for now...
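
Whether the running kernel was booted with a given option can be read from /proc/cmdline. A small sketch, assuming a POSIX shell; has_kernel_arg is a made-up helper name:

```shell
# Succeed if the given word appears as a separate argument on the kernel
# command line. The second parameter exists only so a saved copy can be checked.
has_kernel_arg() (
    arg=$1
    cmdline_file=${2:-/proc/cmdline}
    set -f   # no filename expansion while splitting the line into words
    for word in $(cat "$cmdline_file"); do
        [ "$word" = "$arg" ] && return 0
    done
    return 1
)
```

For example, has_kernel_arg nokmsboot && echo present. This only reports what the current boot used; adding the option permanently is a bootloader configuration change and is not covered here.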


> what is more strange is what is written in dmesg: (I copy only the relevant
> lines):
> 
> [   13.833750] nvidia-nvlink: Nvlink Core is being initialized, major device
> number 243
> [   13.834227] NVRM: The NVIDIA GeForce 9800 GTX/9800 GTX+ GPU installed in
> this system is
>                NVRM:  supported through the NVIDIA 340.xx Legacy drivers.
> Please
>                NVRM:  visit http://www.nvidia.com/object/unix.html for more
>                NVRM:  information.  The 390.87 NVIDIA driver will ignore
>                NVRM:  this GPU.  Continuing probe...
> [   13.836876] NVRM: No NVIDIA graphics adapter found!
> [   13.837522] nvidia-nvlink: Unregistered the Nvlink Core, major device
> number 243
> 
> it says that my already installed 390.87 NVIDIA driver ignore my GPU, which
> would be supported by NVIDIA 340.xx, but the latter, if installed doesn't
> work
> whereas the 390.87 NVIDIA does.
> where is the error?

Are you sure you uninstalled all nvidia-current packages?

What's the output of:

rpm -qa |grep -i nvidia
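
To spot leftovers quickly, the package list can be reduced to the driver families it contains. This is a sketch only, assuming package names follow the patterns seen in this report (nvidia-current, nvidia340); nvidia_families is a hypothetical helper and relies on grep supporting -o:

```shell
# Read package names on stdin and print each distinct NVIDIA driver family
# found in them, one per line.
nvidia_families() {
    # LC_ALL=C keeps the sort order independent of the user's locale
    grep -oE 'nvidia(-current|[0-9]+)' | LC_ALL=C sort -u
}
```

For example, rpm -qa | nvidia_families. More than one line of output would mean packages from two driver families are installed side by side.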
Comment 9 peter lawford 2018-09-09 15:28:54 CEST
(In reply to Thomas Backlund from comment #8)
> (In reply to peter lawford from comment #7)
> 
> > sorry I was so long to reply to your comment
> > I think there are inconsistencies in the graphical server of mageia, and
> > that for a long time (from mageia1 and even in mandriva's era
> > ma graphic card (an old one) is a nvidia GTX9800+ which is isuued in 2008:
> > 
> > [root@magaux alain4]# lspci |grep NVIDIA
> > 03:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 9800 GTX
> > / 9800 GTX+] (rev a2)
> > 
> 
> Ok, so you have an older GPU...
> 
> > the official driver for this GPU is nvidia340
> > I scrupulously followed your indications: I was in mcc > hw > gs(graphical
> > server) and clicked on "nouveau" an "OK" and I've rebooted
> > after the system rebooted I have removed the packages: 
> > 
> > dkms-nvidia-current-390.87-1.mga6.nonfree
> > x11-driver-video-nvidia-current-390.87-1.mga6.nonfree
> > nvidia-current-doc-html-390.87-1.mga6.nonfree
> > 
> > and went again to mcc > hw > gs and clicked on the drivers corresponding to
> > my card, say:
> > Geforce 8100 to Geforce 415 (this tab was already highlighted, which seems
> > to mean this driver is the right one)
> > and after the packages:
> > 
> > dkms-nvidia340-340.106-1.mga6.nonfree.x86_64
> > nvidia340-doc-html-340.106-1.mga6.nonfree.x86_64
> > x11-driver-video-nvidia340-340.106-1.mga6.nonfree.x86_64
> > 
> > had been automatically installed
> 
> 
> And drakx installed the correct driver for you...
> 
> > and when I attempted to reboot, it was impossible: the splash turned in loop
> > indefinitely without reaching the runlevel5
> 
> Ok, so this might be a bug that some hits where the "nokmsboot" is not added
> to kernel command line, so you will have to do that yourself for now...
> 
> 
> > what is more strange is what is written in dmesg: (I copy only the relevant
> > lines):
> > 
> > [   13.833750] nvidia-nvlink: Nvlink Core is being initialized, major device
> > number 243
> > [   13.834227] NVRM: The NVIDIA GeForce 9800 GTX/9800 GTX+ GPU installed in
> > this system is
> >                NVRM:  supported through the NVIDIA 340.xx Legacy drivers.
> > Please
> >                NVRM:  visit http://www.nvidia.com/object/unix.html for more
> >                NVRM:  information.  The 390.87 NVIDIA driver will ignore
> >                NVRM:  this GPU.  Continuing probe...
> > [   13.836876] NVRM: No NVIDIA graphics adapter found!
> > [   13.837522] nvidia-nvlink: Unregistered the Nvlink Core, major device
> > number 243
> > 
> > it says that my already installed 390.87 NVIDIA driver ignore my GPU, which
> > would be supported by NVIDIA 340.xx, but the latter, if installed doesn't
> > work
> > whereas the 390.87 NVIDIA does.
> > where is the error?
> 
> Are you sure you uninstalled all nvidia-current packages ?
> 
> whats the output of:
> 
> rpm -qa |grep -i nvidia

I ran the test on an old Mageia 5: I removed all the *nvidia*current*mga5 packages
and replaced them with the *nvidia340*mga5 packages using mcc -> hw -> gs.
With the kernel option nokmsboot: it could reboot.
Without nokmsboot: it couldn't reboot.