Bug 22862 - CUDA does not work without x11-driver, please move some files from x11-driver-video-nvidia-current to nvidia-current-cuda-opencl
Summary: CUDA does not work without x11-driver, please move some files from x11-driver...
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal normal
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-02 01:52 CEST by Juan Magallon
Modified: 2019-05-27 16:33 CEST
CC List: 5 users

See Also:
Source RPM: nvidia-current-390.48-1.mga7.nonfree.src.rpm
CVE:
Status comment:



Description Juan Magallon 2018-04-02 01:52:42 CEST
I have noticed that, even though the CUDA packages install without requiring the X11 nvidia driver packages, CUDA programs compile but fail to run, with an error like:

CUDA driver version is insufficient for CUDA runtime version

This is because CUDA uses a couple of libraries that currently ship with the X11 driver package, namely:

libnvidia-fatbinaryloader.so.390.48
libnvidia-ptxjitcompiler.so.390.48

They should be moved from x11-driver-video-nvidia-current to nvidia-current-cuda-opencl. I have verified that X11 runs fine without these libraries on the library path.
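A quick way to tell whether a given box is affected is to check which of these libraries the dynamic loader can resolve. The sketch below is only an illustration: the function name check_cuda_libs is made up here, and it assumes the nvidia library directory is registered with ldconfig, so that `ldconfig -p` would list the libraries when they are installed.

```shell
#!/bin/sh
# check_cuda_libs (hypothetical helper): read `ldconfig -p` style output on
# stdin and report which of the libraries CUDA needs are visible to the loader.
# Returns 0 if all are found, 1 if at least one is missing.
check_cuda_libs() {
    cache=$(cat)
    missing=0
    for lib in libcuda.so libnvidia-fatbinaryloader.so libnvidia-ptxjitcompiler.so; do
        # -F: treat the library name as a fixed string, not a regex
        if printf '%s\n' "$cache" | grep -qF "$lib"; then
            echo "found:   $lib"
        else
            echo "missing: $lib"
            missing=1
        fi
    done
    return $missing
}
```

Usage would be `ldconfig -p | check_cuda_libs`; reading from stdin keeps the check easy to try against saved output from another machine.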

Also, a couple of files in nvidia-current-devel are useless:
libcuda.so, libnvidia-ptxjitcompiler.so
as they are not intended to be linked against directly, but are loaded by the driver.

TIA
Comment 1 Marja Van Waes 2018-04-03 12:29:33 CEST
Assigning to the kernel & drivers maintainer group, CC'ing the registered maintainer

CC: (none) => anssi.hannula, marja11
Summary: CUDA does not work without x11-driver => CUDA does not work without x11-driver, please move some files from x11-driver-video-nvidia-current to nvidia-current-cuda-opencl
Assignee: bugsquad => kernel

Jérôme Hénin 2018-08-02 13:38:24 CEST

CC: (none) => heninj

Comment 2 Juan Magallon 2018-08-02 17:19:44 CEST
FWIW, nowadays I use this script to copy the missing files from a box that
uses an nvidia card as primary display (so it has the x11 driver installed)
to a box that uses intel as primary display and the nvidia devices just
as CUDA co-processors:

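# Run from the directory that holds libcuda.so.<version>.
# Driver version taken from the libcuda file name, e.g. 390.48.
VER=$(/bin/ls libcuda.so.*.* | cut -d. -f3-4)

echo "$VER"

# Replace any stale copies with the ones from belly, the box with the x11 driver.
rm -f libnvidia-ptxjitcompiler.* libnvidia-fatbinaryloader.* libnvidia-ml.*
scp "belly:$PWD/libnvidia-ptxjitcompiler.so.${VER}" .
scp "belly:$PWD/libnvidia-fatbinaryloader.so.${VER}" .
scp "belly:$PWD/libnvidia-ml.so.${VER}" .
scp belly:/usr/bin/nvidia-smi /usr/bin

# Refresh the dynamic linker cache so the new libraries are picked up.
ldconfig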

So I think these are the minimal files that should be moved. Probably
also nvidia-cuda-mps-control, nvidia-cuda-mps-server and
nvidia-persistenced (currently in /usr/lib64/nvidia-current/bin)
and /usr/share/doc/NVIDIA_GLX-1.0/sample/nvidia-persistenced-init.tar.bz2.
All of this is for use when the nvidia card is not running X11.
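One caveat on the version extraction in the script: `cut -d. -f3-4` keeps exactly two version components, which is right for 390.48 but would truncate a three-component driver version such as 440.33.01. A minimal sketch of an alternative that strips the file name prefix instead (the function name driver_version is hypothetical, not part of any package):

```shell
#!/bin/sh
# driver_version (hypothetical helper): print the driver version embedded in a
# libcuda file name, whatever the number of version components.
driver_version() {
    name=${1##*/}                         # drop any leading directory
    printf '%s\n' "${name#libcuda.so.}"   # drop the "libcuda.so." prefix
}
```

In the script this would replace the `cut` pipeline with something like `VER=$(driver_version libcuda.so.*.*)`, assuming a single matching file, as the original line also does.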

CC: (none) => jamagallon

Marja Van Waes 2018-08-03 16:40:18 CEST

See Also: (none) => https://bugs.mageia.org/show_bug.cgi?id=23386

Comment 3 Thomas Backlund 2019-05-26 17:58:06 CEST
Please remove any nvidia packages and manual customizations you have done.

Then install nvidia-current-cuda-opencl-430.14-3.mga7 from nonfree updates_testing (currently building).

It will pull in nvidia-current-utils, which now carries nvidia-smi and nvidia-persistenced.

Does it work OOB without needing any manual configuration?

CC: (none) => tmb

Comment 4 Juan Magallon 2019-05-26 20:13:53 CEST
Hi..

It pulls correct dependencies:
annwn:~# urpmi nvidia-current-cuda-opencl
To satisfy dependencies, the following packages are going to be installed:
  Package                        Version      Release       Arch    
(medium "Nonfree Updates Testing")
  dkms-nvidia-current            430.14       3.mga7.nonfr> x86_64  
  nvidia-current-cuda-opencl     430.14       3.mga7.nonfr> x86_64  
  nvidia-current-utils           430.14       3.mga7.nonfr> x86_64  

but:
       1/3: nvidia-current-utils  #############################################
ln: failed to create symbolic link '/home/iurt/rpmbuild/BUILDROOT/nvidia-current-430.14-3.mga7.nonfree.x86_64/usr/bin/nvidia-smi': No such file or director
ln: failed to create symbolic link '/home/iurt/rpmbuild/BUILDROOT/nvidia-current-430.14-3.mga7.nonfree.x86_64/usr/bin/nvidia-persistenced': No such file or
warning: %post(nvidia-current-utils-430.14-3.mga7.nonfree.x86_64) scriptlet failed, exit status 1
ERROR: 'script' failed for nvidia-current-utils-430.14-3.mga7.nonfree.x86_64

...

      3/3: nvidia-current-cuda-opencl
                                 #############################################
/var/tmp/rpm-tmp.NCAski: line 3: /home/iurt/rpmbuild/BUILDROOT/nvidia-current-430.14-3.mga7.nonfree.x86_64/etc/ld.so.conf.d/nvidia-current-cuda.conf: No su
/var/tmp/rpm-tmp.NCAski: line 4: /home/iurt/rpmbuild/BUILDROOT/nvidia-current-430.14-3.mga7.nonfree.x86_64/etc/ld.so.conf.d/nvidia-current-cuda.conf: No su

Looking into the package:
erewolf:~> rpm -q --scripts nvidia-current-utils
postinstall scriptlet (using /bin/sh):
# add symlinks only if x11-driver-video-nvidia-current is not installed
if [ ! -f /etc/nvidia-current/ld.so.conf ]; then
  ln -sf /usr/lib64/{drivername}/bin/nvidia-smi /home/iurt/rpmbuild/BUILDROOT/nvidia-current-430.14-3.mga7.nonfree.x86_64/usr/bin/nvidia-smi
  ln -sf /usr/lib64/{drivername}/bin/nvidia-persistenced /home/iurt/rpmbuild/BUILDROOT/nvidia-current-430.14-3.mga7.nonfree.x86_64/usr/bin/nvidia-persistenced
fi
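
The two problems visible in that scriptlet (the build root path baked into the link target, and the unexpanded {drivername}) are easy to look for mechanically. A rough sketch, assuming the scriptlet text is piped in on stdin; the function name scan_scriptlets is made up for illustration:

```shell
#!/bin/sh
# scan_scriptlets (hypothetical helper): read scriptlet text on stdin, print
# any line that still mentions the build root or an unexpanded {drivername},
# and return 1 if such a line was found, 0 otherwise.
scan_scriptlets() {
    if grep -n -E 'rpmbuild/BUILDROOT|\{drivername\}'; then
        return 1
    fi
    return 0
}
```

For example: `rpm -q --scripts nvidia-current-utils | scan_scriptlets`.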
Comment 5 Thomas Backlund 2019-05-26 21:37:37 CEST
Gah, copy/paste errors in the %post scripts.

Should be fixed in nvidia-current-430.14-4.mga7, currently building.
Comment 6 Juan Magallon 2019-05-27 16:33:22 CEST
Right, working fine now.

Resolution: (none) => FIXED
Status: NEW => RESOLVED

