Bug 14462 - Mageia nvidia do not work with CUDA or OpenCL but locally built with mageia script it works.
Status: RESOLVED DUPLICATE of bug 15328
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: Mageia Bug Squad
Whiteboard: MGA5TOO
Reported: 2014-11-05 10:37 CET by Morgan Leijström
Modified: 2016-10-07 14:20 CEST
Source RPM: nvidia-current-kernel-3.18.1-desktop-4.mga5, x86_64, Version: 340.65-8.mga5.nonfree

Description Morgan Leijström 2014-11-05 10:37:30 CET
Spawned from https://bugs.mageia.org/show_bug.cgi?id=12129#c37

Description of problem:
§ Make sure we have the right packages
§ Document in the wiki how to set it up, and provide useful links to the BOINC site

-for both Nvidia and AMD GPUs

I am available to test on my workstation, where I already run BOINC:
Mageia 5, Intel i7, Nvidia GK104 [GeForce GTX 760]

Comment 1 Peter Wallace 2015-01-10 13:35:02 CET
To add to this: I also have Mageia 5 installed.
After some hunting I discovered that BOINC only detects my AMD GPU after I run xhost local:boinc as root and then restart the service.

I also tried the X0.hosts trick, but that failed.

CC: (none) => worzel910

Comment 2 Morgan Leijström 2015-01-10 22:54:39 CET
Thanks for the hints Peter.
So maybe we only need to configure it correctly.
Where did you put the commands?
Comment 3 Morgan Leijström 2015-01-11 00:50:30 CET
Shooting in the dark:
I have a GeForce GTX 760 using the Nvidia proprietary driver "NVIDIA GeForce 400 series and later" according to the MCC hardware module.
I have installed
   nvidia-current-cuda-opencl - CUDA and OpenCL libraries for NVIDIA proprietary driver
and
   nvidia-cuda-toolkit - NVIDIA CUDA runtime libraries

Tried
[root@svarten morgan]# systemctl stop boinc-client.service
[root@svarten morgan]# xhost local:boinc
non-network local connections being added to access control list
[root@svarten morgan]# systemctl start boinc-client.service

still no suitable GPU found.

[root@svarten morgan]# systemctl stop boinc-client.service
[root@svarten morgan]# gpasswd -a boinc video
Adding user boinc to group video
[root@svarten morgan]# systemctl start boinc-client.service

still no suitable GPU found.

Also tried installing lib64opencl1 - OpenCL ICL Loader (which required uninstalling nvidia-cuda-toolkit): no change.


This thread http://www.gpugrid.net/forum_thread.php?id=3734 mentions that libcuda.so needs to exist. I see that the installed package nvidia-current-cuda-opencl contains /usr/lib64/nvidia-current/libcuda.so.1 (and the file exists on the system). Can the added ".1" confuse BOINC?


There is a recent thread about nvidia on Ubuntu and Fedora 20,
https://devtalk.nvidia.com/default/topic/734098/linux/-boinc-ubuntu-nvidia-no-usable-gpus-found-/post/4357750/#4357750 which talks about nvidia-modprobe. According to drakrpm, the installed package x11-driver-video-nvidia-current is supposed to contain /usr/lib64/nvidia-current/bin/nvidia-modprobe, but it does not ...?
Comment 4 Morgan Leijström 2015-01-11 00:58:08 CET
Tried without success:
# cd /usr/lib64/nvidia-current
# ln -s libcuda.so.1 libcuda.so
Comment 5 Peter Wallace 2015-01-11 01:46:24 CET
The first thing is to stop the service as root with:

systemctl stop boinc-client.service

Then do

xhost local:boinc

Then restart the service with

systemctl start boinc-client.service



Go to the BOINC manager via the menu and look through the logs.


I did pretty much the same as you, Morgan, and thought it was a missing lib or similar. It is because the boinc user is not allowed access to the driver, hence the xhost trick.

The downside is that it needs to be done after every reboot/power cycle.
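
Since the steps above have to be repeated after each boot, they can be wrapped in one small script. This is only a sketch, not a tested fix: it assumes the service is named boinc-client.service as in the session in comment 3, and the names run, regrant_gpu_access and DRY_RUN are made up for illustration. It has to be run as root from a terminal inside the running X session so that xhost can reach the display.

```shell
#!/bin/sh
# Sketch: repeat the xhost workaround from this thread in one step.
# SERVICE and GRANT default to the values used in the comments above.
SERVICE=${SERVICE:-boinc-client.service}
GRANT=${GRANT:-local:boinc}

run() {
    # With DRY_RUN=1 only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

regrant_gpu_access() {
    # Stop the client, open local X access, start the client again.
    # Each step runs only if the previous one succeeded.
    run systemctl stop "$SERVICE" &&
    run xhost "$GRANT" &&
    run systemctl start "$SERVICE"
}
```

Setting DRY_RUN=1 before calling regrant_gpu_access only prints the three commands, which is a safe way to check what would be run.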
Comment 6 Richard Houser 2015-01-11 06:38:45 CET
I've been having similar issues trying to get CUDA/OpenCL working on my GeForce 770M.  I wonder if it's the same underlying cause at a system configuration level, Morgan.  In my case, I'm on an Optimus architecture, so that sometimes throws a wrench in things.

If it's not too much of an inconvenience, could you guys try starting Blender, then go to File->User Preferences->System and see if you have a GPU option under the compute device selection at the bottom left?

CC: (none) => rick

Comment 7 Morgan Leijström 2015-01-11 13:54:06 CET
@Peter: That did not work for me.

@Richard: No GPU.
(I suppose it would show in the now blue field saying "None", above the grey list field saying "CPU".)

OK, thanks.
So other applications do not find the GPU either. That probably makes it easier to track down the cause?
I have no idea though.
Comment 8 Peter Wallace 2015-01-11 16:26:45 CET
@Morgan: Not a clue then. I am running an AMD GPU so I can't investigate, though I could look on my lad's machine.


@Rick

Blender does not show a GPU for me either.

Luxmark works fine with and without the xhost trick.

BOINC works with the fix above, but Blender does not.
Comment 9 Peter Wallace 2015-01-11 16:37:46 CET
It does work in Blender! I needed to install python-opencl.

So yes, it works, Rick.
Comment 10 Morgan Leijström 2015-01-11 17:19:08 CET
Changed temporarily to nouveau, then uninstalled and reinstalled the nvidia driver,
and installed nvidia-current-cuda-opencl and nvidia-cuda-toolkit.
# xhost local:boinc
# gpasswd -a boinc video
-> still no usable GPU in BOINC

Installed python-opencl
-> still no GPU in Blender.

Who is the expert on GPU processing in Mageia, especially Nvidia?

# nvidia-smi
Sun Jan 11 17:09:11 2015       
+------------------------------------------------------+                       
| NVIDIA-SMI 340.65     Driver Version: 340.65         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 760     Off  | 0000:01:00.0     N/A |                  N/A |
| 29%   33C    P8    N/A /  N/A |    529MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+
Comment 11 Morgan Leijström 2015-01-11 22:06:45 CET
I do not know what to look for in the logs, but here are some entries:

# journalctl -ab | grep -i -e cuda -e opencl
# journalctl -ab | grep nvidia
jan 11 16:46:11 svarten kernel: nvidia: module license 'NVIDIA' taints kernel.
jan 11 16:46:11 svarten kernel: [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
jan 11 16:46:21 svarten kernel: nvidia 0000:01:00.0: irq 52 for MSI/MSI-X
Comment 12 Morgan Leijström 2015-01-11 22:40:15 CET
Reading the link below, it seems that some kernel APIs may get blocked when the kernel is "tainted".
http://stackoverflow.com/questions/16809378/getting-message-module-license-unspecified-taints-kernel-despite-setting-mo

So maybe the cause is a missing license file (however that works)?
I therefore set this bug against the rpm "nvidia-current-kernel-3.18.1-desktop-4.mga5, x86_64, Version: 340.65-8.mga5.nonfree" (I did not figure out exactly what to put there...)

Let's not focus specifically on BOINC (which I originally entered this bug for) until *any* program can use the Nvidia GPU.
(I also tried darktable now and found that OpenCL is greyed out in its settings.)

Summary: Make sure BOINC can use GPU for number crunching => Fail to get CUDA or OpenCL working (because nvidia taint kernel?)
Source RPM: (none) => nvidia-current-kernel-3.18.1-desktop-4.mga5, x86_64, Version: 340.65-8.mga5.nonfree

Comment 13 Morgan Leijström 2015-01-11 22:45:27 CET
According to this forum post, OpenCL started working for that user on mga5 this September with nvidia 340.58, but not before.
https://forums.mageia.org/en/viewtopic.php?f=15&t=8711&p=53189&hilit=cuda#p53189

To test, how do I go about downgrading to exactly that version?
Comment 14 Morgan Leijström 2015-01-12 21:35:43 CET
...I have some activity now in that forum thread...
Comment 15 Morgan Leijström 2015-05-20 14:30:59 CEST
It works when building locally using the files from Nvidia and the script from Anssi:
https://bugs.mageia.org/show_bug.cgi?id=15328
Comment 16 Samuel Verschelde 2015-05-21 09:23:02 CEST
Does it mean there's something we can do?
Comment 17 Morgan Leijström 2015-05-30 09:33:26 CEST
I do not know how these things work, I just followed the instructions.

I suggest asking Anssi.

Summary: Fail to get CUDA or OpenCL working (because nvidia taint kernel?) => Mageia nvidia do not work with CUDA or OpenCL but locally built with mageia script it works.

Comment 18 Samuel Verschelde 2015-05-30 23:53:12 CEST
Anssi, would you like to comment?

CC: (none) => anssi.hannula

Samuel Verschelde 2015-06-06 16:57:48 CEST

Whiteboard: (none) => MGA5TOO

Comment 19 Yann Cantin 2015-08-03 02:49:43 CEST
boinc searches for libcuda.so and libOpenCL.so: they are in nvidia-current-devel.

BUT installing the package doesn't trigger ldconfig, so the libs aren't found until, some day, another package installation triggers ldconfig and updates ld.so.cache...

Fix:
urpmi nvidia-current-devel
ldconfig
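
To confirm that the fix took effect, one can check whether the two unversioned names are now listed in the linker cache. The following is a minimal sketch under assumptions: the function name missing_from_cache and its stdin interface are illustrative, and it only inspects the text printed by `ldconfig -p`; it does not prove that BOINC itself will load the libraries.

```shell
#!/bin/sh
# Sketch: print each requested library name that is absent from
# `ldconfig -p` style text supplied on stdin.
missing_from_cache() {
    cache=$(cat)    # the cache listing arrives on stdin
    for name in "$@"; do
        # Entries look like: "<tab>libfoo.so.1 (libc6,x86-64) => /path/libfoo.so.1"
        # so the first whitespace-separated field is the cached name.
        if ! printf '%s\n' "$cache" |
            awk -v n="$name" '$1 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}
```

Usage on a real system (ldconfig often lives in /sbin): `/sbin/ldconfig -p | missing_from_cache libcuda.so libOpenCL.so`. If the command prints nothing, both names are in the cache; any name it prints is still missing, and running ldconfig again as root would be the next thing to try.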

CC: (none) => yann.cantin

Comment 20 Morgan Leijström 2016-10-07 14:20:03 CEST
Still the same problem, and the fix in comment 19 still works
 - thank you Yann!

Closing this and summarizing the info in bug 15328.

*** This bug has been marked as a duplicate of bug 15328 ***

Status: NEW => RESOLVED
Resolution: (none) => DUPLICATE
