Since kernel 3.19.0 (RC7, and also the final release) landed in cauldron, I have been having issues with bbswitch/bumblebee and the nvidia-current nonfree driver. Running simple commands like "optirun glxspheres64" causes a hard freeze. The freeze is not fully reproducible: it happens most of the time, but occasionally the command works fine, though in those cases the discrete GPU can no longer be turned off.

I also reported this upstream on bumblebee's GitHub, since it might be a compatibility issue with kernel 3.19.0 on their side: https://github.com/Bumblebee-Project/Bumblebee/issues/632
CC'ing Thomas: I sincerely hope that you can help me debug this, because it is a huge regression for Optimus users (at least for those who want to use their Nvidia GPU) a couple of weeks before the release. I'll attach the relevant logs.
Priority: Normal => High
CC: (none) => tmb
Created attachment 5913 [details] /var/log/dmesg The "ACPI Warning" and "NVRM" entries seem particularly relevant to my issue, though I don't know how to interpret them.
Created attachment 5914 [details] /var/log/Xorg.0.log Might not be that relevant, bumblebee spawns applications on display :8 IIUC.
Created attachment 5915 [details] /var/log/Xorg.8.log Some interesting errors in this one. AFAIU it represents what happened when I started "optirun -b virtualgl glxspheres64" and the computer froze.
Created attachment 5916 [details] /var/log/syslog And last, the syslog snippet corresponding to this session.
Attachment 5913 mime type: application/octet-stream => text/plain
(In reply to Rémi Verschelde from comment #2)
> Created attachment 5913 [details]
> /var/log/dmesg
>
> The "ACPI Warning" and "NVRM" entries seem particularly relevant to my
> issue, though I don't know how to interpret them.

I'd say the ACPI warning is a red herring in this case. The NVRM messages about conflicting drivers should be mostly harmless, as the system is usually capable of unloading the conflicting ones when needed.

(In reply to Rémi Verschelde from comment #4)
> Created attachment 5915 [details]
> /var/log/Xorg.8.log
>
> Some interesting errors in this one. AFAIU it represents what happened when
> I started "optirun -b virtualgl glxspheres64" and the computer froze.

This one seems more relevant:

(EE) /dev/dri/card0: failed to set DRM interface version 1.4: Permission denied

(In reply to Rémi Verschelde from comment #5)
> Created attachment 5916 [details]
> /var/log/syslog
>
> And last, the syslog snippet corresponding to this session.

And here bbswitch tried to register an already registered driver:

Feb 9 12:16:57 localhost kernel: [ 148.152022] bbswitch: enabling discrete graphics
Feb 9 12:16:58 localhost kernel: [ 148.867350] ------------[ cut here ]------------
Feb 9 12:16:58 localhost kernel: [ 148.867358] WARNING: CPU: 3 PID: 5614 at fs/proc/generic.c:372 proc_register+0x135/0x1c0()
Feb 9 12:16:58 localhost kernel: [ 148.867360] proc_dir_entry 'driver/nvidia' already registered
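For anyone who wants to check their own syslog for the same symptom, here is a minimal sketch that searches log text for the "already registered" kernel message quoted above. The function name and the regular expression are my own, written against the one message format seen in this report; other kernels may word it differently.

```python
import re

# Matches the kernel message seen in the syslog excerpt, e.g.
#   proc_dir_entry 'driver/nvidia' already registered
# (pattern is an assumption based on that single sample line)
ALREADY_REGISTERED = re.compile(r"proc_dir_entry '([^']+)' already registered")


def find_duplicate_proc_entries(log_text):
    """Return the proc entry names reported as already registered, in order."""
    return ALREADY_REGISTERED.findall(log_text)


if __name__ == "__main__":
    sample = (
        "Feb 9 12:16:57 localhost kernel: [ 148.152022] bbswitch: enabling discrete graphics\n"
        "Feb 9 12:16:58 localhost kernel: [ 148.867360] proc_dir_entry 'driver/nvidia' already registered\n"
    )
    print(find_duplicate_proc_entries(sample))
```

Feeding it the contents of /var/log/syslog should list every proc entry that some module tried to register twice during the session.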
CC: (none) => mageia, thierry.vignaud
Upstream says that my packaging is a bit funky:

> Your configuration looks very unusual. Normally nouveau kernel module is
> blacklisted, and nvidia kernel module is not loaded before bbswitch. In your
> syslog, both nouveau and nvidia modules are loaded before bumblebee loads
> bbswitch. I don't know what specifically can trigger freezes, but your
> situation is a minefield, so I strongly recommend you to look into why modules
> are loaded automatically rather than through optirun/bumblebeed, and fix that.

It used to work until now, but maybe something changed in the way modules are loaded, so that my bumblebee package no longer does what's needed? Should I blacklist nouveau in the bumblebee-nvidia flavour?
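To see whether nouveau or nvidia is already loaded before optirun is started (the situation upstream describes), something like the following sketch could be run right after boot. It reads /proc/modules, where the first field of each line is the module name and the third is its reference count. The helper names and the default module names ("nouveau", "nvidia") are assumptions; the name under which the nvidia-current module shows up may differ.

```python
def module_refcounts(proc_modules_text):
    """Map module name -> reference count (None if the kernel reports '-')."""
    counts = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, refcount = fields[0], fields[2]
        counts[name] = int(refcount) if refcount.isdigit() else None
    return counts


def loaded_gpu_drivers(proc_modules_text, names=("nouveau", "nvidia")):
    """Return the modules from `names` (assumed names) that are loaded."""
    counts = module_refcounts(proc_modules_text)
    return [name for name in names if name in counts]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    print(loaded_gpu_drivers(text))
```

An empty list before the first optirun call would mean neither driver was loaded automatically. The reference count is also worth a look afterwards, since a module with a non-zero count cannot be unloaded.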
I added this file to the bumblebee package, and it seems to solve the hard freezes:

$ cat /etc/modprobe.d/bumblebee.conf
blacklist nvidia-current
blacklist nouveau

Now the remaining issue is that once started, the Nvidia GPU can't be powered off anymore since the nvidia module stays in use even when the process that was using it gets killed.
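As a quick sanity check that the file above really contains both entries, a sketch along these lines could be used. The function names are hypothetical and the parser only understands plain "blacklist NAME" lines, which is all that file uses. It treats '-' and '_' as equivalent, as modprobe does for module names.

```python
def blacklisted_modules(conf_text):
    """Return module names from plain 'blacklist NAME' lines, in file order."""
    names = []
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            names.append(parts[1])
    return names


def missing_blacklist_entries(conf_text, required=("nvidia-current", "nouveau")):
    """Return the entries from `required` that the config does not blacklist."""
    def normalize(name):
        # modprobe does not distinguish '-' from '_' in module names
        return name.replace("-", "_")

    present = {normalize(name) for name in blacklisted_modules(conf_text)}
    return [name for name in required if normalize(name) not in present]


if __name__ == "__main__":
    conf = "blacklist nvidia-current\nblacklist nouveau\n"
    print(blacklisted_modules(conf))
    print(missing_blacklist_entries(conf))
```

Passing the contents of /etc/modprobe.d/bumblebee.conf to missing_blacklist_entries should return an empty list when both drivers are blacklisted.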
Severity: critical => normal
I'll close this one as fixed for now, and I will see whether another bug report is needed for the issue mentioned in comment 8.
As per above comment.
Status: NEW => RESOLVED
Resolution: (none) => FIXED