Bug 8863 - Latest kernel update makes nVidia driver fail
Summary: Latest kernel update makes nVidia driver fail
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All
OS: Linux
Priority: release_blocker
Severity: normal
Target Milestone: ---
Assignee: Mageia Bug Squad
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 8909
Depends on:
Blocks:
 
Reported: 2013-01-28 04:22 CET by Doug Laidlaw
Modified: 2013-02-26 21:45 CET
CC List: 13 users

See Also:
Source RPM: nvidia
CVE:
Status comment:


Attachments
50-mageia.conf altered in accordance with Jim Beard's directions (644 bytes, text/plain)
2013-02-13 13:28 CET, Doug Laidlaw

Description Doug Laidlaw 2013-01-28 04:22:58 CET
Description of problem:


Since the upgrade to kernel rc5.1, starting the nVidia driver gives a message that there is a conflict between the driver called for and the driver available.  Boot then "hangs."  In failsafe mode, I am dropped to a Rescue prompt.  Not being in graphical mode doesn't seem to avoid the problem.  Switching to the nouveau driver allows X to start.  My Asus card is identified as a "GT218 [GeForce 210]".  The most recent nVidia driver is selected automatically.
Manuel Hiebel 2013-01-28 18:25:41 CET

CC: (none) => tmb

Comment 1 Doug Laidlaw 2013-01-28 18:31:37 CET
Two similar cases on alt.os.linux.mageia.
Comment 2 Doug Laidlaw 2013-01-29 03:28:44 CET
(In reply to comment #1)
> Two similar cases on alt.os.linux.mageia.

BitTwister is involved in that thread and doesn't seem to have the problem.  I believe that his video card is an ATI.
Comment 3 Thomas Backlund 2013-01-29 04:26:09 CET
Is there a "nokmsboot" on the kernel command line?

If you modprobe the nvidia module manually, do you get any errors / warnings in dmesg?
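
A minimal sketch of the first check, assuming the running kernel's command line is readable from /proc/cmdline. The function name has_nokmsboot is made up for illustration; it is not part of any Mageia tool.

```sh
# Succeeds if the given kernel command line file contains the standalone
# parameter "nokmsboot". Defaults to /proc/cmdline (assumption: Linux procfs).
has_nokmsboot() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each parameter on its own line so that only an exact match counts.
    tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qx 'nokmsboot'
}

# Usage:
#   has_nokmsboot && echo "nokmsboot is set"
# The second check needs root and is done by hand:
#   modprobe nvidia; dmesg | tail
```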
Comment 4 Doug Laidlaw 2013-01-29 05:26:28 CET
(In reply to comment #3)
> is there a "nokmsboot" on kernel command line ?
> 
Yes
>
> IF you modprobe the nvidia module manually, do you get any errors / warnings in
> dmesg ?

 Conflicts with nouveau module.  Can't rmmod nouveau because it is in use.  If I put nvidia back in xorg.conf I can't get a normal boot prompt, but will see what I can produce.  Hang on while I reboot.
Comment 5 Doug Laidlaw 2013-01-29 05:45:21 CET
Changed "nouveau" to "nvidia" in xorg.conf then rebooted into Cauldron Failsafe.  Got the same message box to the effect that it can't load the requested module because it conflicts.  Tried to modprobe nvidia, dmesg output:

[  227.795628] Disabling lock debugging due to kernel taint
[  227.817875] NVRM: The NVIDIA probe routine was not called for 1 device(s).
[  227.817882] NVRM: This can occur when a driver such as nouveau, rivafb,
NVRM: nvidiafb, or rivatv was loaded and obtained ownership of
NVRM: the NVIDIA device(s).
[  227.817888] NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
[  227.817893] NVRM: No NVIDIA graphics adapter probed!

Couldn't rmmod nouveau -- in use.  lsmod confirmed that.
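
As a rough illustration of the lsmod check mentioned above: the sketch below reads lsmod-style text (columns Module, Size, Used by) on stdin and prints the use count of one module. The helper name module_use_count is hypothetical. A non-zero count is consistent with rmmod refusing to unload the module.

```sh
# Print the "Used by" count of a module from lsmod-style input on stdin.
# Prints nothing if the module is not listed.
module_use_count() {
    awk -v mod="$1" '$1 == mod { print $3 }'
}

# Usage:
#   lsmod | module_use_count nouveau
```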
Comment 6 Rick Bailey 2013-01-29 08:00:15 CET
I found that selecting the nvidia proprietary driver in MCC and letting it install and then running

dracut -f -H --xz

to rebuild the initramfs worked in my case. I read somewhere (sorry can't remember where) that the nouveau driver didn't adhere to the blacklist.

CC: (none) => artful.codger

Comment 7 Doug Laidlaw 2013-01-29 08:15:55 CET
A recent post to alt.os.linux.mageia said the same thing.  I asked him to post the details here, but his timezone is different.  He didn't quote a link.  There are several, e.g.

https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/815803
Comment 8 Wolfgang Bornath 2013-01-29 10:13:50 CET
Same issue after installation of MGA3 Beta2. Did an installation of proprietary nvidia driver via MCC (in VT1) and ran dracut right after that. No success.

I have the same results as described in comments #1, #4, #5, except that I am on an x86_64 machine/system. So I will change platform to 'ALL'

CC: (none) => molch.b

Wolfgang Bornath 2013-01-29 10:14:02 CET

Hardware: i586 => All

Comment 9 Thomas Backlund 2013-01-29 10:27:23 CET
as a temp fix...

To get rid of nouveau from initrd you can do:

dracut -f --omit-driver nouveau

and if you want to make sure it won't load at all you can do:

mv /lib/modules/$(uname -r)/kernel/drivers/gpu/drm/nouveau/* /tmp && depmod -a
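
One way to confirm that the first command took effect is to look for the module in the initrd listing. This is only a sketch: it assumes lsinitrd (shipped with dracut) prints one file path at the end of each line, and the helper name listing_has_driver is made up here.

```sh
# Succeeds if the listing on stdin contains <name>.ko, optionally with a
# compression suffix such as .xz or .gz, as the last path component.
listing_has_driver() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}

# Usage:
#   lsinitrd | listing_has_driver nouveau && echo "nouveau is still in the initrd"
```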
Comment 10 Wolfgang Bornath 2013-01-29 10:52:24 CET
I did the second temp fix, moving nouveau out of the way.
This does not change anything here. Still I get the message in the boot procedure about "there is a conflict between the driver called for and the driver available". Boot ends in a CLI prompt.

Xerror log shows:
 
modprobe: ERROR: could not insert 'nvidia_current': No such device
modprobe: ERROR: Error running install command for nvidia
modprobe: ERROR: could not insert 'nvidia'; Operation not permitted
(EE) Server terminated with error (1)
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error
Comment 11 Doug Laidlaw 2013-01-29 10:59:39 CET
This is getting way beyond my expertise.  I experienced it under Cauldron.  I am running nouveau under Mga2 (have been since Christine's bug,) and it is adequate for my needs.  I can do the same with Cauldron.  I agree that the bug needs to be fixed, but I just happened to be the reporter.
Comment 12 Thomas Backlund 2013-01-29 12:10:14 CET
(In reply to comment #10)
> I did the second temp fix, moving nouveau out of the way.
> This does not change anything here. Still I get the message in the boot
> procedure about "there is a conflict between the driver called for and the
> driver available". Boot ends in a CLI prompt.
> 

That's because there is still a nouveau in the initrd, so it still loads... you need the first "fix" too
Comment 13 Wolfgang Bornath 2013-01-29 12:52:07 CET
Ok, result:
System boots to the CLI prompt on VT1. Logging in as user and giving 'startx' starts the X server with proprietary nvidia driver.

In other words, system does not boot through to the DM (kdm) but stops at runlevel 3, needs login and then 'startx' to boot into the GUI.

Of course that's totally ok as a workaround, thx a lot.
Comment 14 Chris Denice 2013-01-30 00:53:08 CET
+1; Experiencing it too.

Cheers,
Chris.

CC: (none) => dirteat

Comment 15 Doug Laidlaw 2013-01-30 01:32:14 CET
Seems to be a duplicate of bug 8852.
Helge Hielscher 2013-01-30 12:25:36 CET

CC: (none) => hhielscher

Comment 16 Oden Eriksson 2013-01-30 23:25:58 CET
Just had this problem as well but "dracut -f --omit-driver nouveau" fixed it.

I have the "GeForce 9500 GT" model (according to lshw).

CC: (none) => oe

Comment 17 Doug Laidlaw 2013-01-31 01:59:21 CET
After doing the same, I still can't boot into X, but I can run startx.
Ben Bullard 2013-01-31 03:10:53 CET

CC: (none) => benbullard79

Comment 18 Manuel Hiebel 2013-02-01 01:10:44 CET
*** Bug 8909 has been marked as a duplicate of this bug. ***

CC: (none) => hoytduff

Comment 19 William Kenney 2013-02-02 22:43:29 CET
I was able to successfully install a 64-bit M3B2 and instructed
it to not install the proprietary driver. Once up and running
I attempted to install the nvidia ( proprietary ) driver and
a reboot got the following error message:

"detected a loaded display driver kernel module which conflicts
with the driver the X server is configured to use. Startup of
the X server may now fail"

And it did, in the condition(s) mentioned above. I'm using a GeForce GT 440.

CC: (none) => wilcal.int

Comment 20 Vincent Billette 2013-02-03 14:12:56 CET
I have the same problem (64-bit DVD, fresh install on a C2D with 4 GB RAM and an nvidia 8800 GT 512 MB).

We have other similar reports on MLO :
http://www.mageialinux-online.org/forum/topic-14523.php#m140637

Moreover, the workaround suggested above doesn't work for me - I had to switch back to nouveau.

CC: (none) => vbillette

Comment 21 Robert Courtright 2013-02-11 15:38:10 CET
If it helps any: I installed Mageia 3 beta 2 on 5 machines, all dual boots.
I had this problem only on the systems with an Nvidia chipset. All have Nvidia graphics cards. One machine has an on-board Nvidia 8200GS; it won't boot at all.
I can't bring it up even to read the logs. On the system with an Intel chipset (a Dell), you wouldn't even know there was a problem. This might explain why some are not seeing any issues.

CC: (none) => rbcourt

Comment 22 Vincent Billette 2013-02-11 18:59:09 CET
The computer I use for testing has an Intel chipset (in fact it's an Intel motherboard) and an Nvidia 8800 GT graphics card, and I have the bug...
Manuel Hiebel 2013-02-12 19:59:25 CET

Priority: Normal => release_blocker

Comment 23 macxi 2013-02-13 10:47:17 CET
I installed Mageia 3 beta 2, DVD, 64 bit, on a computer with Nvidia Ge98 (GeForce 8400 GS) and after configuring the video card, after reboot, I can not access the graphical environment, nor through startx command. It gives the error:

"detected the display driver kernel module loaded Which conflicts
with the driver the X server is configured to use. Startup of
the X server may now fail "

In another installation, done with the "Mageia 3 b2 LiveDVDKde 64bit", I reached graphical mode and then configured the video card. After reboot I cannot access the graphical environment as a normal user; I can only get there by booting into safe mode and running startx as root.

Mageia Forum: https://forums.mageia.org/en/viewtopic.php?f=15&t=4383

CC: (none) => terraagua

Comment 24 macxi 2013-02-13 11:39:54 CET
In another installation with the "Mageia 3 b2 LiveDVD Kde 64bit", after I install media, and install all updates, I could not have access to the graphical environment. Not even as root (with the command startx)
Comment 25 Doug Laidlaw 2013-02-13 12:25:17 CET
Yes, the bug is very consistent.  Being root should make no difference.  What happens if you edit xorg.conf and replace "nvidia" with "nouveau"?

They are still working on it.
Comment 26 Sander Lepik 2013-02-13 12:32:45 CET
I removed nouveau from initrd. But even that didn't help. For some reason nvidia was not loaded. After running "modprobe nvidia && service prefdm restart" I got my X running, but now I have to do it on every start until someone figures out why nvidia is not loaded.

CC: (none) => sander.lepik

Comment 27 Doug Laidlaw 2013-02-13 12:44:22 CET
I started this bug, but I can't really help (or I would have done something about it).  At the moment, I have the situation you have.  I am trying to get macxi to the same position.  Better a desktop somehow than none at all.

I can start X with startx.  Even with nVidia in xorg.conf, I can run startx, and the nouveau driver is loaded.  Bug 8852 is the same thing for those with Radeon cards.  I notice there that Arnaud Vacquier says he found a workaround from Jim Beard.  I will have a look at that, but I repeat, I don't have the knowledge to fix the bug itself.  Really, it seems to be "upstream," affecting all the distros.
Comment 28 Doug Laidlaw 2013-02-13 12:54:42 CET
It seems that Jim's workaround was this:

"Workaround:

In /etc/dracut.conf.d/50*
add

omit_dracutmodules+=" drm "

I also added 
DRACUT_SKIP_FORCED_NON_HOSTONLY=1
hostonly="yes"
and put a # in front of omit_dracutmodules+=" network 

I then used dracut --force --fstab initrd-3.8.0-desktop-0.rc4.1.mga3.img
3.8.0-desktop-0.rc4.1.mga3

and in consequence got an initrd that boots and is only appx 60MB in size."


I did the first part of that, but my /etc/dracut.conf.d/50-mageia.conf seems to be a mixture.  My kernel at the moment is 3.8.0-server-0.rc7.1.mga3

My production system is Mga2, so I can afford to give Jim's fix a try, even if I lose my desktop in Cauldron.  I shall report back.
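
For anyone wanting to apply the quoted settings without hand-editing, here is a hedged sketch. It appends a line to a config file only if that exact line is not already there. The helper name add_conf_line and the file name 99-local.conf are hypothetical; Jim's directions edit the existing 50* file instead, and the option lines are copied from his text above.

```sh
# Append a line to a config file unless an identical line is already present.
add_conf_line() {
    conf_file="$1"
    line="$2"
    grep -qxF -- "$line" "$conf_file" 2>/dev/null || printf '%s\n' "$line" >> "$conf_file"
}

# Usage (as root; the target file name is an assumption):
#   add_conf_line /etc/dracut.conf.d/99-local.conf 'omit_dracutmodules+=" drm "'
#   add_conf_line /etc/dracut.conf.d/99-local.conf 'hostonly="yes"'
```

Running it twice with the same line leaves the file unchanged, so it is safe to repeat after a package update.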
Comment 29 Doug Laidlaw 2013-02-13 13:28:27 CET
Created attachment 3516 [details]
50-mageia.conf altered in accordance with Jim Beard's directions

OK, Jim's workaround works.  After applying it, I was able to boot into runlevel 3 as before.  The nvidia module was loaded, and there was no sign of nouveau.

I then switched to runlevel 5.  Here are the instructions, from the Fedora docs:

 How do I change the default runlevel?

systemd uses symlinks to point to the default runlevel. You have to delete the existing symlink first before creating a new one

 rm /etc/systemd/system/default.target 

Switch to runlevel 5 by default

 ln -sf /lib/systemd/system/graphical.target /etc/systemd/system/default.target 

After doing all that, I was able to boot into a graphical desktop.  Note that in addition to editing 50-mageia.conf, you need to run the command Jim gave to regenerate your initrd, and use the version number from your own /boot/ directory.
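
The symlink step can be sketched as a small function. The optional root prefix is a hypothetical addition, there only so the function can be tried in a scratch directory first; with no argument it touches the real paths from the Fedora instructions and needs root. As far as I know, ln -sf replaces an existing link, so the separate rm should not be strictly needed.

```sh
# Point default.target at graphical.target (runlevel 5 equivalent).
# $1 is an optional directory prefix, empty for the live system.
set_graphical_default() {
    root="${1:-}"
    ln -sf "$root/lib/systemd/system/graphical.target" \
           "$root/etc/systemd/system/default.target"
}

# Usage (as root):
#   set_graphical_default
#   readlink /etc/systemd/system/default.target
```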
Comment 30 DariuszSki 2013-02-23 17:06:18 CET
I didn't know whether to start a new bug, or tag along with this one. I have a similar issue that appears to be the latest kernel not liking the nVidia driver. However, the issue I have is the kernel in the full install DVD of Mageia3 Beta2 works with the nVidia driver without problem, a clean install.

When updating the system with all the latest rpm's (including the kernel), the system no longer wants to boot, it hangs.

The problem kernel is:
3.8.0-2.mga3

There is an error produced on boot, which you can "okay" or will carry on by itself booting after a few seconds, before hanging the system a few seconds after that error is displayed.

"Display Driver Issue
Detected a loaded display driver kernel module which conflicts with the driver the X server is configured to use. Startup of the X server may now fail."

If I change "nvidia" for "nouveau" using safe mode in /etc/X11/xorg.conf then Mageia 3 Beta 2 loads as normal with the latest kernel, no errors.

The kernel that is on the DVD for M3B2 is:
3.8.0-desktop-0.rc4.1.mga3  (x86_64)

CC: (none) => linuxstuff

Comment 31 Doug Laidlaw 2013-02-23 21:11:55 CET
Your issue is probably this bug.  It is the same as Bug 8852, which is the same thing again, for the Radeon card.  The problem itself is "upstream" affecting quite a few distros.  It isn't the kernel itself, but the initrd.  I had it running quite happily with Jim Beard's fix for the Radeon (comment 29) but last night's kernel upgrade put me back to the start.  I haven't yet had a chance to run the fix again.  I have been following Bug 8852, where all the action is to fix it.

I haven't finished downloading Beta2, but the bug probably came in later than the bare DVD.
Comment 32 Doug Laidlaw 2013-02-23 21:47:34 CET
Applying the patch and rebuilding my initrd did not fix the problem, but brought back the conflict message box.  Frankly, the nouveau driver does all that I want.  I have an obligation to maintain this bug, otherwise I would just leave the system with nouveau selected.
Comment 33 Thomas Backlund 2013-02-24 20:41:04 CET
Hi,

I have now found the problem...

turns out dracut parsing of udev rules fails to detect need for grep, so it does not get added to the initrd, making the udev rule checking for "nokmsboot" fail and we get this mess.


you can confirm it is missing with: lsinitrd | grep grep


and add grep to initrd with:

dracut -f -I /ust/bin/grep 


after that you can reconfigure your system to use nvidia or fglrx driver and it should work properly after reboot
Comment 34 Thomas Backlund 2013-02-24 20:43:15 CET
doh, a typo... it should obviously be:

dracut -f -I /usr/bin/grep
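
A slightly stricter version of the "lsinitrd | grep grep" check, as a sketch only. Plain "grep grep" also matches names like zgrep or egrep, so this looks for a path ending in bin/grep. It assumes each lsinitrd line ends with the file path; the helper name listing_has_grep is made up.

```sh
# Succeeds if the listing on stdin contains bin/grep or usr/bin/grep
# (with or without a leading slash) as a complete path at end of line.
listing_has_grep() {
    grep -Eq '(^|[[:space:]])/?(usr/)?bin/grep$'
}

# Usage:
#   lsinitrd | listing_has_grep || echo "grep is missing from the initrd"
```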
Comment 35 Thomas Backlund 2013-02-24 21:21:53 CET
A fixed dracut-025-5.mga3 is now out, so just install that and regenerate the initrd, and it should just work again.
Comment 36 Doug Laidlaw 2013-02-25 00:51:04 CET
Thanks Thomas. The package doesn't seem to have reached my mirror yet.  When it does, I will report back.
Comment 37 Doug Laidlaw 2013-02-25 04:49:08 CET
Still having problems.

The update arrived.  I accepted the 50-mageia.conf that resulted, regenerated the initrd using the command in Jim's fix adapted for my kernel, then changed "nouveau" to "nvidia" in xorg.conf.  Still getting the message box that the nouveau driver has been installed first.  I notice that it is now working for Radeon, and it is essentially the same bug.  Have I missed something?
Comment 38 Doug Laidlaw 2013-02-25 08:09:57 CET
Success!

Removed 3 previous kernels that I had overlooked while the bug was present.

Su -'d to root (su -)

Ran "dracut -f" with no arguments.  Noticed that the drm module was included.

Set xorg.conf to "nvidia"

Rebooted, the nvidia splash screen came up and I was able to log in.

A bit of confirmation from at least one other nvidia user would be appreciated, no doubt.
Comment 39 DariuszSki 2013-02-25 09:38:24 CET
On my setup, I had an older working kernel+nVidia combination and the latest kernel with the nVidia fault. I did not have to change xorg.conf, as it was already using "nvidia", so what I did was:

1) Update packages (with working kernel / nVidia combo).
2) Re-boot into safe mode (uses latest kernel with nVidia problem).
3) Run "dracut -f"
4) Reboot.

The system now boots using the nVidia card without problem (nVidia splash screen showed on boot) on the latest kernel. So from my point of view, the dracut solution works.
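
If several kernels are installed, the regenerate step may need repeating per kernel. The sketch below only prints the commands, it does not run them. It assumes images are named /boot/initrd-<version>.img, as in the file name quoted in comment 28, and that each installed kernel has a directory under /lib/modules. The helper name print_dracut_commands is hypothetical.

```sh
# Print (do not run) one dracut command per kernel version directory.
# $1 is the modules directory, /lib/modules by default.
print_dracut_commands() {
    modules_dir="${1:-/lib/modules}"
    for dir in "$modules_dir"/*/; do
        [ -d "$dir" ] || continue
        kver=$(basename "$dir")
        printf 'dracut -f /boot/initrd-%s.img %s\n' "$kver" "$kver"
    done
}

# Usage:
#   print_dracut_commands
# Check the printed image names against /boot before running anything as root.
```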
Oden Eriksson 2013-02-25 09:48:31 CET

CC: oe => (none)

Comment 40 Sander Lepik 2013-02-25 11:08:30 CET
Marking as fixed..

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 41 Vincent Billette 2013-02-26 21:44:08 CET
I confirm that too. The bug seems to be solved.
Comment 42 William Kenney 2013-02-26 21:45:29 CET
Confirmed fixed here.
