Bug 8773 - nouveau loaded instead of nvidia (was nvidia 304 driver fails to load on kernel 3.8.0-0.rc4.1)
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: x86_64 Linux
Priority: release_blocker
Severity: normal
Target Milestone: ---
Assignee: Mageia Bug Squad
Duplicates: 9216
Reported: 2013-01-22 05:35 CET by James Kerr
Modified: 2013-03-03 12:13 CET
CC List: 9 users
Source RPM: drakxtools

Attachments
Output of XFdrake and 'rpm -qa *\nvidia\* *\kernel\*|sort;uname -r;lsmod |grep -e nvidia -e nouveau;cat /etc/X11/xorg.conf' (18.68 KB, text/plain)
2013-01-25 07:43 CET, Ben Bullard
Output of dmesg and /var/log/Xorg.0.log (102.32 KB, text/plain)
2013-01-25 07:46 CET, Ben Bullard
/var/log/nvidia-installer.log (931 bytes, text/x-log)
2013-01-26 20:00 CET, Wilbert van Bakel
nokmsboot not present (617 bytes, text/plain)
2013-03-01 15:49 CET, Wilbert van Bakel
Grub2.cfg (45.40 KB, application/octet-stream)
2013-03-01 16:08 CET, Wilbert van Bakel

Description James Kerr 2013-01-22 05:35:47 CET
The nvidia 304 driver does not appear to load on the latest kernel. On a clean install or using drakx11 in an installed system, it appears to be installed but on re-boot, a message is displayed that the kernel module cannot be found and the nouveau driver is used instead.

(The nvidia driver is working fine on  Mageia 2 on this system, using the nvidia-current driver.)


# uname -r
3.8.0-desktop-0.rc4.1.mga3

# lspcidrake -v | grep Card
Card:NVIDIA GeForce 6100 to GeForce 7950: NVIDIA Corporation|C68 [GeForce 7050 PV / nForce 630a] [DISPLAY_VGA] (vendor:10de device:053b subv:1019 subd:2609) (rev: a2)

# lsmod | grep video
video                  19154  1 nouveau

# rpm -qa | grep nvidia
nvidia304-kernel-3.8.0-desktop-0.rc4.1.mga3-304.64-8.mga3.nonfree
nvidia304-kernel-desktop-latest-304.64-8.mga3.nonfree
dkms-nvidia304-304.64-4.mga3.nonfree
x11-driver-video-nvidia304-304.64-4.mga3.nonfree
nvidia304-doc-html-304.64-4.mga3.nonfree

# rpm -qa | grep kernel
nvidia304-kernel-3.8.0-desktop-0.rc4.1.mga3-304.64-8.mga3.nonfree
kernel-userspace-headers-3.8.0-0.rc4.1.mga3
nvidia304-kernel-desktop-latest-304.64-8.mga3.nonfree
kernel-desktop-3.8.0-0.rc4.1.mga3-1-1.mga3
kernel-desktop-latest-3.8.0-0.rc4.1.mga3
kernel-firmware-nonfree-20121225-1.mga3.nonfree
kernel-desktop-devel-latest-3.8.0-0.rc4.1.mga3
kernel-firmware-20121225-1.mga3
kernel-desktop-devel-3.8.0-0.rc4.1.mga3-1-1.mga3


In the Forum, two other users (using different chips) report similar problems with this driver/kernel combination:

https://forums.mageia.org/en/viewtopic.php?f=15&t=4240

According to nvidia, the 304 driver is the correct one for all three chips.
Comment 1 Manuel Hiebel 2013-01-22 12:20:39 CET
according to http://svnweb.mageia.org/packages?view=revision&revision=344810 it should

CC: (none) => tmb

Comment 2 Thomas Backlund 2013-01-22 15:05:48 CET
What does "dkms status" give in response?
Comment 3 James Kerr 2013-01-22 16:16:07 CET
# dkms status
nvidia304, 304.64-4.mga3.nonfree, 3.8.0-desktop-0.rc4.1.mga3, x86_64: installed 
nvidia304, 304.64-4.mga3.nonfree, 3.8.0-desktop-0.rc4.1.mga3, x86_64: installed-binary from 3.8.0-desktop-0.rc4.1.mga3
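
For anyone who wants to script this check, here is a minimal sketch in shell. It reads "dkms status" output in the layout shown above and reports whether a module is listed as installed for a given kernel. The function name dkms_module_installed is made up for illustration, and the field layout is assumed to match the two lines above. As this report shows, an "installed" status does not by itself mean the module is loaded at boot.

```shell
# Hypothetical helper (not part of dkms): reads `dkms status` output on
# stdin and succeeds if MODULE is reported as installed for KERNEL.
# Assumed line layout: "module, version, kernel, arch: state".
dkms_module_installed() {
    module=$1
    kernel=$2
    awk -F', ' -v m="$module" -v k="$kernel" '
        # Field 4 looks like "x86_64: installed" or "x86_64: installed-binary from ..."
        $1 == m && $3 == k && $4 ~ /: installed/ { found = 1 }
        END { exit !found }
    '
}
```

Possible usage: dkms status | dkms_module_installed nvidia304 "$(uname -r)" && echo "dkms reports nvidia304 as installed"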
Comment 4 Ben Bullard 2013-01-25 07:43:32 CET
Created attachment 3432
Output of XFdrake and 'rpm -qa *\nvidia\*  *\kernel\*|sort;uname -r;lsmod |grep -e nvidia -e nouveau;cat /etc/X11/xorg.conf'

Same problem in fresh install of Mageia 3 Beta 1 x86_64. Attached is cli output of XFdrake and 'rpm -qa *\nvidia\*  *\kernel\*|sort;uname -r;lsmod |grep -e nvidia -e nouveau;cat /etc/X11/xorg.conf'.
Comment 5 Ben Bullard 2013-01-25 07:46:09 CET
Created attachment 3433
Output of dmesg and /var/log/Xorg.0.log
Comment 6 Ben Bullard 2013-01-25 07:47:32 CET
Comment #5 is output after reboot after running XFdrake to install 'nvidia' driver.

CC: (none) => benbullard79

Comment 7 Ben Bullard 2013-01-25 07:56:45 CET
# dkms status
nvidia304, 304.64-4.mga3.nonfree, 3.8.0-desktop-0.rc4.1.mga3, x86_64: installed 
nvidia304, 304.64-4.mga3.nonfree, 3.8.0-desktop-0.rc4.1.mga3, x86_64: installed-binary from 3.8.0-desktop-0.rc4.1.mga3
Comment 8 Ben Bullard 2013-01-25 10:37:46 CET
Possible work around:

1. Edit xorg.conf to use 'vesa' driver
2. Edit menu.lst adding 'blacklist=nouveau xdriver=vesa' to kernel command line
3. Reboot 
4. Run XFdrake as root and select proprietary driver
5. Immediately run 'modprobe nvidia'
6. Do not reboot, log out and restart xserver, then log in
7. As root, check which driver is loaded with 'lsmod |grep -e nvidia -e nouveau'.
8. Here at least 'nvidia' is working
9. Remove 'blacklist=nouveau xdriver=vesa' from kernel command line in menu.lst
10. Reboot
11. Again check to see that nvidia is being used, here it is.

How I got here was more convoluted and probably involved many redundant steps. I'm attempting to condense steps to closer to the minimum necessary for others. I hope this may help someone else. If I have anything wrong don't hesitate to correct.

Ben
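
The check in steps 7 and 11 can be sketched as a small shell function, assuming the usual lsmod layout with the module name in the first column. The function name loaded_gpu_driver is hypothetical. It matches on the first column only, so a line such as "video 19154 1 nouveau" (where nouveau is only listed as a user of another module) is not counted.

```shell
# Hypothetical helper: reads `lsmod` output on stdin and prints which
# driver module is loaded: "nvidia", "nouveau", "both" or "none".
loaded_gpu_driver() {
    awk '
        # Any module whose name starts with "nvidia" (e.g. nvidia, nvidia304)
        $1 ~ /^nvidia/  { nvidia = 1 }
        $1 == "nouveau" { nouveau = 1 }
        END {
            if (nvidia && nouveau) print "both"
            else if (nvidia)       print "nvidia"
            else if (nouveau)      print "nouveau"
            else                   print "none"
        }
    '
}
```

Possible usage, as root or as a normal user: lsmod | loaded_gpu_driver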
Comment 9 Martin Spiegel 2013-01-26 01:13:15 CET
(In reply to comment #8)
> Possible work around:
> 
> 1. Edit xorg.conf to use 'vesa' driver
> 2. Edit menu.lst adding 'blacklist=nouveau xdriver=vesa' to kernel command line
> 3. Reboot 
> 4. Run XFdrake as root and select proprietary driver
> 5. Immediately run 'modprobe nvidia'
> 6. Do not reboot, log out and restart xserver, then log in
> 7. As root check to see which driver with 'lsmod |grep -e nvidia -e nouveau'.
> 8. Here at least 'nvidia' is working
Up to step 8 the work around is successful on my system (GeForce 7050 PV/nForce 630a)
> 9. Remove 'blacklist=nouveau xdriver=vesa' from kernel command line in menu.lst
> 10. Reboot
> 11. Again check to see that nvidia is being used, here it is.
Steps 9-11 bring back the nouveau driver on my system. I therefore repeated steps 1-8 and kept the 'blacklist=nouveau' entry in the kernel command line but then again the nouveau driver is loaded after reboot. I had to uninstall x11-driver-video-nouveau to make the switch to the nvidia driver persistent (however, now a strange error message shows up during boot telling me that the video driver is automatically switched to nvidia because the proprietary module nvidia was not found...)

Martin

CC: (none) => mnspiegel

Comment 10 Wilbert van Bakel 2013-01-26 20:00:07 CET
Created attachment 3439
/var/log/nvidia-installer.log

3.8.0-desktop-0.rc4.1.mga3/build/include/linux/version.h missing.

make depends on the kernel source doesn't work anymore
Comment 11 Ben Bullard 2013-01-27 05:12:09 CET
I can confirm '3.8.0-desktop-0.rc4.1.mga3/build/include/linux/version.h missing' as reported in comment #10. My /var/log/nvidia-installer.log is exactly the same as attachment 3439.
Comment 12 Wilbert van Bakel 2013-01-29 17:43:39 CET
Here are some more observations that also apply to 3.8.0-desktop-0.rc5.1:

I created /etc/dracut.conf.d/99-local.conf with the following content:
=====
# dracut modules to omit
omit_drivers+=" nouveau"

# additional kernel modules to the default
add_drivers+=" nvidia304"
=====

I added xdriver=nvidia304 to the kernel line in /boot/grub/menu.lst

1: I boot in single mode (add 'single' at the end of the kernel line)
2: lsmod|grep nou && lsmod|grep nv show that nvidia is loaded and nouveau is not loaded
3: init 3
4: error message: "nvidia304 already installed"
5: nvidia driver is unloaded and nouveau is loaded before init 3 finishes.
my console resolution changes from 1024x768 to 1280x1024.

Point: NVidia gets unloaded during the boot process.
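
To verify whether the omit_drivers setting actually kept nouveau out of the initrd, something like the following could be used. This is only a sketch: it assumes the lsinitrd tool shipped with dracut is available and that kernel modules appear in its file listing with a .ko suffix (optionally compressed). The function name initrd_has_module is hypothetical.

```shell
# Hypothetical helper: reads an initrd file listing (e.g. from `lsinitrd`)
# on stdin and succeeds if a file named MODULE.ko (or MODULE.ko.xz,
# MODULE.ko.gz, ...) appears in it.
initrd_has_module() {
    module=$1
    grep -Eq "/${module}\.ko(\.[a-z0-9]+)?\$"
}
```

Possible usage: lsinitrd | initrd_has_module nouveau && echo "nouveau is still in the initrd"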

CC: (none) => M8R-3t0f541

Comment 13 Ben Bullard 2013-01-29 22:55:58 CET
The work around I described in comment # 8 no longer works.
Comment 14 Martin Spiegel 2013-01-31 01:59:48 CET
(In reply to comment #13)
> The work around I described in comment # 8 no longer works.

First rebuild initrd with
dracut -f --omit-driver nouveau
and then remove the nouveau module with
mv /lib/modules/$(uname -r)/kernel/drivers/gpu/drm/nouveau/* /tmp && depmod -a
After that you can apply the workaround described in comment #8. This way I could switch from nouveau to nvidia after the kernel update to 3.8.0-0.rc5.1.
I "borrowed" this solution from here:
https://bugs.mageia.org/show_bug.cgi?id=8863#c9
Comment 15 Martin Spiegel 2013-01-31 02:53:11 CET
(In reply to comment #14)
> (In reply to comment #13)
> > The work around I described in comment # 8 no longer works.
> 
> First rebuild initrd with
> dracut -f --omit-driver nouveau
> and then remove the nouveau module with
> mv /lib/modules/$(uname -r)/kernel/drivers/gpu/drm/nouveau/* /tmp && depmod -a
> After that you can apply the workaround described in comment #8. This way I
> could switch from nouveau to nvidia after the kernel update to 3.8.0-0.rc5.1.
> I "borrowed" this solution from here:
> https://bugs.mageia.org/show_bug.cgi?id=8863#c9

I forgot to mention that I also had to remove x11-driver-video-nouveau before rebooting the system to make the switch to the nvidia driver persistent. Strangely enough when choosing the proprietary nvidia driver in XFdrake it always installs x11-driver-video-nouveau together with the proprietary nvidia stuff.  Therefore it does *not* help to remove x11-driver-video-nouveau before using XFdrake.
Comment 16 Wilbert van Bakel 2013-01-31 14:57:06 CET
I think the problem is dkms.

Yesterday I reinstalled 3b2 and uninstalled dkms\*. I was hoping that I could install the nvidia kernel module just like on any other distro, but a dependency on dkms shows up.

I also notice that /etc/X11/xorg.conf gets rewritten during each reboot, and my change to the nvidia driver is reverted to the nouveau driver.

Until this junk is fixed I give up on beta testing.
The nouveau driver has its own set of problems that I can avoid with any other distro.
Comment 17 claire robinson 2013-01-31 18:13:04 CET
Cauldron is inherently unstable. It is not a stable rolling release, it is the development release where bugs like this are to be expected as work is undertaken to build the next stable Mageia release.

Mageia 2 is the current stable release. Mageia 3, which cauldron will eventually become, is due in March.
Comment 18 Ben Bullard 2013-02-01 20:49:43 CET
The steps Martin Spiegel mentions in comments 14 and 15 work here in combination with work around described in comment 8.

Thank you Martin.
Comment 19 Jim Dines 2013-02-03 16:16:59 CET
When doing a plymouth-set-default-theme with --rebuild-initrd, the script at /usr/lib/dracut/modules.d/50drm/module-setup.sh looks in the kernel tree, finds nouveau (find_kernel_modules_by_path drivers/gpu/drm) and puts it in the initrd.  It needs to look for the nvidia driver in the dkms tree first, and add that one instead. (I think it needs to do: find_kernel_modules_by_path ../dkms/drivers/char/drm/ but I'm not certain that works either.)

In any case nouveau is like a bad illness that just won't go away as of today :-(

CC: (none) => jdines

Manuel Hiebel 2013-02-12 19:58:56 CET

Priority: Normal => release_blocker

Comment 20 Manuel Hiebel 2013-02-27 22:54:15 CET
is this bug still valid with recent update ?

https://bugs.mageia.org/show_bug.cgi?id=8863#c35

Keywords: (none) => NEEDINFO

Comment 21 Wilbert van Bakel 2013-02-28 01:24:01 CET
I did a reinstallation of Mageia-3 beta2 and updated to 'recent'.

After that I went to the Mageia Control Center and 'setup the graphic server'.
I let it install the proprietary driver, I notice that 'Automatic start the graphical interface (Xorg) upon boot' is unselected.
I don't perform the test (it would fail because nouveau is loaded).

Boot up occurs in text mode.
During reboot I notice the dkms message that nvidia driver is already installed.
Then the console switches to graphic mode and soon I get the display manager.

lsmod shows nouveau loaded and no nvidia.
Comment 22 Martin Spiegel 2013-02-28 22:53:37 CET
(In reply to Manuel Hiebel from comment #20)
> is this bug still valid with recent update ?
> 
After today's update to kernel-desktop-3.8.0-3 and nvidia304-kernel-desktop-3.8.0-3 the problem seems to be fixed! My system is configured to use the nvidia driver as described in comments 16 and 17. After updating and rebooting, the system starts with the nouveau driver but does not complete the graphical boot process. Instead, a message in text mode appears, saying that the system must be rebooted due to a change of the graphics driver. After clicking OK the system rebooted and loaded... the nvidia driver! This is the first time that the graphics configuration of my Mageia Cauldron installation "survived" a kernel update and the proprietary nvidia driver was loaded successfully without manually kicking out the nouveau driver. The only minor thing which still persists is this bogus error message just before the kdm login screen telling me that "The display driver has been switched automatically to nvidia. Reason: The proprietary kernel driver for the nvidia X.org-driver was not found".
What I did not try yet was to switch from the nouveau driver to the proprietary nvidia driver.
Comment 23 Manuel Hiebel 2013-02-28 22:57:37 CET
(In reply to Martin Spiegel from comment #22)
> (In reply to Manuel Hiebel from comment #20)
> > is this bug still valid with recent update ?
> > 
> After today's update to kernel-desktop-3.8.0-3 and
> nvidia304-kernel-desktop-3.8.0-3 the problem seems to be fixed! My system is
> configured to use the nvidia driver as described in comments 16 and 17.
> After updating and rebooting, the system starts with the nouveau driver but
> does not complete the graphical boot process. Instead, a message in text
> mode appears, telling that the system must be rebooted due to a change of
> the graphics driver. After clicking OK the system rebooted and loaded... the
> nvidia driver! This is the first time that the graphics configuration of my
> Mageia Cauldron installation "survived" a kernel update and the proprietary
> nvidia driver was loaded successfully without manually kicking out the
> nouveau driver. The only minor thing which still persists is this bogus
> error message just before the kdm login screen telling me that "The display
> driver has been switched automatically to nvidia. Reason: The proprietary
> kernel driver for the nvidia X.org-driver was not found".
> What I did not try yet was to switch from the nouveau driver to the
> proprietary nvidia driver.


(In reply to Wilbert van Bakel from comment #21)
> I did a reinstallation of Mageia-3 beta2 and updated to 'recent'.
> 
> After that I went to the Mageia Control Center and 'setup the graphic
> server'.
> I let it install the proprietary driver, I notice that 'Automatic start the
> graphical interface (Xorg) upon boot' is unselected.
> I don't perform the test (it would fail because nouveau is loaded).
> 
> Boot up occurs in text mode.
> During reboot I notice the dkms message that nvidia driver is already
> installed.
> Then the console switches to graphic mode and soon I get the display manager.
> 
> lsmod shows nouveau loaded and no nvidia.


ok so still a bug somewhere

Keywords: NEEDINFO => (none)
CC: (none) => anssi.hannula, thierry.vignaud
Summary: nvidia 304 driver fails to load on kernel 3.8.0-0.rc4.1 => nouveau loaded instead of nvidia (was nvidia 304 driver fails to load on kernel 3.8.0-0.rc4.1)

Comment 24 Martin Spiegel 2013-03-01 00:47:48 CET
> ok so still a bug somewhere
Quick follow-up:
1. Switching from nvidia304 to nouveau using MCC -> works
2. Switching from nouveau to nvidia304 using MCC -> does not work, system took ages to boot and loaded the nouveau driver in the end (at least I ended up at the kdm login screen). However, getting back the nvidia driver is much simpler than before: After installation of the proprietary driver using MCC, do not reboot. Instead, launch rpmdrake and uninstall x11-driver-video-nouveau, then reboot. Result: nvidia driver is loaded during next boot and the configuration will survive following reboots. It is no longer necessary to rebuild initrd without the nouveau driver.
Comment 25 Martin Spiegel 2013-03-01 00:54:36 CET
(In reply to Wilbert van Bakel from comment #21)
>  After that I went to the Mageia Control Center and 'setup the graphic
> server'.
> I let it install the proprietary driver, I notice that 'Automatic start the
> graphical interface (Xorg) upon boot' is unselected.

I can confirm this and it is also true when switching from nvidia to nouveau. You always have to select the option manually.
Comment 26 Wilbert van Bakel 2013-03-01 01:02:35 CET
(In reply to Manuel Hiebel from comment #23)
> ok so still a bug somewhere

My observations:

No matter what I try:
1. Initrd without nouveau, grub2 in text mode, kernel in text mode and blacklist.conf, the nouveau driver gets loaded.
If the console is in textmode, then it's easy to unload the nouveau driver.

Another observation is:
2. Once the nouveau driver gets loaded (and graphic mode console is enabled) the xorg.conf gets reverted to nouveau driver.
I verified that the driver was nvidia after the drakx11 hardware config and before boot.
Only when booting in text mode console the xorg.conf seems respected, but still no nvidia304 loaded at boot time.

3. I observe that the system seems to load nouveau before dkms gets a chance to build and load nvidia304.
Comment 27 Wilbert van Bakel 2013-03-01 01:29:42 CET
(In reply to Martin Spiegel from comment #24)
> Uninstall x11-driver-video-nouveau

This seems to be a significant step in the procedure to get nvidia304 loaded as default Xorg driver.
Comment 28 Sander Lepik 2013-03-01 09:02:21 CET
Martin, which grub are you using?

Wilbert, did you check that grub2 is reconfigured to have nokmsboot on kernel command line?

CC: (none) => sander.lepik

Comment 29 Martin Spiegel 2013-03-01 10:36:57 CET
(In reply to Sander Lepik from comment #28)
> Martin, which grub are you using?
> 
Grub-legacy
Comment 30 Sander Lepik 2013-03-01 10:50:47 CET
(In reply to Martin Spiegel from comment #29)
> (In reply to Sander Lepik from comment #28)
> > Martin, which grub are you using?
> > 
> Grub-legacy

And nokmsboot is present in menu.lst where needed?
Comment 31 Wilbert van Bakel 2013-03-01 15:49:43 CET
Created attachment 3567
nokmsboot not present

There is also a grub.rpmnew with 1 line added:
GRUB_THEME=/boot/grub2/themes/maggy/theme.txt
Comment 32 Wilbert van Bakel 2013-03-01 15:54:13 CET
(In reply to Sander Lepik from comment #28)
> Wilbert, did you check that grub2 is reconfigured to have nokmsboot on
> kernel command line?

Yes, I verified, nokmsboot is not added, see /etc/default/grub attached. I also tested with adding nokmsboot myself.

grub2/grub.cfg:

set gfxpayload=text
linux	/boot/vmlinuz-desktop root=UUID=42b63c30-0a29-4f1c-954a-f0eebfce6930 ro  splash
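
A quick way to test whether a parameter such as nokmsboot is on the kernel command line, sketched in shell. The function name cmdline_has_param is hypothetical. It checks the running kernel's /proc/cmdline by default, or a line passed as second argument (for example the "linux" line from grub.cfg above), and matches whole words only.

```shell
# Hypothetical helper: succeeds if PARAM appears as a whole word in the
# kernel command line. With one argument it reads /proc/cmdline; a second
# argument can supply the command line text instead.
cmdline_has_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # Turn tabs and newlines into spaces so whole-word matching works.
    normalized=$(printf '%s' "$cmdline" | tr '\t\n' '  ')
    case " $normalized " in
        *" $param "*) return 0 ;;
        *)            return 1 ;;
    esac
}
```

Possible usage: cmdline_has_param nokmsboot || echo "nokmsboot is not on the kernel command line"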
Comment 33 Wilbert van Bakel 2013-03-01 16:08:13 CET
Created attachment 3568
Grub2.cfg

This file is copied right after MCC video hardware setup is finished and before reboot.
Nokmsboot is present for different partitions, but not for the Mageia-3 partition.
Comment 34 Jim Dines 2013-03-01 16:49:46 CET
There is no need for nokmsboot

Right now one needs to remove x11-driver-nouveau when switching to nvidia.  The blacklist lines in the modprobe conf system are not being respected, as they are currently ignored. In a correct system you should be able to have the x11-driver-nouveau package installed alongside nvidia and the scripts should use blacklist nouveau, etc. in the modprobe conf system to handle what gets loaded.

Whatever else, adding a blacklist nouveau line should mean that nouveau NEVER gets loaded, even if it means X won't start.  To do otherwise is to ignore the modprobe conf conventions.  Please do NOT try to be smarter than the system admin, as they may add a blacklist line because they WANT to have the system fail and are full well aware that it will.  I was appalled when I tried to troubleshoot this issue and blacklist was ignored.

Don't worry about users adding it and temporarily breaking their X Windows system.  The ones who know enough to add the line already understand the repercussions.

The major issue right now, other than the above, is that at boot time one gets an error message saying that the nvidia module is not present so it is using the nvidia module instead.  That is not a spelling error on my part.  The scripts are literally complaining about the nvidia module, which is in fact loaded at boot time because there is no need for a nokmsboot parameter, when they should be happy.

I hope this helps.
Comment 35 Wilbert van Bakel 2013-03-01 17:53:00 CET
(In reply to Jim Dines from comment #34)

> The scripts are literally complaining that the nvidia module, which is in
> fact loaded at boot time because there is no need for a nokmsboot parameter,
> are complaining when they should be happy.

This is mentioned in comment #12:
"NVidia gets unloaded during the boot process."
Comment 36 Jim Dines 2013-03-01 19:04:52 CET
The nvidia module gets unloaded during the transition from runlevel 3 to 5, but that is tangential to the point.  It is GREAT that the script notices that and loads it.  It is TERRIBLE that a window pops up to tell me and then waits for me to click OK when the system can successfully load the driver I have set in xorg.conf. If I asked for nvidia, and you are giving me nvidia, just do it.  Don't tell me about it and halt the boot process. 

It is good from a debugging standpoint or we might not know that it gets loaded, unloaded, and then loaded again.  For release, however, the system should never pop up that message box.
Comment 37 Jim Dines 2013-03-01 19:19:57 CET
A small clarification:  It is a good idea to tell the SYSADMIN that the nvidia driver was not there when expected, but not the user.  In other words, don't pop up a message box, but DO send a message to syslog.
Comment 38 Thomas Backlund 2013-03-01 20:13:22 CET
(In reply to Jim Dines from comment #34)
> There is no need for nokmsboot
> 

Yes there is.

display_driver_helper needs it to manage correct driver loading.

> Right now one needs to remove x11-driver-nouveau when switching to nvidia. 

Nope.

It works with that installed.

display_driver_helper just stopped working because of a dracut bug: grep was not being added to the initrd.


(In reply to Jim Dines from comment #36)
> The nvidia module gets unloaded during the transition from runlevel 3 to 5,
> but that is tangential to the point.  It is GREAT that the script notices
> that and loads it.  It is TERRIBLE that a window pops up to tell me and then
> waits for me to click OK when the system can successfully load the driver I
> have set in xorg.conf. If I asked for nvidia, and you are giving me nvidia,
> just do it.  Don't tell me about it and halt the boot process. 


The only time that message is normally shown is when something goes wrong, so that it has to switch from the proprietary driver to the free driver.
(iirc there is a 60 sec delay before it continues by itself)

In your case I guess it gets in trouble because of missing "nokmsboot" parameter
on kernel command line.
Comment 39 Martin Spiegel 2013-03-01 22:08:07 CET
(In reply to Sander Lepik from comment #30)
> (In reply to Martin Spiegel from comment #29)
> > (In reply to Sander Lepik from comment #28)
> > > Martin, which grub are you using?
> > > 
> > Grub-legacy
> 
> And nokmsboot is present in menu.lst where needed?

Yes it is present
Comment 40 Martin Spiegel 2013-03-01 22:11:56 CET
(In reply to Thomas Backlund from comment #38)
> In your case I guess it gets in trouble because of missing "nokmsboot"
> parameter
> on kernel command line.
At least on my system the nomkmsboot parameter is present in the menu.lst entry and I *do* get that error message that the system switches to the nvidia driver due to missing nvidia driver.
Comment 41 Martin Spiegel 2013-03-01 23:11:49 CET
(In reply to Martin Spiegel from comment #40)
> nomkmsboot
ups, stupid typo... nokmsboot of course...
Comment 42 Jim Dines 2013-03-01 23:46:24 CET
(In reply to Thomas Backlund from comment #38)
> (In reply to Jim Dines from comment #34)
> > There is no need for nokmsboot
> > 
> 
> Yes there is.

It may be because I do not have the nouveau x11-driver installed, but I definitely don't need nokmsboot.

> 
> display_driver_helper needs it to manage correct driver loading.
> 
> > Right now one needs to remove x11-driver-nouveau when switching to nvidia. 
> 
> Nope.

Unless things have been recently fixed correctly, in which case this bug would be closed, right?  (Note that I said "right now", not "when things are fixed")

> 
> It works with that installed.
> 

It doesn't even work WITHOUT it installed (See below)

> the display_driver_helper just stopped working due to a dracut bug missing
> the need for adding grep in initrd.
> 

We are probably seeing different behavior because I have x11-driver-nouveau removed.  I have updated earlier this morning, but I'm not about to add it back at this point.  Everything works great for me but the annoying message (see below)
> 
> (In reply to Jim Dines from comment #36)
> > The nvidia module gets unloaded during the transition from runlevel 3 to 5,
> > but that is tangential to the point.  It is GREAT that the script notices
> > that and loads it.  It is TERRIBLE that a window pops up to tell me and then
> > waits for me to click OK when the system can successfully load the driver I
> > have set in xorg.conf. If I asked for nvidia, and you are giving me nvidia,
> > just do it.  Don't tell me about it and halt the boot process. 
> 
> 
> The only time that message normally is shown is when something goes wrong so
> it would have switch from proprietary driver to free driver.
> (iirc there is a 60 sec delay Before it continues by itself)
> 
> In your case I guess it gets in trouble because of missing "nokmsboot"
> parameter
> on kernel command line.

No.  I can add or remove nokmsboot with no discernible change in behavior.  nvidia loads fine in single mode, and stays loaded in runlevel 3.  It unloads and then gets re-loaded in the transition to runlevel 5, where I get the annoying message that it can't use nvidia so it is using nvidia instead.  The behavior is identical with or without nokmsboot on the grub config line.

Is nomodeset needed?
Comment 43 Thomas Backlund 2013-03-02 00:07:33 CET

Ok, harddrake-15.24.1-1.mga3 with a fix for nvidia304 driver. 

Please install that, reconfigure your system to use the proprietary driver, 
and reboot.

does the system now work ?
Comment 44 Anssi Hannula 2013-03-02 01:02:28 CET
(In reply to James Kerr from comment #0)
> The nvidia 304 driver does not appear to load on the latest kernel. On a
> clean install or using drakx11 in an installed system, it appears to be
> installed but on re-boot, a message is displayed that the kernel module
> cannot be found and the nouveau driver is used instead.
> 
> (The nvidia driver is working fine on  Mageia 2 on this system, using the
> nvidia-current driver.)

There is an nvidia module name list in /usr/share/harddrake/service_harddrake that is missing the name "nvidia304.ko", causing harddrake to think that the NVIDIA kernel module is missing.

This is now fixed in harddrake 15.24.1. It seems Thomas Backlund fixed it just before I was able to do it, while I was writing this comment :)
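
To illustrate the failure mode (this is not the actual service_harddrake code, just a sketch in shell with a hypothetical function name and hypothetical candidate lists): a lookup that only tries a fixed list of module file names will report "missing" for a perfectly good nvidia304.ko when that name is not on the list.

```shell
# Hypothetical helper: searches MODDIR for the first kernel module file
# matching one of the candidate NAMEs (NAME.ko, NAME.ko.xz, ...).
# Prints the name that was found, or fails if none of them exists.
find_kernel_module() {
    moddir=$1
    shift
    for name in "$@"; do
        if find "$moddir" -type f -name "${name}.ko*" 2>/dev/null | grep -q .; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}
```

With only nvidia304.ko present under the module directory, a call such as find_kernel_module /lib/modules/$(uname -r) nvidia nvidia-current fails, while adding nvidia304 to the candidate list makes it succeed. The candidate names here are examples only.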

There should be a way to disable this behavior completely via /etc/sysconfig/harddrake2/service.conf. Unfortunately that does not currently seem to be the case. I've opened bug #9231 for the issue.

In addition, the harddrake driver switch is exceptionally dumb here, because the correct nvidia driver is actually _already loaded_! harddrake should check that first before doing stuff like this. I've opened bug #9232 for the issue.



(In reply to Martin Spiegel from comment #9)
> (however, now a strange error message shows up during boot telling me that
> the video driver is automatically switched to nvidia because the proprietary
> module nvidia was not found...)

Indeed. The harddrake service notices incorrectly that the nvidia kernel module is missing, and switches the driver to the 'nvidia304', without realizing that is already the case.

Switching from nvidia* to any nvidia* is very questionable, let alone from nvidia304 to nvidia304. This should of course not happen, and I've opened bug #9234 for the issue.

Moreover, this message should be clarified to show more specific names, e.g. "nvidia304" etc. I've opened bug #9235 for the issue.



(In reply to Wilbert van Bakel from comment #10)
> Created attachment 3439
> /var/log/nvidia-installer.log
> 
> 3.8.0-desktop-0.rc4.1.mga3/build/include/linux/version.h missing.
> 
> make depends on the kernel source doesn't work anymore

You need to have the kernel-devel package installed for your kernel (kernel-source is not the right package to install) for the proprietary installer to work.



(In reply to Wilbert van Bakel from comment #12)
> 5: nvidia driver is unloaded and nouveau is loaded before init 3 finishes.
> my console resolution changes from 1024x768 to 1280x1024.
> 
> Point: NVidia gets unloaded during the boot process.

This happens because of these events:
1. Harddrake service notices incorrectly that the nvidia kernel module is not found and switches xorg.conf over to "nouveau" driver.
2. After that, the harddrake service runs display_driver_helper to check for conflicts between loaded drivers and xorg.conf, and it notices that "nvidia" is already loaded but "nouveau" is configured in xorg.conf, so it unloads the "nvidia" module. Otherwise X would be unable to start.

Event 1 no longer happens in fixed harddrake 15.24.1.
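
For debugging the conflict described in event 2, it helps to see which driver xorg.conf is currently configured for. A minimal sketch in shell follows; it is not how display_driver_helper does it, the function name xorg_configured_driver is hypothetical, and it assumes a conventional xorg.conf with a Driver line inside a Section "Device" block.

```shell
# Hypothetical helper: prints the Driver value of the first "Device"
# section found in an xorg.conf file (default: /etc/X11/xorg.conf).
xorg_configured_driver() {
    awk '
        tolower($1) == "section" && tolower($2) == "\"device\"" { in_device = 1; next }
        tolower($1) == "endsection" { in_device = 0 }
        in_device && tolower($1) == "driver" {
            gsub(/"/, "", $2)   # strip the surrounding quotes
            print $2
            exit
        }
    ' "${1:-/etc/X11/xorg.conf}"
}
```

Comparing its output with the modules listed by lsmod before and after the switch to runlevel 5 shows whether xorg.conf was rewritten to "nouveau" while the nvidia module was still loaded.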



(In reply to Jim Dines from comment #34)
> Right now one needs to remove x11-driver-nouveau when switching to nvidia. 
> The blacklist lines in the modprobe conf system are not being respected, as
> they are currently ignored. In a correct system you should be able to have
> the x11-driver-nouveau package installed alongside nvidia and the scripts
> should use blacklist nouveau, etc. in the modprobe conf system to handle
> what gets loaded.
> 
> Whatever else, adding a blacklist nouveau line should mean that nouveau
> NEVER gets loaded, even if it means X won't start.  To do otherwise is to
> ignore the modprobe conf conventions.  Please do NOT try to be smarter than
> the system admin, as they may add a blacklist line because they WANT to have
> the system fail and are full well aware that it will.  I was appalled when I
> tried to troubleshoot this issue and blacklist was ignored.
> 
> Don't worry about users adding it and temporarily breaking their X Windows
> system.  The ones who know enough to add the line already understand the
> repercussions.

I absolutely agree that blacklisting in modprobe.conf should be respected. However, in this case the loading is not done by us (i.e. by Mageia scripts). The X.org server automatically runs "modprobe nouveau" if the nouveau driver is not loaded when starting X server with the nouveau driver.
"modprobe nouveau" does not respect blacklists, but "modprobe -b nouveau" does. I've opened bug #9236 for investigating with upstream whether "-b" option should be added to the relevant X.org server code.

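When testing the difference between "modprobe nouveau" and "modprobe -b nouveau", it is useful to confirm first that a blacklist entry is really in place. A rough sketch in shell: the function name module_is_blacklisted is hypothetical, the default directories are the usual modprobe.d locations and may differ on a given system, and unlike modprobe itself the sketch does not treat "-" and "_" in module names as equivalent.

```shell
# Hypothetical helper: succeeds if some *.conf file in the given
# directories contains a "blacklist MODULE" line.
# Usage: module_is_blacklisted MODULE [DIR...]
module_is_blacklisted() {
    module=$1
    shift
    # Assumed default locations when no directory is given.
    [ "$#" -gt 0 ] || set -- /etc/modprobe.d /lib/modprobe.d /run/modprobe.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        for conf in "$dir"/*.conf; do
            [ -f "$conf" ] || continue
            if grep -Eq "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$conf"; then
                return 0
            fi
        done
    done
    return 1
}
```

If module_is_blacklisted nouveau succeeds, "modprobe -b nouveau" should refuse to load the module, while a plain "modprobe nouveau" will still load it, as described above.
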


(In reply to Jim Dines from comment #36)
> The nvidia module gets unloaded during the transition from runlevel 3 to 5,
> but that is tangential to the point.  It is GREAT that the script notices
> that and loads it.  It is TERRIBLE that a window pops up to tell me and then
> waits for me to click OK when the system can successfully load the driver I
> have set in xorg.conf. If I asked for nvidia, and you are giving me nvidia,
> just do it.  Don't tell me about it and halt the boot process. 

Yes, obviously it should not switch to itself due to missing driver. This no longer happens with harddrake 15.24.1 as nvidia304 is handled properly, but the logic should still be fixed. As noted above, I've opened bug #9234 for the issue.

> It is good from a debugging standpoint or we might not know that it gets
> loaded, unloaded, and then loaded again.  For release, however, the system
> should never pop up that message box.

That message box should not be shown during normal usage, that is right.



(In reply to Thomas Backlund from comment #38)
> The only time that message normally is shown is when something goes wrong so
> it would have switch from proprietary driver to free driver.
> (iirc there is a 60 sec delay Before it continues by itself)

You are probably thinking of a different dialog (the 'nokmsboot missing' dialog shown before X). The "Display driver switched" dialog has no timeout. But as you noted, it is not normally shown, and it "only" stops X startup from progressing.

> In your case I guess it gets in trouble because of missing "nokmsboot"
> parameter
> on kernel command line.

Actually, as noted above, it sees nvidia kernel module is missing and then tries to switch to nvidia304 (!). See opened bug #9234.



Thanks to everyone for the reports and comments.
Comment 45 Martin Spiegel 2013-03-02 02:34:37 CET
(In reply to Thomas Backlund from comment #43)
> 
> Ok, harddrake-15.24.1-1.mga3 with a fix for nvidia304 driver. 
> 
> Please install that, reconfigure your system to use the proprietary driver, 
> and reboot.
> 
> does the system now work ?

Yes!!!

1. switching from nvidia304 to nouveau using harddrake -> works
2. switching from nouveau to nvidia304 using harddrake -> works
3. no error message about replacing nvidia driver by nvidia driver when using
   the proprietary driver.

Thanks a lot!
Comment 46 Wilbert van Bakel 2013-03-02 03:48:54 CET
The steps that I took and my observations:

01. reinstall Mageia-3b2 (grub2)
02. update to 'recent'
03. reboot
04. verify harddrake 15.24.1 is installed
05. in MCC install proprietary driver
06> observe that 'Automatic start the graphical interface (Xorg) upon boot' is unselected in follow-up screen.
07. reboot
08> observe that nouveau driver gets loaded.
09> observe error message in text box "display driver issue, need nokmsboot"
10> observe error message "Sorry problem with graphic driver <...> Good luck!"
11. reboot
12> Observe that nokmsboot is not present in grub2 kernel line.
13. Continue with nokmsboot added.
14. System boots to Xorg with NVidia304 loaded as driver.
Comment 47 James Kerr 2013-03-02 10:45:35 CET
I can confirm comment#46. If Grub2 is in use, the nokmsboot parameter is not added automatically to the kernel command line. After it is added manually, then the proprietary driver is loaded and used.
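
One possible way to add the parameter by hand with Grub2, sketched in shell. This is an assumption about the setup, not something confirmed in this report: it presumes that /etc/default/grub defines GRUB_CMDLINE_LINUX_DEFAULT on a single double-quoted line and that GNU sed is available. The function name add_grub_default_param is hypothetical.

```shell
# Hypothetical helper: appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in
# FILE (default: /etc/default/grub) unless it is already present.
# Fails if the variable is not defined on a single quoted line.
add_grub_default_param() {
    param=$1
    file=${2:-/etc/default/grub}
    grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=".*"' "$file" || return 1
    # Leave the file untouched if the parameter is already there.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"(.*[[:space:]])?${param}([[:space:]].*)?\"" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote (GNU sed -i).
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"/\1 ${param}\"/" "$file"
}
```

After running add_grub_default_param nokmsboot as root, the configuration would still have to be regenerated, typically with something like grub2-mkconfig -o /boot/grub2/grub.cfg (path assumed from the grub2/grub.cfg mentioned in comment 32). Keep a backup of /etc/default/grub before trying this.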
Comment 48 Manuel Hiebel 2013-03-02 11:40:00 CET
We have bug 8540 for that. As this one is fixed, let's close it. Thanks, everyone.

Status: NEW => RESOLVED
Resolution: (none) => FIXED
Source RPM: kernel-3.8.0-0.rc4.1.mga3.src.rpm => drakxtools

Comment 49 Manuel Hiebel 2013-03-03 12:13:54 CET
*** Bug 9216 has been marked as a duplicate of this bug. ***

CC: (none) => alejandro.anv

