Bug 1106 - Failure to start X with Nvidia 96xx drivers for Geforce 4 with Xorg 1.10 after upgrade from Mandriva 2010.2
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: i586 Linux
Priority: High, Severity: critical
Target Milestone: ---
Assignee: QA Team
QA Contact:
URL:
Whiteboard:
Keywords: validated_update
Depends on:
Blocks:
 
Reported: 2011-05-02 22:41 CEST by Matthieu Nguyen
Modified: 2014-05-08 18:06 CEST

See Also:
Source RPM: drakx-kbd-mouse-x11
CVE:
Status comment:


Attachments
make.log of the dkms rebuild failure. (10.90 KB, text/plain)
2011-05-02 22:48 CEST, Matthieu Nguyen

Description Matthieu Nguyen 2011-05-02 22:41:17 CEST
Description of problem:

After upgrading from Mandriva 2010.2 to Mageia 1 Cauldron Beta 2 with urpmi, the display manager failed to start. The dkms build failed. I tried uninstalling the nvidia package and installing the latest NVIDIA legacy 96xx driver from the web; that failed as well.

Version-Release number of selected component (if applicable):

x11-driver-video-nvidia96xx 96.43.18-2.mga1 i586
x11-server-xorg 1.10.1

How reproducible: Always


Steps to Reproduce:
1. Upgrade to Mageia Cauldron on a computer with an NVIDIA legacy GPU requiring the 96xx series driver
2. Make sure xorg.conf is configured to use the "nvidia" driver
3. Reboot
Comment 1 Matthieu Nguyen 2011-05-02 22:46:55 CEST
From searching through the web, it appears that xorg 1.10 introduced API changes that break compatibility with both the NVidia proprietary driver and the nouveau driver. For now, NVIDIA has not released any 96xx drivers compatible with xorg 1.10.
The only possible workarounds I found are:
- fall back to the "nv" driver (works with my Geforce 4 TI, but is known to *not* work with all cards)
- fall back to xorg 1.9

Unfortunately, I couldn't find any way to revert to an earlier version of xorg than the one provided by Mageia.
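A minimal sketch of the "nv" fallback described above, assuming the driver is selected by a `Driver "nvidia"` line in an xorg.conf-style file. `set_xorg_driver` is a hypothetical helper written for illustration, not a Mageia tool; XFdrake remains the supported way to change the driver.

```shell
#!/bin/sh
# set_xorg_driver CONF OLD NEW (hypothetical helper):
# rewrite lines of the form   Driver "OLD"   to   Driver "NEW".
set_xorg_driver() {
    conf=$1 old=$2 new=$3
    # Keep a backup next to the original before rewriting it.
    cp -p "$conf" "$conf.bak" || return 1
    sed -E "s/^([[:space:]]*Driver[[:space:]]+)\"$old\"/\\1\"$new\"/" \
        "$conf.bak" > "$conf"
}
```

It could be called as `set_xorg_driver /etc/X11/xorg.conf nvidia nv` (as root); the original file is kept as `xorg.conf.bak` so the change can be undone.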
Comment 2 Matthieu Nguyen 2011-05-02 22:48:52 CEST
Created attachment 343 [details]
make.log of the dkms rebuild failure.
Comment 3 Ahmad Samir 2011-05-03 01:07:43 CEST
You're right, the current nvidia96xx driver doesn't support X server 1.10, so no point making it build.

However, drakx11 (the graphical X server configuration tool) doesn't set nvidia96xx as the driver for the older nvidia cards by default (until such a time when upstream nvidia release a new driver).

*** This bug has been marked as a duplicate of bug 746 ***

Status: NEW => RESOLVED
Resolution: (none) => DUPLICATE

Comment 4 Matthieu Nguyen 2011-05-03 08:07:02 CEST
(In reply to comment #3)
> You're right, the current nvidia96xx driver doesn't support X server 1.10, so
> no point making it build.
> 
> However, drakx11 (the graphical X server configuration tool) doesn't set
> nvidia96xx as the driver for the older nvidia cards by default (until such a
> time when upstream nvidia release a new driver).
> 
> *** This bug has been marked as a duplicate of bug 746 ***

The difference with bug 746 is that I wasn't trying to *install* nvidia96xx after upgrading. I was already using it, and it was fine. Upgrading to Mageia 1 through urpmi (and not the DVD) basically *broke* the DM, since X reconfiguration is not triggered after a urpmi upgrade, and no conflict has been declared between the nvidia96xx RPM and the xorg 1.10 RPM.

I think this should either be mentioned in the Errata or expressed as a version conflict in the RPMs. Perhaps updating xorg to 1.10 should *require* removing nvidia96xx? That way users would know that they need to rerun XFdrake and pick another driver. (Removing nvidia96xx from the mirrors wouldn't help: on an upgrade, it would simply leave the old mdv package installed.)
Comment 5 Ahmad Samir 2011-05-03 18:35:50 CEST
I meant to ask Anssi about why the driver wasn't switched automatically; it's supposed to be switched...

Better reopen the bug so as not to forget the issue.

Priority: Normal => High
Status: RESOLVED => REOPENED
Component: Installation => RPM Packages
Resolution: DUPLICATE => (none)
Source RPM: x11-driver-video-nvidia96xx => drakx-kbd-mouse-x11

Ahmad Samir 2011-05-03 18:36:00 CEST

CC: (none) => thierry.vignaud

Ahmad Samir 2011-05-17 21:25:04 CEST

CC: (none) => anssi.hannula

Comment 6 Richard Neill 2011-07-03 21:15:24 CEST
I've just tried much the same thing, upgrading to Mageia 1. 

Aside: with the nouveau driver, the system kernel panicked (will file a separate bug).

Then I followed the XFdrake process, and enabled the proprietary driver. The following packages were installed:

dkms-nvidia-current-275.09.07-0.1.mga1.nonfree
nvidia-current-doc-html-275.09.07-0.1.mga1.nonfree
x11-driver-video-nvidia-current-275.09.07-0.1.mga1.nonfree

and the installation seemed to work ok, including the dkms compilation.

HOWEVER, X wouldn't start. On further investigation, it turned out that
 modprobe nvidia-current refused, saying that the module was invalid (sorry, I've lost the detailed error message). The file:
  /lib/modules/2.6.38.7-desktop-1.mga/dkms/drivers/char/drm/nvidia-current.ko.gz
was present, but wouldn't load.

Finally, I had to resort to installing the driver from Nvidia's homepage, by downloading NVIDIA-Linux-x86_64-275.09.07.run and running it. This works.

[My system has a 7600 GT card on a 64-bit machine]

CC: (none) => mageia

Maurice Batey 2011-07-04 21:39:05 CEST

CC: (none) => maurice

Comment 7 Matthieu Nguyen 2011-08-17 17:06:32 CEST
96.43.20 beta packages are available with support for xorg 1.10. I tested the manual install.

So maybe it would be possible to just update the RPMs in the mageia repository to bring them to 96.43.20 level, and the problem would be solved :)

Hope this helps.
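As a rough sketch of the version relationship reported in this thread (96.43.18 fails with xorg 1.10, 96.43.20 works), the check below flags the broken combination. The thresholds come only from these comments, `driver_too_old` is a hypothetical helper, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# version_lt A B: succeeds when version A sorts strictly before B.
# Relies on `sort -V` (GNU coreutils version sort).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# driver_too_old XORG_VERSION DRIVER_VERSION (hypothetical helper):
# succeeds when the X server is 1.10 or newer and the 96xx driver
# predates 96.43.20, the combination reported broken in this bug.
driver_too_old() {
    ! version_lt "$1" 1.10 && version_lt "$2" 96.43.20
}
```

For example, `driver_too_old 1.10.1 96.43.18` succeeds, while `driver_too_old 1.10.1 96.43.20` does not.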
Florian Hubold 2011-09-15 08:41:01 CEST

CC: (none) => doktor5000
Assignee: bugsquad => anssi.hannula

Comment 8 Florian Hubold 2011-09-15 08:46:37 CEST
The legacy drivers are now available in version 100.14.11.
Comment 9 Florian Hubold 2011-09-16 00:50:00 CEST
Sorry, scratch my last comments, latest _stable_ legacy drivers are now at 96.43.20 as #c7 said. Will try to update those to fix upgrade problems.
Comment 10 Jay Bowles 2011-09-26 23:11:42 CEST
Is this bug being actively looked at? I only ask as, having to reinstall Mageia, I have encountered this problem also. Strangely I had Nvidia's driver working on my previous install except for some irritating screen tearing whilst playing kpatience (embarrassingly my main reason for reinstalling!).

I notice with some interest that Mandriva seem to be having a similar issue.

Regards Jay

CC: (none) => king.of.random

Comment 11 Matthieu Nguyen 2011-09-27 00:40:12 CEST
(In reply to comment #10)
> Is this bug being actively looked at? I only ask as, having to reinstall
> Mageia, I have encountered this problem also. Strangely I had Nvidia's driver
> working on my previous install except for some irritating screen tearing whilst
> playing kpatience (embarrassingly my main reason for reinstalling!).
> 
> I notice with some interest that Mandriva seem to be having a similar issue.
> 
> Regards Jay

Comment #9 seems to indicate so :). However, I just checked the "official" mirrors (not the cooker or testing), and it is not there yet.

What I did on my desktop suffering from the problem was uninstall the nvidia RPM, get the 96.43.20 driver directly from NVIDIA, and install it manually.

Personally, I'll uninstall and replace with the RPM once it is available.
Comment 12 Florian Hubold 2011-09-27 06:36:00 CEST
Sorry, only fixed for cauldron so far. And yes this is being actively looked at.
Will try to push this to updates_testing for Mageia 1 in the next few days.
Comment 13 Jay Bowles 2011-09-27 23:31:56 CEST
Excellent news. I look forward to receiving the fix.

Thanks for your good work.

Regards Jay.
Comment 14 Anssi Hannula 2011-10-28 16:03:32 CEST
Sorry, I thought this had already been processed.

Pushed nvidia-96xx 96.43.20 to nonfree/updates_testing of mga1.

Advisory:
=======================
Mageia 1 release contained a non-working nvidia-96xx driver package that was not used by default.

This update updates the driver to a newer version which works correctly, preventing issues for those upgrading from a previous Mandriva release where this driver variant was used, and for those using this driver manually.
=======================

Test cases:
- Install dkms-nvidia96xx. With the old release, it doesn't build, with the new one, it does.
- Install x11-driver-video-nvidia96xx and set the system to use it by following README.manual-setup instructions. With the old release, X.org server fails to start, with the new one it does.

Assignee: anssi.hannula => qa-bugs

Comment 15 Florian Hubold 2011-10-28 16:32:12 CEST
Sorry, I also forgot to ask for this to be pushed.

Well, now that this has been pushed, won't we also need to backport the changes in ldetect-lst and push ldetect-lst as well? Thierry said we'd first need the updated drivers, and then ldetect-lst can be pushed.

@Anssi: Shall I merge the update for nvidia173 in updates/1 branch?
Comment 16 claire robinson 2011-10-28 16:38:21 CEST
I'm not able to test this. I think I have an old GeForce card somewhere, but I no longer have an AGP slot to fit it in.

$ rpm -qa | grep nvidia
dkms-nvidia-current-275.09.07-0.1.mga1.nonfree
nvidia-cuda-toolkit-3.2.16-1.mga1
x11-driver-video-nvidia-current-275.09.07-0.1.mga1.nonfree
nvidia-current-doc-html-275.09.07-0.1.mga1.nonfree
nvidia-current-cuda-opencl-275.09.07-0.1.mga1.nonfree

Jay, are you able to test this, please? If so, please let us know whether you are using x86_64 or i586. Thanks!
Comment 17 claire robinson 2011-10-28 16:40:19 CEST
Florian, ldetect-lst-0.1.291-9.1.mga1.src.rpm is already in testing.
Comment 18 Florian Hubold 2011-10-28 16:52:40 CEST
@claire: Yes, I know, but it does not contain the needed fix to re-enable the 96xx driver for the older GeForce 2 MX to GeForce 4 cards.
Comment 19 Anssi Hannula 2011-10-28 16:55:43 CEST
(In reply to comment #15)
> @Anssi: Shall i merge the update for nvidia173 in updates/1 branch?

Yes, for the KDE issue.

BTW, for testers' information, nvidia-96xx supports cards roughly from GF2 MX to GF 7900.
Comment 20 Anssi Hannula 2011-10-28 17:02:50 CEST
(In reply to comment #18)
> @claire: Yes i know but it does not contain the needed fix to reenable the 96xx
> driver for the older GeForce 2 MX to GeForce 4 cards.

Unfortunately that would cause harddrake to change driver automatically for existing installations, which is not something I'd like an update to do... Unless other people experienced in this area (tv, blino?) think otherwise.
Comment 21 claire robinson 2011-10-30 20:12:17 CET
Matthieu, do you still have the hardware to test this, please?
Comment 22 Matthieu Nguyen 2011-10-30 21:39:33 CET
(In reply to comment #21)
> Matthieu, do you still have the hardware to test this, please?

Sure, I'd be glad to help. You only need me to install the package, and confirm that X starts with the nvidia driver (after manual xorg.conf edit, of course), right?
Comment 23 José Jorge 2011-10-31 07:55:07 CET
(In reply to comment #22)
> Sure, I'd be glad to help. You only need me to install the package, and confirm
> that X starts with the nvidia driver (after manual xorg.conf edit, of course),
> right?

In fact, a better test would avoid manual file edits and use MCC instead:
- activate the updates_testing repos, apply all updates, and ensure the nouveau driver is still used after a reboot
- install nvidia-96xx, which should require dkms-nvidia-96xx
- reboot, ensuring again that the nouveau driver is still used
- configure in MCC and say yes to the proprietary driver
- after the reboot, ensure nouveau is no longer used and that there is no libGL problem (launch a 3D app)
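One way to check the "nouveau is still used / no longer used" steps above is to look at the loaded kernel modules. This is only a sketch: `kernel_module_loaded` is a hypothetical helper, and the module name to test for the proprietary driver is an assumption that may differ on a given system.

```shell
#!/bin/sh
# kernel_module_loaded NAME [MODULES_FILE] (hypothetical helper):
# succeeds if NAME is the first field of a line in /proc/modules.
# MODULES_FILE defaults to /proc/modules; override it for testing.
kernel_module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}
```

For instance, `kernel_module_loaded nouveau && echo "nouveau is loaded"` could be run after each reboot in the procedure.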

CC: (none) => lists.jjorge

Comment 24 Florian Hubold 2011-10-31 14:53:45 CET
Are the testing instructions in https://bugs.mageia.org/show_bug.cgi?id=1106#c14 (see "Test cases" at the bottom) not sufficient?
Comment 25 Matthieu Nguyen 2011-10-31 20:50:36 CET
First thing: the nouveau driver does not work with my GeForce 4 Ti 4200; X does not start. I have to either:
- use nv, or
- use the proprietary driver installed manually on the box from the NVIDIA website.

So I can test, but not the "nouveau" piece.

Also, I don't see the 96.43.20 package, even though my package sources are up to date and nonfree/updates_testing is enabled (I removed and re-added it from the MCC to make sure I got the latest hdlists).
Comment 26 claire robinson 2011-11-03 15:32:00 CET
Matthieu, when you enabled updates_testing, did you enable it as an update medium or only to be able to install from it?

Try with: drakrpm-edit-media --expert

That will allow you to tick the Updates column.


Remember to disable it when you have finished.
Comment 27 Anssi Hannula 2011-11-03 15:48:10 CET
Actually, there was a buildsystem issue and the updated package wasn't available until a few days ago when I resubmitted it (I forgot to mention it here, sorry).
Comment 28 José Jorge 2011-11-03 17:29:33 CET
(In reply to comment #20)
> (In reply to comment #18)
> > @claire: Yes i know but it does not contain the needed fix to reenable the 96xx
> > driver for the older GeForce 2 MX to GeForce 4 cards.
> 
> Unfortunately that would cause harddrake to change driver automatically for
> existing installations, which is not something I'd like an update to do...
> Unless other people experienced in this area (tv, blino?) think otherwise.

May we have a fix to allow selecting the proprietary driver in XFdrake without automatically defaulting to it? Manually setting the driver is really not an elegant solution...
Comment 29 Anssi Hannula 2011-11-03 17:57:49 CET
(In reply to comment #28)

It is not doable without hacks.

@Thierry, do you think a hack in harddrake to prevent autoreconfiguration in this specific case would be reasonable, or can you think of a better solution to this?
Comment 30 Florian Hubold 2011-11-11 21:02:35 CET
Matthieu, have you been able to test this yet?
Comment 31 José Jorge 2011-11-12 21:28:47 CET
I have a Geforce 4MX, so I could test, manually configuring:

-I had to reboot twice to get a nokmsboot kernel parameter inserted automatically in grub by drakxtools.

-I had to add the 'nopat' kernel parameter to fix this bug :
X:2198 conflicting memory types e4000000-e4570000 uncached-minus<->write-combining
reserve_memtype failed 0xe4000000-0xe4570000, track write-combining, req write-combining

(see https://bugzilla.redhat.com/show_bug.cgi?id=484682 for more about that)

Then it works: supertuxkart runs at the same FPS as with the older nouveau DRI driver, but it does not freeze.

So I think this update can be pushed for the brave hearts ;-)
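To confirm that the kernel parameters mentioned in this comment (nokmsboot, nopat) are actually in effect after a reboot, the running kernel's command line can be inspected. A minimal sketch; `has_kernel_param` is a hypothetical helper name.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE] (hypothetical helper):
# succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline; override it for testing.
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example: `has_kernel_param nopat || echo "nopat is not set"`.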
Comment 32 Matthieu Nguyen 2011-11-13 07:55:01 CET
Sorry about the delay. I was away from the desktop for a week, and couldn't validate the bugfix.

1) The RPM shows up in the MCC when I look for it, with the right version \o/
2) Still no luck. X not starting.

Here's what I did:

1) update all from the MCC
2) remove orphans
3) switch the driver back to "nv"
4) manually uninstall the NVIDIA driver (installed from the binary as a workaround for now)
5) reboot
6) install dkms-nvidia96xx with dependencies
7) manually edit xorg.conf to use "nvidia" instead of "nv"

The RPMs are seen as installed. dkms status lists the module as properly registered for this kernel.

However, X fails to start and /var/log/Xorg.0.log reports: "Failed to load module "nvidia" (module does not exist, 0)"

Let me know if you need more logs, and where I can find them.
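The failure line quoted above can be pulled out of the X log directly. A small sketch, assuming the log uses the `(EE) Failed to load module "name"` wording shown in this comment; `x_failed_modules` is a hypothetical helper.

```shell
#!/bin/sh
# x_failed_modules LOGFILE (hypothetical helper): print the name of every
# module mentioned on an '(EE) Failed to load module "name"' line.
x_failed_modules() {
    sed -n 's/.*(EE) Failed to load module "\([^"]*\)".*/\1/p' "$1" | sort -u
}
```

Running `x_failed_modules /var/log/Xorg.0.log` on the log described here would be expected to list `nvidia`.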
Comment 33 Matthieu Nguyen 2011-11-13 12:02:33 CET
OK, one more piece of information: I think I have figured out where the current issue lies.

1) Driver installs and is working.
2) modified libglx and Xorg driver are *not* installed/linked properly.

RPM installs the libglx.so and nvidia_drv.so under /usr/lib/nvidia96xx/xorg

Xorg expects the nvidia libglx.so to be under /usr/lib/xorg/modules/extensions and the nvidia_drv.so to be under /usr/lib/xorg/modules/drivers/

I did not check the post-install scripts, but I think they should copy /usr/lib/nvidia96xx/xorg/nvidia_drv.so under /usr/lib/xorg/modules/drivers, and /usr/lib/nvidia96xx/xorg/libglx.so.96.43.20 + /usr/lib/nvidia96xx/xorg/libglx.so (symlink to the 96.43.20) to /usr/lib/xorg/modules/extensions

By manually doing so, I got X to start with the nvidia driver :)
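A sketch for checking whether the two files named above are present where Xorg looks for them. `check_nvidia_x_modules` is a hypothetical helper and the paths are the i586 ones quoted in this comment. Note that a libglx.so in the extensions directory may belong to the stock X server rather than to NVIDIA, so "present" alone does not prove the right file is installed; comment 34 points to the README.manual-setup instructions as the intended way to set this up.

```shell
#!/bin/sh
# check_nvidia_x_modules [ROOT] (hypothetical helper): report whether the
# files named in the comment exist under the Xorg module directories.
# ROOT defaults to the real filesystem root; set it to test on a copy.
check_nvidia_x_modules() {
    root=${1:-}
    status=0
    for f in /usr/lib/xorg/modules/drivers/nvidia_drv.so \
             /usr/lib/xorg/modules/extensions/libglx.so; do
        if [ -e "$root$f" ]; then
            echo "present: $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    return $status
}
```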
Comment 34 José Jorge 2011-11-13 16:02:08 CET
(In reply to comment #33)
> 2) modified libglx and Xorg driver are *not* installed/linked properly.
> 

I think you missed comment 14, which talks about "following README.manual-setup instructions": you have to run some commands for this to happen. I followed them, and it works; see comment 31.
Comment 35 Matthieu Nguyen 2011-11-13 16:50:48 CET
(In reply to comment #34)
> (In reply to comment #33)
> > 2) modified libglx and Xorg driver are *not* installed/linked properly.
> > 
> 
> I think you missed comment 14 which talks about "following
> README.manual-setup instructions" : you have to launch some commands for this
> to happen. I followed them, and it works, see comment 31.

Yes, it looks like I missed them... :-S

So everything looks fine now :)
Comment 36 Dave Hodgins 2011-11-21 01:57:41 CET
The srpm for the Nonfree Release version is
nvidia-96xx-96.43.18-2.mga1.src.rpm

The srpm for the Nonfree Updates Testing version is
nvidia-96xx-96.43.20-1.1.mga1.nonfree.src.rpm

Will the difference in names cause a problem when pushing the update?
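To make the naming difference concrete, the two file names above can be split into name, version and release. This is only an illustration using shell parameter expansion; `srpm_nvr` is a hypothetical helper and assumes the usual NAME-VERSION-RELEASE.src.rpm layout.

```shell
#!/bin/sh
# srpm_nvr FILE (hypothetical helper): print "name version release" for a
# source RPM file name of the form NAME-VERSION-RELEASE.src.rpm.
srpm_nvr() {
    base=${1%.src.rpm}       # drop the .src.rpm suffix
    release=${base##*-}      # text after the last hyphen
    rest=${base%-*}
    version=${rest##*-}      # text after the second-to-last hyphen
    name=${rest%-*}
    echo "$name $version $release"
}
```

Applied to the two files, the name is `nvidia-96xx` in both cases; only the version and the release tag (`2.mga1` versus `1.1.mga1.nonfree`) differ.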

CC: (none) => davidwhodgins

Comment 37 Florian Hubold 2011-12-01 02:17:25 CET
Shouldn't this be assigned to sysadmins now that it is validated?
Comment 38 Manuel Hiebel 2011-12-01 02:31:43 CET
where do you see it's validated https://bugs.mageia.org/show_bug.cgi?id=1106#keywords ? :)
Comment 39 Dave Hodgins 2011-12-01 02:35:42 CET
I'd like an answer to comment 36 first: will renaming
the src rpm from src.rpm to nonfree.src.rpm cause a problem
that the sysadmin team will have to handle manually when the
update is pushed?
Comment 40 claire robinson 2011-12-08 16:39:17 CET
Adding sysadmin to CC. Could you please respond to Dave's query?

Thank you

CC: (none) => sysadmin-bugs

Comment 41 Nicolas Vigier 2011-12-08 16:45:58 CET
(In reply to comment #39)
> I'd like an answer to comment 36 first: will renaming
> the src rpm from src.rpm to nonfree.src.rpm cause a problem
> that the sysadmin team will have to handle manually when the
> update is pushed?

It shouldn't be a problem.

CC: (none) => boklm

Comment 42 Dave Hodgins 2011-12-11 08:21:48 CET
Despite my request for testers posted to the mageia-discuss mailing
list, there has been no additional feedback, so either people are
not interested, or very few people have this hardware.

As comment 35 indicates it's working, I'm inclined to validate the update,
but will wait a day for responses to this message before doing so.
Comment 43 Dave Hodgins 2011-12-21 01:24:10 CET
Validating the update.

Could someone from the sysadmin team push the srpm
nvidia-96xx-96.43.20-1.1.mga1.nonfree.src.rpm
from Nonfree Updates Testing to Nonfree Updates?

Advisory:

Mageia 1 release contained a non-working nvidia-96xx driver package that was
not used by default.

This update updates the driver to a newer version which works correctly,
preventing issues for those upgrading from a previous Mandriva release where
this driver variant was used, and for those using this driver manually.

Keywords: (none) => validated_update

Comment 44 Thomas Backlund 2011-12-21 19:18:55 CET
Update pushed.

Status: REOPENED => RESOLVED
CC: (none) => tmb
Resolution: (none) => FIXED

Nicolas Vigier 2014-05-08 18:06:37 CEST

CC: boklm => (none)

