Bug 20130 - On clean installation, nvidia driver will not load (Older version of kernel devel files needed than match the installed kernel)
Status: RESOLVED DUPLICATE of bug 17400
Alias: None
Product: Mageia
Classification: Unclassified
Component: Release (media or process)
Version: Cauldron
Hardware: x86_64 Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
Reported: 2017-01-16 02:17 CET by Doug Laidlaw
Modified: 2017-01-17 14:10 CET

Attachments
report.bug.xz copied from /root/drakx. (318.32 KB, application/x-xz)
2017-01-16 02:17 CET, Doug Laidlaw
Journal of first boot. (34.68 KB, application/x-xz)
2017-01-16 02:19 CET, Doug Laidlaw

Description Doug Laidlaw 2017-01-16 02:17:38 CET
Created attachment 8859
report.bug.xz copied from /root/drakx.

[Lodged at Marja's request]
[May be a duplicate of Bug 18084]

As at Jan 1, I had Mageia 5.1 as my working system.  I decided to install Cauldron to a spare partition.  I downloaded the current "netinstall" ISO and burned that to a Flash drive with IsoDumper.  I selected Xfce and MATE as my desktops, but NOT KDE.  The mirror sources were added by the installer, and by its nature, the installation was fully up-to-date.

The installation was straightforward.  I accepted the suggestion of the nVidia driver.

On first reboot, the normal Cauldron splash screen appeared, but X did not start.  I was dropped to the "Good Luck" prompt.  "nokmsboot" was in the kernel line in Cauldron.  Xorg.0.log reported that the nVidia kernel could not be found.  No separate RPM for the nVidia kernel seemed to exist.  There was no ~/.xinitrc, and I reasoned that maybe the installer was looking for KDE, and not finding it, so I created one, pointing to Xfce.

I then reinstalled the Grub2 bootloader for Official, and worked from there. The stanzas for Cauldron had NO options for the kernel, just the kernel-version. As a result, "nokmsboot" was not passed to the Cauldron bootloader.  Editing the Grub2 line showed that nokmsboot was missing, and there was no VGA= option.  I added nokmsboot, and got a graphical login.  As updated, Grub2's line for Cauldron is correct.
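For readers who want to confirm whether an option such as nokmsboot actually reached the kernel, the booted command line can be inspected. The following is only a minimal sketch: it assumes the standard Linux /proc/cmdline file, and the helper name has_kernel_option is made up for illustration.

```python
from pathlib import Path


def has_kernel_option(cmdline: str, option: str) -> bool:
    """Return True if `option` is present on a kernel command line.

    Matches a bare flag ("nokmsboot") exactly, or a key=value pair
    by its key ("vga" matches "vga=791").
    """
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if token == option or key == option:
            return True
    return False


def booted_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path(path).read_text()
```

On the installed system, has_kernel_option(booted_cmdline(), "nokmsboot") would then report whether the option was passed by the bootloader entry that was used.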

Since proprietary drivers had been mentioned as the culprits, yesterday I changed the video driver to nouveau as a test.  I was back at the "Good Luck" prompt.  I restored the proprietary driver RPM, and had my desktop back.

- Architecture [also part of the ISO name below]:
  - x86_64
- What graphics hardware?
  - GeForce GTX-750 1024 MB
- Which ISO are you testing (title & date)?
  - Installed from Netinstall "Sun Jan  1 08:31:19 2017"  with Xfce and MATE.  Deselected KDE.
- BIOS boot or UEFI?
  - UEFI
- At what stage do you see the message?
    - Re-booting the installed system:
    - With non-free [video] software
    - Without installation update
- Kernel version(s)
  - 4.9.0-desktop-3.mga6
- In the case of nvidia does the kmod build succeed?
  - What's that?
- What graphics driver?
  - Mageia default RPM for NVIDIA GeForce 420 series and later
- If you added a kernel parameter to succeed: what?
  - nokmsboot
- Which display manager?
  - lightdm
  * Please try others (if you have >1, from a console: # drakdm)
  * If you have only 1, install another if you are able to (have network, repos enabled) # urpmi GDM|LightDM|LXDM|SDDM
  - A bit late now, after everything is working.  As an experiment, I switched to nouveau driver.  That brought back the "Good Luck" prompt!  Restoring the NVIDIA driver brought back my desktop.  Blacklisting nouveau not required.
  * Re-boot
- The presence and contents of /etc/sddm.conf
  - KDE not installed.
- The presence and contents of /etc/sysconfig/desktop
  - "DISPLAYMANAGER=lightdm"
- /etc/X11/xorg.conf: does removing it change the situation?
  - Yes, killed graphics.
- Whether Autologin is active or not.
  - No.
Comment 1 Doug Laidlaw 2017-01-16 02:19:08 CET
Created attachment 8860
Journal of first boot.
Comment 2 Marja Van Waes 2017-01-16 12:08:10 CET
(In reply to Doug Laidlaw from comment #0)
> Created attachment 8859
> report.bug.xz copied from /root/drakx.
> 
> [Lodged at Marja's request]

Indeed :-)

> [May be a duplicate of Bug 18084]
> 

I don't know whether this is relevant at all (from report.bug.xz):

Error! Your kernel devel files for kernel 4.9.0-desktop-3.mga6 cannot be found at
/lib/modules/4.9.0-desktop-3.mga6/build or /lib/modules/4.9.0-desktop-3.mga6/source.
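The error above names two directories under /lib/modules. A rough sketch of checking for them for a given kernel release follows; the function name kernel_devel_dirs is hypothetical, and this is not a reproduction of how dkms itself performs the check.

```python
from pathlib import Path
from typing import List


def kernel_devel_dirs(release: str, modules_root: str = "/lib/modules") -> List[Path]:
    """Return the build/source directories that exist for a kernel release.

    Path.exists() follows symlinks, so a dangling "build" symlink
    (devel package not installed) counts as missing.
    """
    base = Path(modules_root) / release
    return [p for p in (base / "build", base / "source") if p.exists()]
```

An empty result for a release such as 4.9.0-desktop-3.mga6 corresponds to the situation the error message describes.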



Anyway, from the first time booting into the new install:

Jan 11 20:54:20 cauldron.douglaidlaw.net dkms-autorebuild.sh[808]: nvidia-current (375.26-1.mga6.nonfree): Installing module.
Jan 11 20:54:20 cauldron.douglaidlaw.net dkms-autorebuild.sh[808]: dkms build -m nvidia-current -v 375.26-1.mga6.nonfree -k 4.9.2-desktop-1.mga6 -a x86_64 -q --no-clean-kernel

 but then:

Jan 11 20:54:57 cauldron.douglaidlaw.net service_harddrake[8449]: switch X.org driver from 'nv.+' to 'nouveau' (The proprietary kernel driver was not found for X.org driver 'nvidia')





> 
> Since proprietary drivers had been mentioned as the culprits, yesterday I
> changed the video driver to nouveau as a test. 

With or without "nokmsboot" in the kernel options in grub2?

> I was back at the "Good
> Luck" prompt.  I restored the proprietary driver RPM, and had my desktop
> back.
>

CC: (none) => marja11
Assignee: bugsquad => kernel

Comment 3 Marja Van Waes 2017-01-16 12:11:23 CET
Sorry, forget my comment, I was too much in a hurry to let what I read in the logs sink in :-(((
Comment 4 Doug Laidlaw 2017-01-16 14:30:56 CET
O.K.: But the installer usually asks for and installs the kernel-devel files, along with gcc and the usual files needed for compiling from source.

The entry about switching to nouveau:  that is normal IF the required files are missing.

"nokmsboot" was always in Cauldron's grub2.  At that time, it was missing in the string sent from Official, but it is there now.  Sounds like a botched installation, somehow.  A previous netinstall was a complete mess.  Need to test from the sta2 DVD.
Comment 5 Marja Van Waes 2017-01-16 21:49:03 CET
(In reply to Marja van Waes from comment #2)

> 
> 
> Anyway, from the first time booting into the new install:
> 
> Jan 11 20:54:20 cauldron.douglaidlaw.net dkms-autorebuild.sh[808]:
> nvidia-current (375.26-1.mga6.nonfree): Installing module.
> Jan 11 20:54:20 cauldron.douglaidlaw.net dkms-autorebuild.sh[808]: dkms
> build -m nvidia-current -v 375.26-1.mga6.nonfree -k 4.9.2-desktop-1.mga6 -a
> x86_64 -q --no-clean-kernel
> 
>  but then:
> 
> Jan 11 20:54:57 cauldron.douglaidlaw.net service_harddrake[8449]: switch
> X.org driver from 'nv.+' to 'nouveau' (The proprietary kernel driver was not
> found for X.org driver 'nvidia')
> 
> 

This is OK after all. I had assumed, rather than checked, that the above lines came before:

Jan 11 20:57:29 cauldron.douglaidlaw.net systemd[1]: prefdm.service: Main process exited, code=exited, status=1/FAILURE
Jan 11 20:57:29 cauldron.douglaidlaw.net systemd[1]: prefdm.service: Unit entered failed state.

But they were.


(In reply to Doug Laidlaw from comment #4)
> O.K.: But the installer usually asks for and installs the kernel-devel
> files, along with gcc and the usual files needed for compiling from source.

Yeah, but according to report.bug.xz, a newer kernel-desktop-devel version was installed than the version of the kernel devel files that couldn't be found:

> installing kernel-desktop-devel-latest-4.9.2-1.mga6.x86_64.rpm 

> Error! Your kernel devel files for kernel 4.9.0-desktop-3.mga6 cannot be
> found at
> /lib/modules/4.9.0-desktop-3.mga6/build or
> /lib/modules/4.9.0-desktop-3.mga6/source.

The lspci output at the beginning of report.bug.xz shows things like

  hub             : Linux 4.9.0-desktop-3.mga6

So 4.9.0-3 seems to be stage2's kernel version.

I don't understand why kernel devel files for that version are needed, instead of for v. 4.9.2-1
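The mismatch can be made explicit by pulling the kernel release out of both log lines quoted in this report. This is only an illustrative sketch: the two regular expressions are written against the exact line shapes shown here, and the function names are made up.

```python
import re
from typing import Optional

# Matches the "-k <kernel>" argument of a logged "dkms build" command.
DKMS_BUILD_RE = re.compile(r"dkms build\b.*?\s-k\s+(\S+)")
# Matches the kernel release named in the "devel files cannot be found" error.
DEVEL_ERROR_RE = re.compile(r"kernel devel files for kernel (\S+) cannot be found")


def dkms_target_kernel(line: str) -> Optional[str]:
    """Kernel release a logged dkms build was targeting, or None if no match."""
    m = DKMS_BUILD_RE.search(line)
    return m.group(1) if m else None


def missing_devel_kernel(line: str) -> Optional[str]:
    """Kernel release whose devel files were reported missing, or None."""
    m = DEVEL_ERROR_RE.search(line)
    return m.group(1) if m else None
```

Applied to the lines above, the first function yields 4.9.2-desktop-1.mga6 for the dkms build at first boot, while the second yields 4.9.0-desktop-3.mga6 for the error in report.bug.xz.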

CC: sysadmin-bugs => isobuild
Summary: On clean installation, nvidia driver will not load => On clean installation, nvidia driver will not load (Older version of kernel devel files needed than match the installed kernel)

Comment 6 Thomas Backlund 2017-01-16 22:12:33 CET
(In reply to Marja van Waes from comment #5)

> > Error! Your kernel devel files for kernel 4.9.0-desktop-3.mga6 cannot be
> > found at
> > /lib/modules/4.9.0-desktop-3.mga6/build or
> > /lib/modules/4.9.0-desktop-3.mga6/source.
> 
> The lspci output at the beginning of report.bug.xz shows things like
> 
>   hub             : Linux 4.9.0-desktop-3.mga6
> 
> So 4.9.0-3 seems to be stage2's kernel version.
> 

Not stage2, it's drakx-installer-images that carries the kernel

> I don't understand why kernel devel files for that version are needed,
> instead of for v. 4.9.2-1

Some mixed / missing rpms on CI

CC: (none) => tmb

Comment 7 Doug Laidlaw 2017-01-17 04:46:21 CET
Does that mean that it would happen only on a netinstall?
Comment 8 Marja Van Waes 2017-01-17 10:19:52 CET
(In reply to Thomas Backlund from comment #6)

> 
> Some mixed / missing rpms on CI

(In reply to Doug Laidlaw from comment #7)
> Does that mean that it would happen only on a netinstall?

With a Classical ISO it can happen, as Thomas said, when there are mixed or missing rpms on it. When they're mixed, it means dependencies can't be resolved because a needed package has the wrong version.

With a netinstall it can happen if your mirror has a problem, or if new kernel packages just were pushed and your mirror has only part of them. 

Much of what I'm about to say is already known to you, but it is for users who know less and will read this report.

On January 10, kernel 4.9.2-1 was pushed; it's in the list here:

https://ml.mageia.org/l/arc/changelog/2017-01/thrd14.html (page back after confirming you're not a spammer)

You can see that drakx-installer-images-2.41-2.mga6 is near the bottom of that list. That's the version that was rebuilt with the new 4.9.2-1 kernel, and was pushed to the mirrors 42 minutes later.
drakx-installer-images-2.41-2.mga6.nonfree was pushed another 19 minutes later. kmod-* came in between.

And even RPMs stemming from the same SRPM can land on a mirror at different times :-/
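To illustrate what a half-synced mirror looks like, here is a small sketch that checks whether a set of RPM filenames carries one and the same version-release for a group of related packages. It relies only on the usual name-version-release.arch.rpm filename convention; the function names are made up, and this is not how urpmi or the installer resolves dependencies.

```python
from typing import Dict, Iterable, Set, Tuple


def split_rpm_filename(filename: str) -> Tuple[str, str, str, str]:
    """Split 'name-version-release.arch.rpm' into (name, version, release, arch)."""
    if not filename.endswith(".rpm"):
        raise ValueError(f"not an RPM filename: {filename}")
    stem = filename[: -len(".rpm")]
    stem, arch = stem.rsplit(".", 1)
    name, version, release = stem.rsplit("-", 2)
    return name, version, release, arch


def versions_by_package(filenames: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each package name to the set of version-release strings seen."""
    seen: Dict[str, Set[str]] = {}
    for filename in filenames:
        name, version, release, _arch = split_rpm_filename(filename)
        seen.setdefault(name, set()).add(f"{version}-{release}")
    return seen


def is_consistent(filenames: Iterable[str], names: Iterable[str]) -> bool:
    """True if every package in `names` is present and all share one version-release."""
    seen = versions_by_package(filenames)
    found: Set[str] = set()
    for name in names:
        if name not in seen:
            return False
        found |= seen[name]
    return len(found) == 1
```

For example, a listing that has kernel-desktop-devel-latest-4.9.2-1.mga6.x86_64.rpm next to a kernel package still at 4.9.0-3.mga6 (a made-up combination for illustration) would be reported as inconsistent.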

Anyway, no kernels have been pushed in the last 1½ days, so it should be safe to try again :-)

http://pkgsubmit.mageia.org/ will show the last packages (actually the SRPM names are given) that were built (and thus pushed to the mirrors)
http://mirrors.mageia.org/status will give an indication about the status of your mirror.
Comment 9 Doug Laidlaw 2017-01-17 11:18:12 CET
OK. I will try again.  The Mirror Map off mirrors.mageia.org shows that Australia and Indonesia are outside the network.  Most of the mirrors are a few hours behind, but mirror.aarnet.edu.au hasn't been updated since the 9th, and is listed as Broken.  That isn't the first time it has been so far behind.  There are a few that are up-to-date and fast.
Comment 10 Doug Laidlaw 2017-01-17 13:25:23 CET
Just completed a clean reinstall.  Booted straight into Xfce desktop with no problems.
Comment 11 Marja Van Waes 2017-01-17 13:58:18 CET
(In reply to Doug Laidlaw from comment #10)
> Just completed a clean reinstall.  Booted straight into Xfce desktop with
> no problems.

Thanks for checking :)


IIUC, mirrorbrain (see bug 17400) should help prevent problems with outdated mirrors or mirrors in the middle of syncing interdependent packages.

Other than that, there's nothing we can do.

This report can be closed, but I'm in doubt about the correct resolution. If mirrorbrain does indeed prevent this, then closing as a duplicate of bug 17400 seems best.
Comment 12 Doug Laidlaw 2017-01-17 14:08:04 CET
Wilco.  Don't assume that I know as much as you do.  I am a lawyer by training, came to computers in about 1994, with XT laptops.  I have learned entirely by doing.
Comment 13 Doug Laidlaw 2017-01-17 14:10:16 CET
No further action required here. Closed as per Comment 11.

*** This bug has been marked as a duplicate of bug 17400 ***

Status: NEW => RESOLVED
Resolution: (none) => DUPLICATE

