| Summary: | Nvidia drivers x11-driver-video-nvidia-current-455.55-1.mga8.nonfree, bad udev rules at boot | ||
|---|---|---|---|
| Product: | Mageia | Reporter: | Aurelien Oudelet <ouaurelien> |
| Component: | RPM Packages | Assignee: | Kernel and Drivers maintainers <kernel> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | Normal | CC: | ghibomgx |
| Version: | Cauldron | ||
| Target Milestone: | Mageia 8 | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Source RPM: | x11-driver-video-nvidia-current-455.55-1.mga8.nonfree | CVE: | |
| Status comment: | |||
| Attachments: | packages installed or updated since january 4th 2021. | ||
Which version of systemd are you using? 246.9-2, or the retired 247-1 that showed up for a while in updates_testing?

@Giuseppe: systemd 246.9-2

Can you track it down in the journal to some older systemd upgrade?

Furthermore, does the same occur with nvidia 460.27.04 in nonfree/updates_testing? 460.27.04 was a beta driver, which is why it has not yet been pushed to nonfree/release, but it's pretty close to the upcoming 460.xx. Updates_testing has just been cleaned, but you can still find it here:
http://ftp.belnet.be/mirror/mageia/distrib/cauldron/x86_64/media/nonfree/updates_testing/

(In reply to Giuseppe Ghibò from comment #3)
> Can you track down in journal to some older systemd upgrade?
>
> Furthermore does the same occur with nvidia 460.27.04 in
> nonfree/updates_testing? 460.27.04 was a beta driver, that's why has not yet
> been pushed to nonfree/release, but it's pretty close to the next upcoming
> 460.xx. Updates_testing has been cleaned right now, but here you can still
> find it:
>
> http://ftp.belnet.be/mirror/mageia/distrib/cauldron/x86_64/media/nonfree/updates_testing/

Yeah, you're right:

rpm -qa --last | grep systemd
systemd-246.9-2.mga8.x86_64 lun. 04 janv. 2021 10:18:17
lib64systemd0-246.9-2.mga8.x86_64 lun. 04 janv. 2021 10:18:10

So this is related to it.

Udev errors appear right after applying this systemd update.

[RPM][9742]: erase dkms-nvidia-current-455.45.01-1.mga8.nonfree.x86_64: success
janv. 07 13:03:57 mageia.local [RPM][9742]: erase nvidia-current-doc-html-455.45.01-1.mga8.nonfree.x86_64: success
janv. 07 13:03:57 mageia.local [RPM][9742]: erase nvidia-current-utils-455.45.01-1.mga8.nonfree.x86_64: success
janv. 07 13:03:57 mageia.local [RPM][9742]: install x11-driver-video-nvidia-current-460.27.04-1.mga8.nonfree.x86_64: success
janv. 07 13:03:58 mageia.local systemd[1]: Started /usr/bin/systemctl start man-db-cache-update.
janv. 07 13:03:58 mageia.local systemd[1]: Starting man-db-cache-update.service...
janv. 07 13:03:58 mageia.local [RPM][9742]: install nvidia-current-utils-460.27.04-1.mga8.nonfree.x86_64: success
janv. 07 13:03:58 mageia.local systemd[1]: Started /usr/bin/systemctl start man-db-cache-update.
janv. 07 13:03:58 mageia.local [RPM][9742]: install dkms-nvidia-current-460.27.04-1.mga8.nonfree.x86_64: success
janv. 07 13:03:58 mageia.local [RPM][9742]: install nvidia-current-doc-html-460.27.04-1.mga8.nonfree.x86_64: success
janv. 07 13:03:58 mageia.local [RPM][9742]: install nvidia-current-cuda-opencl-460.27.04-1.mga8.nonfree.x86_64: success
janv. 07 13:03:58 mageia.local [RPM][9742]: install x11-driver-video-nvidia-current-460.27.04-1.mga8.nonfree.x86_64: success
janv. 07 13:03:58 mageia.local [RPM][9742]: Transaction ID 5ff6f810 finished: 0

Rebooted. Still a udev error line with 460.27.04:
$ journalctl -b | grep nvidia
janv. 07 13:05:54 mageia.local kernel: Command line: BOOT_IMAGE=/vmlinuz-5.10.5-desktop-1.mga8 root=UUID=7c97e985-1ddb-4058-a85d-611fdaa4e144 ro nouveau.modeset=0 nvidia.modeset=1 noiswmd resume=UUID=235f257b-6f19-4537-af65-4655fb448221 audit=0
janv. 07 13:05:54 mageia.local kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
janv. 07 13:05:55 mageia.local kernel: nvidia: loading out-of-tree module taints kernel.
janv. 07 13:05:55 mageia.local kernel: nvidia: module license 'NVIDIA' taints kernel.
janv. 07 13:05:55 mageia.local kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 242
janv. 07 13:05:55 mageia.local kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
janv. 07 13:05:55 mageia.local systemd-udevd[562]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/test -c /dev/nvidia${i} || /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done'' failed with exit code 1.
janv. 07 13:05:56 mageia.local kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 460.27.04 Fri Dec 11 23:24:19 UTC 2020
janv. 07 13:05:57 mageia.local kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
janv. 07 13:05:57 mageia.local kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
janv. 07 13:05:57 mageia.local kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
janv. 07 13:05:57 mageia.local kernel: nvidia-uvm: Loaded the UVM driver, major device number 239.
janv. 07 13:06:06 mageia.local dkms-autorebuild.sh[819]: nvidia-current (460.27.04-1.mga8.nonfree): Already installed on this kernel.
So what part of the for loop:
'/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/test -c /dev/nvidia${i} || /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done'' failed with exit code 1.
is failing?
Someone with Nvidia hardware needs to debug this.
I've gotten rid of all my Nvidia hardware, as I don't want to rely on the proprietary drivers anymore now that there is a good AMD alternative with open source support available even before hardware launch day :)
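For anyone with Nvidia hardware who wants to narrow this down, here is a minimal sketch that runs the same pipeline as the udev rule one step at a time. The function names `nvidia_minors` and `check_nodes` are hypothetical helpers, not part of the package, and the sketch only reports what the rule would do instead of calling `mknod`. As far as bash semantics go, a `for` loop over an empty list exits with status 0, so an exit code of 1 would have to come from the last `test ... || mknod ...` pair, meaning `test -c` found no device node and `mknod` then failed.

```shell
#!/bin/bash
# Hypothetical helpers for debugging the udev rule by hand.

# Same extraction as the rule: take the lines containing "Minor" and
# print the 4th space-delimited field (the device minor number).
nvidia_minors() {
    cat "$@" | grep Minor | cut -d ' ' -f 4
}

# For each minor, report whether the character device exists under
# the given directory. Returns 1 if any node is missing, 0 otherwise.
# No device node is created here; the mknod command is only printed.
check_nodes() {
    local dev_dir=$1
    shift
    local rc=0 i
    for i in $(nvidia_minors "$@"); do
        if test -c "${dev_dir}/nvidia${i}"; then
            echo "ok: ${dev_dir}/nvidia${i} exists"
        else
            echo "missing: ${dev_dir}/nvidia${i} (rule would run: mknod -Z -m 666 ${dev_dir}/nvidia${i} c 195 ${i})"
            rc=1
        fi
    done
    return $rc
}
```

On an affected machine this could be run as `check_nodes /dev /proc/driver/nvidia/gpus/*/information`. It only shows the state after boot, so it cannot reproduce a timing problem during module loading, but it does tell whether the parsing step or the device-node step is the one to look at.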
What changed between systemd-246-1.mga8 and systemd-246-2.mga8? Did it introduce restrictions on the use of certain characters, e.g. $ ' \ ; ?

(In reply to Aurelien Oudelet from comment #4)
>
> Yeah you're right:
> rpm -qa --last | grep systemd
> systemd-246.9-2.mga8.x86_64 lun. 04 janv. 2021 10:18:17
> lib64systemd0-246.9-2.mga8.x86_64 lun. 04 janv. 2021 10:18:10
>
> So this is related to him.
>
> Udev errors appear right after applying this systemd update.

I think this is a red herring. If you mean it showed up in the logs directly after installing this, then that's because systemd respawns its daemons on update, so udev would re-trigger the rules...

The only change between -1.mga8 and -2.mga8 is the temporary removal of the log warning "SysV service '%s' lacks a native systemd unit file.", to avoid needless bug reports, as we won't fix all of them before Mageia 8 is released.

(In reply to Aurelien Oudelet from comment #0)
> Created attachment 12190 [details]
> packages installed or updated since january 4th 2021.
>
> Bad udev warning since Kernel 5.10.4-4.mga8 (also with kernel 5.10.5-1.mga8)
>

Can you confirm that it started with 5.10.4-4, and that booting 5.10.4-3 silences the warning?

Will test tonight, after work.

And if 5.10.4-3 works, does 5.10.5-1.1 from:
https://tmb.nu/Mageia/Cauldron/bugs/28035/
work too?

Booting with 5.10.4-3.mga8: no udev warnings.

janv. 07 21:36:41 mageia.local kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-5.10.4-desktop-3.mga8 root=UUID=7c97e985-1ddb-4058-a85d-611fdaa4e144 ro nouveau.modeset=0 nvidia.modeset=1 splash quiet noiswmd resume=UUID=235f257b-6f19-4537-af65-4655fb448221 audit=0 vga=791
janv. 07 21:36:41 mageia.local kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
janv. 07 21:36:45 mageia.local dkms-autorebuild.sh[779]: nvidia-current (460.27.04-1.mga8.nonfree): Installing module.
janv. 07 21:36:45 mageia.local dkms-autorebuild.sh[779]: dkms build -m nvidia-current -v 460.27.04-1.mga8.nonfree -k 5.10.4-desktop-3.mga8 -a x86_64 -q --no-clean-kernel
janv. 07 21:37:43 mageia.local dkms-autorebuild.sh[779]: dkms install -m nvidia-current -v 460.27.04-1.mga8.nonfree -k 5.10.4-desktop-3.mga8 -a x86_64 -q
janv. 07 21:37:51 mageia.local kernel: nvidia: loading out-of-tree module taints kernel.
janv. 07 21:37:51 mageia.local kernel: nvidia: module license 'NVIDIA' taints kernel.
janv. 07 21:37:51 mageia.local kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 242
janv. 07 21:37:51 mageia.local kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
janv. 07 21:37:52 mageia.local kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 460.27.04 Fri Dec 11 23:24:19 UTC 2020
janv. 07 21:37:53 mageia.local kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
janv. 07 21:37:53 mageia.local kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
janv. 07 21:37:54 mageia.local kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
janv. 07 21:37:54 mageia.local kernel: nvidia-uvm: Loaded the UVM driver, major device number 239.
janv. 07 21:43:29 mageia.local org.freedesktop.FileManager1[10490]: nvidia-current (460.27.04-1.mga8.nonfree): Installing module.

Booting with 5.10.5-1.1.mga8 from Comment 12:

janv. 07 21:45:41 mageia.local kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-5.10.5-desktop-1.1.mga8 root=UUID=7c97e985-1ddb-4058-a85d-611fdaa4e144 ro nouveau.modeset=0 nvidia.modeset=1 splash quiet noiswmd resume=UUID=235f257b-6f19-4537-af65-4655fb448221 audit=0 vga=791
janv. 07 21:45:42 mageia.local kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
janv. 07 21:45:43 mageia.local kernel: nvidia: loading out-of-tree module taints kernel.
janv. 07 21:45:43 mageia.local kernel: nvidia: module license 'NVIDIA' taints kernel.
janv. 07 21:45:43 mageia.local kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 242
janv. 07 21:45:43 mageia.local kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
janv. 07 21:45:43 mageia.local systemd-udevd[565]: nvidia: Process '/usr/bin/bash -c '/usr/bin/test -c /dev/nvidiactl || /usr/bin/mknod -Z -m 666 /dev/nvidiactl c 195 255'' failed with exit code 1.
janv. 07 21:45:43 mageia.local systemd-udevd[570]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/test -c /dev/nvidia${i} || /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done'' failed with exit code 1.
janv. 07 21:45:43 mageia.local kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 460.27.04 Fri Dec 11 23:24:19 UTC 2020
janv. 07 21:45:44 mageia.local kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
janv. 07 21:45:44 mageia.local kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
janv. 07 21:45:45 mageia.local kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
janv. 07 21:45:45 mageia.local kernel: nvidia-uvm: Loaded the UVM driver, major device number 239.
janv. 07 21:45:48 mageia.local dkms-autorebuild.sh[833]: nvidia-current (460.27.04-1.mga8.nonfree): Already installed on this kernel.

Also, re-running the commands from the udev warnings as root produces nothing in the system journal. This is strange.

Another test... make a backup of the initrd for 5.10.4-3.mga8, then recreate it and reboot... does it still work?

Booted on 5.10.4-3.mga8.

# cp /boot/initrd-5.10.4-desktop-3.mga8.img /boot/initrd-5.10.4-desktop-3.mga8.img.back
# dracut --force

initrd regenerated (timestamps checked with ll).
Rebooted with it.

$ journalctl -b | grep nvidia
janv. 07 22:32:36 mageia.local kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-5.10.4-desktop-3.mga8 root=UUID=7c97e985-1ddb-4058-a85d-611fdaa4e144 ro nouveau.modeset=0 nvidia.modeset=1 splash quiet noiswmd resume=UUID=235f257b-6f19-4537-af65-4655fb448221 audit=0 vga=791
janv. 07 22:32:36 mageia.local kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
janv. 07 22:32:37 mageia.local kernel: nvidia: loading out-of-tree module taints kernel.
janv. 07 22:32:37 mageia.local kernel: nvidia: module license 'NVIDIA' taints kernel.
janv. 07 22:32:37 mageia.local kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 242
janv. 07 22:32:37 mageia.local kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
janv. 07 22:32:38 mageia.local kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 455.45.01 Thu Nov 5 22:55:44 UTC 2020
janv. 07 22:32:39 mageia.local kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
janv. 07 22:32:39 mageia.local kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
janv. 07 22:32:40 mageia.local kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
janv. 07 22:32:40 mageia.local kernel: nvidia-uvm: Loaded the UVM driver, major device number 240.
janv. 07 22:32:44 mageia.local dkms-autorebuild.sh[827]: nvidia-current (455.45.01-1.mga8.nonfree): Already installed on this kernel.

No errors.

OK, so that confirms it's the kernel.

A new test: does 5.10.5-1.2 from:
https://tmb.nu/Mageia/Cauldron/bugs/28035/
work?

(In reply to Thomas Backlund from comment #17)
> ok,
> so that confirms it's the kernel.
>
> a new test, does 5.10.5-1.2 from:
> https://tmb.nu/Mageia/Cauldron/bugs/28035/
>
> work ?

This works! (Note that I also downgraded to the nvidia-current that is in the nonfree release repo.)

janv. 07 23:34:42 mageia.local kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-5.10.5-desktop-1.2.mga8 root=UUID=7c97e985-1ddb-4058-a85d-611fdaa4e144 ro nouveau.modeset=0 nvidia.modeset=1 splash quiet noiswmd resume=UUID=235f257b-6f19-4537-af65-4655fb448221 audit=0 vga=791
janv. 07 23:34:42 mageia.local kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
janv. 07 23:34:44 mageia.local kernel: nvidia: loading out-of-tree module taints kernel.
janv. 07 23:34:44 mageia.local kernel: nvidia: module license 'NVIDIA' taints kernel.
janv. 07 23:34:44 mageia.local kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 242
janv. 07 23:34:44 mageia.local kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
janv. 07 23:34:44 mageia.local kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 455.45.01 Thu Nov 5 22:55:44 UTC 2020
janv. 07 23:34:45 mageia.local kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
janv. 07 23:34:45 mageia.local kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
janv. 07 23:34:46 mageia.local kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
janv. 07 23:34:46 mageia.local kernel: nvidia-uvm: Loaded the UVM driver, major device number 240.
janv. 07 23:34:50 mageia.local dkms-autorebuild.sh[818]: nvidia-current (455.45.01-1.mga8.nonfree): Already installed on this kernel.

Should I uninstall 5.10.5-1.2 from https://tmb.nu/Mageia/Cauldron/bugs/28035/ right afterwards, so you can push an update to the BS?

Note to Giuseppe, from https://www.nvidia.com/en-us/drivers/unix/:

Linux x86_64/AMD64/EM64T
Latest Production Branch Version: 460.32.03
Latest New Feature Branch Version: 455.45.01

I really wonder what they mean by "Production Branch Version". New LTS?

Yes, it's the new LTS. I updated it in SVN 5 minutes ago. I was writing the freeze push request documentation...

Well, there is also a new 390.141...

(In reply to Aurelien Oudelet from comment #18)
> (In reply to Thomas Backlund from comment #17)
> > ok,
> > so that confirms it's the kernel.
> >
> > a new test, does 5.10.5-1.2 from:
> > https://tmb.nu/Mageia/Cauldron/bugs/28035/
> >
> > work ?
>
> This works ! (Note that I also downgraded to the nvidia-current that is in
> nonfree release repo).

Great.

So a "fix" for a kernel behaviour in udev module loading timings that dates back to the kernel 2.6 series, requested in a systemd/udev bug report, is what gives us grief :)

> Should I uninstall 5.10.5-1.2 from:
> https://tmb.nu/Mageia/Cauldron/bugs/28035/ right after so you can push an
> update to BS?

No need, you can keep running that for now.

The next kernel that will land in Cauldron is 5.10.6-1.

Kernel-5.10.6-1.mga8 + nvidia-current-460.32.03-1.mga8.nonfree
Fixed.

Resolution: (none) => FIXED
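The kernel-by-kernel comparison in this thread comes down to checking each boot's journal for systemd-udevd failures from the nvidia rules. A minimal sketch of that check follows; the function name `udev_nvidia_failures` is a hypothetical helper, and the pattern is modelled only on the failure lines quoted in this report.

```shell
#!/bin/bash
# Hypothetical helper: read journal text on stdin and print only the
# systemd-udevd lines where a command run for the nvidia rules failed.
# Like grep, it returns non-zero when no such line is found.
udev_nvidia_failures() {
    grep -E 'systemd-udevd\[[0-9]+\]: nvidia: Process .* failed with exit code [0-9]+'
}
```

For example, `journalctl -b | udev_nvidia_failures` checks the current boot and `journalctl -b -1 | udev_nvidia_failures` the previous one, which makes it easier to compare a boot on 5.10.4-3 with one on 5.10.5-1.1 without reading the whole `grep nvidia` output.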
Created attachment 12190 [details]
packages installed or updated since January 4th 2021.

Bad udev warning since kernel 5.10.4-4.mga8 (also with kernel 5.10.5-1.mga8):

systemd-udevd[554]: nvidia: Process '/usr/bin/bash -c '/usr/bin/test -c /dev/nvidiactl || /usr/bin/mknod -Z -m 666 /dev/nvidiactl c 195 255'' failed with exit code 1.
janv. 05 18:01:09 mageia.local systemd-udevd[539]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/test -c /dev/nvidia${i} || /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done'' failed with exit code 1.

Nothing has changed with the nvidia driver since beta2.
Adding rpm -qa --last output since January 4th 2021.

Meanwhile, the nvidia drivers seem to work fine with Plasma. Compositor and 3D effects run well.

Assigning to Kernel and Drivers team.