Bug 27435 - Kernel Trap each boot since 5.9 version with nvidia_{drm,uvm}-455.45.01-1.mga8
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal, Severity: major
Target Milestone: Mageia 8
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-16 15:46 CEST by Aurelien Oudelet
Modified: 2020-12-27 14:52 CET

See Also:
Source RPM: nvidia-current-455.45.01-1.mga8.nonfree.src.rpm
CVE:
Status comment:


Attachments
nvidia_drm kernel trap (6.21 KB, text/plain)
2020-10-16 15:46 CEST, Aurelien Oudelet

Description Aurelien Oudelet 2020-10-16 15:46:13 CEST
Created attachment 11944
nvidia_drm kernel trap

Mageia Cauldron
Since kernel 5.9.0 landed in Cauldron, every boot produces one kernel trap:

--- related begin system boot ---
oct. 16 14:33:36 kernel: Linux version 5.9.0-desktop-2.mga8 (iurt@ecosse.mageia.org) (gcc (Mageia 10.2.0-2.mga8) 10.2.0, GNU ld (GNU Binutils) 2.34) #1 SMP Thu Oct 15 12:25:06 UTC 2020
oct. 16 14:33:36 kernel: Command line: BOOT_IMAGE=/vmlinuz-5.9.0-desktop-2.mga8 root=UUID=e52f7a49-10ac-47b4-a6ab-50eb7388003b ro nvidia-drm.modeset=1 nouveau.modeset=0 splash quiet noiswmd audit=0
oct. 16 14:33:37 kernel: nvidia: loading out-of-tree module taints kernel.
oct. 16 14:33:37 kernel: nvidia: module license 'NVIDIA' taints kernel.
oct. 16 14:33:37 kernel: Disabling lock debugging due to kernel taint
oct. 16 14:33:38 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  455.28  Wed Sep 30 01:05:16 UTC 2020
oct. 16 14:33:38 kernel: ucsi_ccg: probe of 0-0008 failed with error -110

oct. 16 14:34:09 kernel: ------------[ cut here ]------------
oct. 16 14:34:09 kernel: WARNING: CPU: 1 PID: 3207 at /var/lib/dkms/nvidia-current/455.28-4.mga8.nonfree/build/nvidia-drm/nvidia-drm-drv.c:534 nv_drm_master_set+0x22/0x30 [nvidia_drm]
oct. 16 14:34:09 kernel: Modules linked in: rfcomm ip6t_REJECT nf_reject_ipv6 xt_comment ip6table_mangle ip6table_nat ip6table_raw nf_log_ipv6 ip6table_filter ip6_tables xt_recent ipt_REJECT nf_reject_ipv4 xt_multiport xt_conntrack xt_hashlimit xt_addrtype xt_mark iptable_mangle iptable_nat xt_CT xt_tcpudp iptable_raw nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_nat nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter af_packet uinput vboxnetadp(O) vboxnetflt(O) cmac algif_hash algif_skcipher af_alg vboxdrv(O) binfmt_misc bnep msr xfs nls_utf8 nls_cp437 vfat fat dm_mirror dm_region_hash dm_log dm_mod nvidia_uvm(O) nvidia_drm(PO) drm_kms_helper
oct. 16 14:34:09 kernel:  cec nvidia_modeset(PO) nvidia(PO) input_leds joydev btusb btbcm btrtl btintel bluetooth hid_generic ecdh_generic ecc usbhid hid iwlmvm mac80211 libarc4 intel_rapl_msr intel_rapl_common iwlwifi x86_pkg_temp_thermal intel_powerclamp ucsi_ccg coretemp kvm_intel typec_ucsi kvm mei_hdcp ee1004 iTCO_wdt iTCO_vendor_support typec snd_hda_codec_hdmi snd_hda_codec_realtek irqbypass crc32_pclmul snd_hda_codec_generic ledtrig_audio ghash_clmulni_intel snd_hda_intel aesni_intel cfg80211 crypto_simd snd_intel_dspcfg cryptd glue_helper snd_hda_codec rapl intel_cstate snd_hda_core snd_hwdep e1000e intel_uncore psmouse snd_pcm intel_wmi_thunderbolt i2c_i801 ptp mxm_wmi pps_core snd_timer i2c_smbus snd i2c_nvidia_gpu rfkill mei_me soundcore mei fan thermal acpi_pad button evdev nvram sch_fq_codel drm efivarfs ip_tables x_tables ipv6 crc_ccitt autofs4 crc32c_intel xhci_pci xhci_hcd serio_raw usbcore usb_common wmi video
oct. 16 14:34:09 kernel: CPU: 1 PID: 3207 Comm: gst-plugin-scan Tainted: P           O      5.9.0-desktop-2.mga8 #1
oct. 16 14:34:09 kernel: Hardware name: Gigabyte Technology Co., Ltd. Z170X-Ultra Gaming/Z170X-Ultra Gaming-CF, BIOS F23j 03/09/2018
oct. 16 14:34:09 kernel: RIP: 0010:nv_drm_master_set+0x22/0x30 [nvidia_drm]
oct. 16 14:34:09 kernel: Code: 94 c5 fc ca 0f 1f 40 00 0f 1f 44 00 00 48 8b 47 48 48 8b 78 20 48 8b 05 9c 5c 00 00 48 8b 40 28 e8 d3 97 1f cb 84 c0 74 01 c3 <0f> 0b c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d fc
oct. 16 14:34:09 kernel: RSP: 0018:ffffac844365bbe0 EFLAGS: 00010246
oct. 16 14:34:09 kernel: RAX: 0000000000000000 RBX: ffff9cab78f6ec00 RCX: 0000000000000008
oct. 16 14:34:09 kernel: RDX: ffffffffc4578e98 RSI: 0000000000000296 RDI: 0000000000000296
oct. 16 14:34:09 kernel: RBP: ffff9cab77639f00 R08: 0000000000000008 R09: ffffac844365bbc8
oct. 16 14:34:09 kernel: R10: ffff9cab77639f00 R11: 0000000000000006 R12: ffff9cabbdc6d800
oct. 16 14:34:09 kernel: R13: 0000000000000000 R14: 0000000000000003 R15: ffff9cabbdc6d800
oct. 16 14:34:09 kernel: FS:  00007f85c0e83740(0000) GS:ffff9cabcec80000(0000) knlGS:0000000000000000
oct. 16 14:34:09 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
oct. 16 14:34:09 kernel: CR2: 00007f85c0068000 CR3: 000000043763e004 CR4: 00000000003706e0
oct. 16 14:34:09 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
oct. 16 14:34:09 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
oct. 16 14:34:09 kernel: Call Trace:
oct. 16 14:34:09 kernel:  drm_new_set_master+0x7a/0x100 [drm]
oct. 16 14:34:09 kernel:  drm_master_open+0x68/0x90 [drm]
oct. 16 14:34:09 kernel:  drm_open+0xf8/0x270 [drm]
oct. 16 14:34:09 kernel:  drm_stub_open+0xab/0x130 [drm]
oct. 16 14:34:09 kernel:  chrdev_open+0xbe/0x230
oct. 16 14:34:09 kernel:  ? cdev_device_add+0x90/0x90
oct. 16 14:34:09 kernel:  do_dentry_open+0x14b/0x360
oct. 16 14:34:09 kernel:  path_openat+0xc2c/0x10d0
oct. 16 14:34:09 kernel:  do_filp_open+0x88/0x130
oct. 16 14:34:09 kernel:  ? getname_flags.part.0+0x29/0x1a0
oct. 16 14:34:09 kernel:  ? __check_object_size+0x136/0x147
oct. 16 14:34:09 kernel:  do_sys_openat2+0x97/0x150
oct. 16 14:34:09 kernel:  __x64_sys_openat+0x54/0x90
oct. 16 14:34:09 kernel:  do_syscall_64+0x33/0x80
oct. 16 14:34:09 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
oct. 16 14:34:09 kernel: RIP: 0033:0x7f85c13bf6d7
oct. 16 14:34:09 kernel: Code: 25 00 00 41 00 3d 00 00 41 00 74 37 64 8b 04 25 18 00 00 00 85 c0 75 5b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 85 00 00 00 48 83 c4 68 5d 41 5c c3 0f 1f
oct. 16 14:34:09 kernel: RSP: 002b:00007ffdcfff5910 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
oct. 16 14:34:09 kernel: RAX: ffffffffffffffda RBX: 0000000000f17120 RCX: 00007f85c13bf6d7
oct. 16 14:34:09 kernel: RDX: 0000000000080002 RSI: 0000000000f1c970 RDI: 00000000ffffff9c
oct. 16 14:34:09 kernel: RBP: 0000000000f1c970 R08: 00007f85c00474a0 R09: 0000000000f1cba8
oct. 16 14:34:09 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080002
oct. 16 14:34:09 kernel: R13: 00007f85c0442f10 R14: 0000000000ef1400 R15: 00007f85c0420dd6
oct. 16 14:34:09 kernel: ---[ end trace 3b77f90462a884c5 ]---

The GUI is still usable.
3D acceleration works under Plasma.
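
For anyone triaging a similar report, the key facts in the trace above (warning site, module, and call chain) can be pulled out of a journal dump with a short script. This is only a sketch: the regular expressions are written against the line layout shown in this report, and the names WARNING_RE, FRAME_RE and parse_warning are made up for the example, not part of any tool.

```python
import re

# Header line of a kernel WARNING, as formatted in the log pasted in this report.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<source>\S+):(?P<line>\d+) "
    r"(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)

# One call-trace frame, e.g. " drm_open+0xf8/0x270 [drm]".
# A leading "? " marks an address the unwinder could not confirm.
FRAME_RE = re.compile(
    r"kernel:\s+(?P<uncertain>\? )?(?P<symbol>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?$"
)


def parse_warning(lines):
    """Return (header_fields, frames) for the first WARNING found, else None.

    frames is a list of (symbol, module_or_None, reliable) tuples.
    """
    header = None
    frames = []
    in_trace = False
    for raw in lines:
        line = raw.rstrip()
        if header is None:
            match = WARNING_RE.search(line)
            if match:
                header = match.groupdict()
            continue
        if not in_trace:
            in_trace = line.endswith("Call Trace:")
            continue
        match = FRAME_RE.search(line)
        if not match:
            break  # first non-frame line ends the kernel-side trace
        frames.append(
            (match.group("symbol"), match.group("module"),
             match.group("uncertain") is None)
        )
    return (header, frames) if header else None


# A few lines copied from the trace in this report.
SAMPLE = """\
oct. 16 14:34:09 kernel: WARNING: CPU: 1 PID: 3207 at /var/lib/dkms/nvidia-current/455.28-4.mga8.nonfree/build/nvidia-drm/nvidia-drm-drv.c:534 nv_drm_master_set+0x22/0x30 [nvidia_drm]
oct. 16 14:34:09 kernel: Call Trace:
oct. 16 14:34:09 kernel:  drm_new_set_master+0x7a/0x100 [drm]
oct. 16 14:34:09 kernel:  drm_master_open+0x68/0x90 [drm]
oct. 16 14:34:09 kernel:  chrdev_open+0xbe/0x230
oct. 16 14:34:09 kernel:  ? cdev_device_add+0x90/0x90
oct. 16 14:34:09 kernel: RIP: 0033:0x7f85c13bf6d7
""".splitlines()

header, frames = parse_warning(SAMPLE)
print(header["symbol"], header["module"], header["source"], header["line"])
for symbol, module, reliable in frames:
    print(symbol, module or "(core kernel)", "" if reliable else "(unverified)")
```

On a live system the same function could be fed the output of the kernel journal instead of SAMPLE; other kernels or log formats may need adjusted patterns.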
Comment 1 Aurelien Oudelet 2020-10-16 15:49:27 CEST
VT switching is OK.

Assigning to Kernel and Drivers maintainers.
CC'd the recent committer.
Comment 2 Aurelien Oudelet 2020-10-19 17:56:02 CEST
The nvidia_uvm module is also blocked, by the new GPL shim ("GPL condom") check in kernel 5.9.

Upstream says:

Due to an incompatibility issue, we advise customers to defer updating to Linux Kernel 5.9+ until mid-November when an NVIDIA Linux GPU driver update with Kernel 5.9+ support is expected to be available.

Linux Kernel 5.9+ is incompatible with current and previous NVIDIA Linux GPU drivers. We advise customers to defer updating to Linux Kernel 5.9+ until mid-November when an NVIDIA Linux GPU driver update with Kernel 5.9+ support is expected to be available. NVIDIA is aware of the impact this will have on customers, and we are working diligently to provide the driver update with Kernel 5.9+ support as soon as possible.

Customers must use our upcoming driver update on Kernel 5.9+ to have a fully functioning driver.

https://forums.developer.nvidia.com/t/nvidia-driver-not-yet-supported-for-linux-kernel-5-9/157263


Leaving this open until upstream releases an updated nonfree driver.

Keywords: (none) => UPSTREAM

Comment 3 Aurelien Oudelet 2020-10-19 18:01:35 CEST
The upstream Linux kernel commit is:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0fd9cc6b0c72245375520ffc8d97ce5857b63b94

author     Linus Torvalds <torvalds@linux-foundation.org>  2020-08-14 11:07:02 -0700
committer  Linus Torvalds <torvalds@linux-foundation.org>  2020-08-14 11:07:02 -0700
commit     0fd9cc6b0c72245375520ffc8d97ce5857b63b94
tree       97bcbc980dc81f87056e8ab6bdbb37af72a341ec
parent     32b2ee5cea4d281f4f3f5a34d6363d1841422040
parent     262e6ae7081df304fc625cf368d5c2cbba2bb991

Merge tag 'modules-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:
 "The most important change would be Christoph Hellwig's patch
  implementing proprietary taint inheritance, in an effort to discourage
  the creation of GPL "shim" modules that interface between GPL symbols
  and proprietary symbols.

  Summary:

   - Have modules that use symbols from proprietary modules inherit the
     TAINT_PROPRIETARY_MODULE taint, in an effort to prevent GPL shim
     modules that are used to circumvent _GPL exports. These are modules
     that claim to be GPL licensed while also using symbols from
     proprietary modules. Such modules will be rejected while non-GPL
     modules will inherit the proprietary taint.

   - Module export space cleanup. Unexport symbols that are unused
     outside of module.c or otherwise used in only built-in code"

* tag 'modules-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  modules: inherit TAINT_PROPRIETARY_MODULE
  modules: return licensing information from find_symbol
  modules: rename the licence field in struct symsearch to license
  modules: unexport __module_address
  modules: unexport __module_text_address
  modules: mark each_symbol_section static
  modules: mark find_symbol static
  modules: mark ref_module static

Reverting this commit would be a GPL licence violation and should be avoided.
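
To make the rule quoted in the merge message above concrete, here is a small Python model of it. It is not kernel code and simplifies heavily: the Module class, the resolve_symbol function, and the module names vendor_core, vendor_shim and vendor_helper are all hypothetical, and the real check in the kernel's module loader differs in detail.

```python
from dataclasses import dataclass, field


class ModuleRejected(Exception):
    """Raised when the modelled loader refuses a module."""


@dataclass
class Module:
    name: str
    gpl_compatible: bool             # what the module's declared licence claims
    proprietary_taint: bool = False  # models TAINT_PROPRIETARY_MODULE
    exports: set = field(default_factory=set)


def resolve_symbol(user: Module, owner: Module, symbol: str) -> None:
    """Model of one symbol lookup under the 5.9 taint-inheritance rule."""
    if symbol not in owner.exports:
        raise KeyError(f"{owner.name} does not export {symbol}")
    if owner.proprietary_taint:
        if user.gpl_compatible:
            # A module claiming a GPL-compatible licence while using a
            # proprietary module's symbol is treated as a shim and rejected.
            raise ModuleRejected(
                f"{user.name}: GPL-claiming module uses {symbol} "
                f"from proprietary module {owner.name}"
            )
        # A non-GPL module simply inherits the proprietary taint.
        user.proprietary_taint = True


# Hypothetical modules for illustration only.
vendor_core = Module("vendor_core", gpl_compatible=False,
                     proprietary_taint=True, exports={"vendor_op"})
vendor_shim = Module("vendor_shim", gpl_compatible=True)
vendor_helper = Module("vendor_helper", gpl_compatible=False)

resolve_symbol(vendor_helper, vendor_core, "vendor_op")
print(vendor_helper.name, "tainted:", vendor_helper.proprietary_taint)

try:
    resolve_symbol(vendor_shim, vendor_core, "vendor_op")
except ModuleRejected as exc:
    print("rejected:", exc)
```

In this model the shim case corresponds to the kind of module the merge message says "will be rejected", which is consistent with a companion module failing to load on 5.9 until the vendor ships an updated driver.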

Status: NEW => ASSIGNED

Comment 4 Aurelien Oudelet 2020-12-14 15:01:06 CET
nvidia-current-455.45.01-1.mga8.nonfree.src.rpm

It appears that there is no longer any kernel trap on a clean install of Mageia 8 Beta 2.

Leaving this as is for two weeks; I will close it before RC.

Status: ASSIGNED => NEW

Aurelien Oudelet 2020-12-14 15:02:58 CET

Source RPM: nvidia-current-455.28-4.mga8.nonfree.src.rpm => nvidia-current-455.45.01-1.mga8.nonfree.src.rpm
Keywords: UPSTREAM => (none)
Summary: Kernel Trap each boot since 5.9.0-x.mga8 with nvidia_drm-455.28-4.mga8 => Kernel Trap each boot since 5.9 version with nvidia_{drm,uvm}-455.45.01-1.mga8

Comment 5 Aurelien Oudelet 2020-12-27 14:52:16 CET
In fact, nvidia-current-455.45.01-1.mga8.nonfree.src.rpm really does fix this.

Closing.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

