Bug 23587 - Shutdown hang - nouveau strangeness
Summary: Shutdown hang - nouveau strangeness
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal normal
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-16 21:12 CEST by Frank Griffin
Modified: 2019-02-19 16:36 CET

See Also:
Source RPM:
CVE:
Status comment:


Attachments

Description Frank Griffin 2018-09-16 21:12:31 CEST
For a month or so now I have been seeing a shutdown hang that can be broken with Alt-SysRq-E.  When the magic key is used, I get the following kernel stack trace:

Sep 15 17:34:15 ftglap kernel: ------------[ cut here ]------------
Sep 15 17:34:15 ftglap kernel: nouveau 0000:01:00.0: timeout
Sep 15 17:34:15 ftglap kernel: WARNING: CPU: 0 PID: 6211 at drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c:86 nvkm_pmu_reset+0x14c/0x160 [nouveau]
Sep 15 17:34:15 ftglap kernel: Modules linked in: cmac rfcomm rpcsec_gss_krb5 nfsv4 nfs fscache ip_vs nf_conntrack af_packet vboxnetadp(O) vboxnetflt(O) vboxdrv(O) bnep dm_mirror dm_region_hash dm_log dm_mod joydev arc4 hid_multitouch hid_generic spi_pxa2xx_platform 8250_dw snd_soc_skl uvcvideo snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp iTCO_wdt videobuf2_vmalloc snd_hda_ext_core snd_soc_acpi videobuf2_memops videobuf2_v4l2 iTCO_vendor_support videobuf2_common snd_soc_core videodev intel_rapl snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp iwlmvm coretemp snd_hda_codec_generic btusb media kvm_intel btrtl snd_compress btbcm ac97_bus btintel mac80211 kvm snd_hda_intel bluetooth snd_hda_codec irqbypass crc32_pclmul ecdh_generic crc32c_intel ghash_clmulni_intel snd_hda_core pcbc iwlwifi snd_hwdep snd_pcm
Sep 15 17:34:15 ftglap kernel:  snd_timer aesni_intel aes_x86_64 crypto_simd cfg80211 cryptd glue_helper snd intel_cstate intel_uncore ipmi_devintf idma64 intel_rapl_perf virt_dma mei_me soundcore input_leds asus_nb_wmi mei asus_wmi tpm_crb i2c_i801 sparse_keymap processor_thermal_device intel_lpss_pci tpm_tis rfkill intel_pch_thermal wmi_bmof intel_lpss intel_soc_dts_iosf thermal tpm_tis_core battery tpm int3400_thermal int3403_thermal acpi_thermal_rel ac int340x_thermal_zone rng_core asus_wireless evdev acpi_pad nfsd vboxguest cuse fuse auth_rpcgss nfs_acl lockd grace nvram sunrpc sch_fq_codel ip_tables x_tables ipv6 crc_ccitt autofs4 ipmi_msghandler nouveau xhci_pci xhci_hcd usbcore ttm serio_raw usb_common i915 i2c_hid hid mxm_wmi i2c_algo_bit drm_kms_helper wmi video button drm
Sep 15 17:34:15 ftglap kernel: CPU: 0 PID: 6211 Comm: net_applet Tainted: P        W  O      4.18.8-desktop-1.mga7 #1
Sep 15 17:34:15 ftglap kernel: Hardware name: ASUSTeK COMPUTER INC. X510UNR/X510UNR, BIOS X510UNR.302 11/14/2017
Sep 15 17:34:15 ftglap kernel: RIP: 0010:nvkm_pmu_reset+0x14c/0x160 [nouveau]
Sep 15 17:34:15 ftglap kernel: Code: 5c 41 5d 41 5e c3 48 8b 7d 10 48 8b 5f 50 48 85 db 74 1e e8 16 0b 21 e0 48 89 da 48 c7 c7 d5 7a 55 c0 48 89 c6 e8 ce 31 c7 df <0f> 0b e9 42 ff ff ff 48 8b 5f 10 eb dc 48 8b 5f 10 eb a5 90 0f 1f
Sep 15 17:34:15 ftglap kernel: RSP: 0018:ffffa35a827dfb78 EFLAGS: 00010286
Sep 15 17:34:15 ftglap kernel: RAX: 0000000000000000 RBX: ffff9047a4b02380 RCX: 0000000000000006
Sep 15 17:34:15 ftglap kernel: RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff9047aec16590
Sep 15 17:34:15 ftglap kernel: RBP: ffff9047a16e2400 R08: 0000000000000030 R09: 0000000000000004
Sep 15 17:34:15 ftglap kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff90479ecabde0
Sep 15 17:34:15 ftglap kernel: R13: ffff9047a1363cc0 R14: 0000000847f8c180 R15: ffff9047a4b8b0a0
Sep 15 17:34:15 ftglap kernel: FS:  00007f9faa5ba740(0000) GS:ffff9047aec00000(0000) knlGS:0000000000000000
Sep 15 17:34:15 ftglap kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 17:34:15 ftglap kernel: CR2: 00007fb0d38c4000 CR3: 0000000209dde005 CR4: 00000000003606f0
Sep 15 17:34:15 ftglap kernel: Call Trace:
Sep 15 17:34:15 ftglap kernel:  nvkm_pmu_init+0x16/0x40 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nvkm_subdev_init+0xb2/0x200 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nvkm_device_init+0x123/0x280 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nvkm_udevice_init+0x41/0x60 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nvkm_object_init+0x3e/0x100 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nvkm_object_init+0x71/0x100 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nvkm_object_init+0x71/0x100 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nouveau_do_resume+0x28/0x150 [nouveau]
Sep 15 17:34:15 ftglap kernel:  nouveau_pmops_runtime_resume+0x88/0x150 [nouveau]
Sep 15 17:34:15 ftglap kernel:  pci_pm_runtime_resume+0x78/0xb0
Sep 15 17:34:15 ftglap kernel:  ? pci_restore_standard_config+0x40/0x40

So here's the strangeness: there are two video chips on this laptop.  One is an Intel 810 or later, and the other is an NVIDIA GeForce.  I have dkms-nvidia installed.  When I run XFdrake, it selects the Intel chip and ignores the NVIDIA one.  I get no prompt asking whether I want to use a proprietary driver for the NVIDIA chip.

So why is nouveau (which I think is an open-source NVIDIA driver) running at all?  If the system is using the Intel chip, neither nvidia nor nouveau should be running.
Marja Van Waes 2018-09-18 15:23:21 CEST

Assignee: bugsquad => kernel
CC: (none) => marja11

Comment 1 Frank Griffin 2018-12-04 19:04:08 CET
Still happening in current cauldron.
Comment 2 Frank Griffin 2019-02-19 16:35:13 CET
Minor changes, but still happening.  The hang is different, and occurs after the stacktrace at "Starting power off".  The new trace is:

Feb 17 09:53:29 ftglap kernel: ------------[ cut here ]------------
Feb 17 09:53:29 ftglap kernel: nouveau 0000:01:00.0: timeout
Feb 17 09:53:29 ftglap kernel: WARNING: CPU: 1 PID: 11912 at drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c:86 nvkm_pmu_reset+0x14c/0x160 [nouveau]
Feb 17 09:53:29 ftglap kernel: Modules linked in: iptable_filter uas rpcsec_gss_krb5 nfsv4 nfs fscache cmac rfcomm ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 af_packet bnep binfmt_misc sr_mod dm_mirror dm_region_hash dm_log dm_mod joydev arc4 hid_multitouch hid_generic spi_pxa2xx_platform 8250_dw uvcvideo intel_rapl videobuf2_vmalloc videobuf2_memops x86_pkg_temp_thermal intel_powerclamp videobuf2_v4l2 btusb videobuf2_common coretemp kvm_intel btbcm btrtl snd_soc_skl iwlmvm btintel videodev snd_soc_hdac_hda bluetooth kvm snd_hda_ext_core snd_soc_skl_ipc mac80211 snd_soc_sst_ipc snd_soc_sst_dsp iTCO_wdt iTCO_vendor_support snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_hdmi snd_hda_codec_generic snd_soc_core irqbypass media usb_storage ecdh_generic crc32_pclmul crc32c_intel snd_compress ac97_bus ghash_clmulni_intel snd_hda_intel iwlwifi snd_hda_codec aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate snd_hda_core intel_uncore snd_hwdep intel_rapl_perf snd_pcm cfg80211 idma64
Feb 17 09:53:29 ftglap kernel:  asus_nb_wmi asus_wmi virt_dma input_leds sparse_keymap snd_timer wmi_bmof snd rfkill mei_me tpm_crb processor_thermal_device mei intel_lpss_pci soundcore tpm_tis intel_lpss intel_pch_thermal intel_soc_dts_iosf thermal i2c_i801 tpm_tis_core tpm ac int3403_thermal int3400_thermal battery rng_core acpi_thermal_rel int340x_thermal_zone asus_wireless acpi_pad evdev nfsd cuse fuse auth_rpcgss nfs_acl lockd grace nvram sunrpc sch_fq_codel ip_tables x_tables ipv6 crc_ccitt autofs4 nouveau xhci_pci xhci_hcd usbcore ttm serio_raw usb_common i915 i2c_hid mxm_wmi hid i2c_algo_bit drm_kms_helper video wmi button drm [last unloaded: vboxdrv]
Feb 17 09:53:29 ftglap kernel: CPU: 1 PID: 11912 Comm: plymouthd Tainted: G        W  O      4.20.9-desktop-1.mga7 #1
Feb 17 09:53:29 ftglap kernel: Hardware name: ASUSTeK COMPUTER INC. X510UNR/X510UNR, BIOS X510UNR.302 11/14/2017
Feb 17 09:53:29 ftglap kernel: RIP: 0010:nvkm_pmu_reset+0x14c/0x160 [nouveau]
Feb 17 09:53:29 ftglap kernel: Code: 5c 41 5d 41 5e c3 48 8b 7d 10 48 8b 5f 50 48 85 db 74 1e e8 06 5e f6 de 48 89 da 48 c7 c7 31 93 84 c0 48 89 c6 e8 ce 6f 98 de <0f> 0b e9 42 ff ff ff 48 8b 5f 10 eb dc 48 8b 5f 10 eb a5 90 0f 1f
Feb 17 09:53:29 ftglap kernel: RSP: 0018:ffffb07483edb908 EFLAGS: 00010286
Feb 17 09:53:29 ftglap kernel: RAX: 0000000000000000 RBX: ffff979824805a10 RCX: 0000000000000006
Feb 17 09:53:29 ftglap kernel: RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff979826a564c0
Feb 17 09:53:29 ftglap kernel: RBP: ffff9798249ac800 R08: 0000000000000030 R09: 0000000000000004
Feb 17 09:53:29 ftglap kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff97981f88ef60
Feb 17 09:53:29 ftglap kernel: R13: ffff979821ed8a80 R14: 0000424c21f67960 R15: ffff9798248340b0
Feb 17 09:53:29 ftglap kernel: FS:  00007f664ee4c740(0000) GS:ffff979826a40000(0000) knlGS:0000000000000000
Feb 17 09:53:29 ftglap kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 17 09:53:29 ftglap kernel: CR2: 000000000227cac8 CR3: 00000001e31ee006 CR4: 00000000003606e0
Feb 17 09:53:29 ftglap kernel: Call Trace:
Feb 17 09:53:29 ftglap kernel:  nvkm_pmu_init+0x16/0x40 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nvkm_subdev_init+0xb2/0x200 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nvkm_device_init+0x123/0x280 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nvkm_udevice_init+0x41/0x60 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nvkm_object_init+0x3e/0x100 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nvkm_object_init+0x71/0x100 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nvkm_object_init+0x71/0x100 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nouveau_do_resume+0x28/0x150 [nouveau]
Feb 17 09:53:29 ftglap kernel:  nouveau_pmops_runtime_resume+0x88/0x150 [nouveau]
Feb 17 09:53:29 ftglap kernel:  pci_pm_runtime_resume+0x74/0xd0
Feb 17 09:53:29 ftglap kernel:  ? pci_restore_standard_config+0x40/0x40
Feb 17 09:53:29 ftglap kernel:  __rpm_callback+0xca/0x1d0
Feb 17 09:53:29 ftglap kernel:  ? pci_restore_standard_config+0x40/0x40
Feb 17 09:53:29 ftglap kernel:  rpm_callback+0x1f/0x70
Feb 17 09:53:29 ftglap kernel:  ? pci_restore_standard_config+0x40/0x40
Feb 17 09:53:29 ftglap kernel:  rpm_resume+0x5c0/0x7f0
Feb 17 09:53:29 ftglap kernel:  __pm_runtime_resume+0x47/0x70
Feb 17 09:53:29 ftglap kernel:  nouveau_drm_open+0x39/0x170 [nouveau]
Feb 17 09:53:29 ftglap kernel:  ? _cond_resched+0x15/0x30
Feb 17 09:53:29 ftglap kernel:  ? kmem_cache_alloc_trace+0x1bb/0x1d0
Feb 17 09:53:29 ftglap kernel:  ? cap_inode_getsecurity+0x240/0x240
Feb 17 09:53:29 ftglap kernel:  drm_file_alloc+0x167/0x250 [drm]
Feb 17 09:53:29 ftglap kernel:  drm_open+0xa7/0x1e0 [drm]
Feb 17 09:53:29 ftglap kernel:  drm_stub_open+0xaf/0xf0 [drm]
Feb 17 09:53:29 ftglap kernel:  chrdev_open+0x9e/0x1a0
Feb 17 09:53:29 ftglap kernel:  ? cdev_put.part.3+0x20/0x20
Feb 17 09:53:29 ftglap kernel:  do_dentry_open+0x12f/0x330
Feb 17 09:53:29 ftglap kernel:  path_openat+0x32c/0x16a0
Feb 17 09:53:29 ftglap kernel:  ? filename_lookup.part.63+0xe0/0x170
Feb 17 09:53:29 ftglap kernel:  ? __check_object_size+0x15d/0x189
Feb 17 09:53:29 ftglap kernel:  do_filp_open+0x93/0x100
Feb 17 09:53:29 ftglap kernel:  ? vfs_statx+0x73/0xe0
Feb 17 09:53:29 ftglap kernel:  ? __check_object_size+0x15d/0x189
Feb 17 09:53:29 ftglap kernel:  do_sys_open+0x186/0x210
Feb 17 09:53:29 ftglap kernel:  do_syscall_64+0x55/0x100
Feb 17 09:53:29 ftglap kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb 17 09:53:29 ftglap kernel: RIP: 0033:0x7f664f0d8a10
Feb 17 09:53:29 ftglap kernel: Code: 25 00 00 41 00 3d 00 00 41 00 74 36 48 8d 05 d7 59 0d 00 8b 00 85 c0 75 5a 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 84 00 00 00 48 83 c4 68 5b 5d c3 0f 1f 44
Feb 17 09:53:29 ftglap kernel: RSP: 002b:00007ffc6bc89260 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Feb 17 09:53:29 ftglap kernel: RAX: ffffffffffffffda RBX: 0000000002099f70 RCX: 00007f664f0d8a10
Feb 17 09:53:29 ftglap kernel: RDX: 0000000000000002 RSI: 000000000209b140 RDI: 00000000ffffff9c
Feb 17 09:53:29 ftglap kernel: RBP: 0000000000000002 R08: 000000000209a4c0 R09: 0000000000000000
Feb 17 09:53:29 ftglap kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f664ee4c6c0
Feb 17 09:53:29 ftglap kernel: R13: 0000000000000000 R14: 00007f664f1c3538 R15: 00007f664f1c34a8
Feb 17 09:53:29 ftglap kernel: ---[ end trace ed5cb8771bf878dc ]---
Feb 17 09:53:29 ftglap systemd[1]: Started Show Plymouth Power Off Screen.
Feb 17 09:53:29 ftglap systemd[1]: Stopped Session c1 of user sddm.
Feb 17 09:53:29 ftglap systemd[1]: Removed slice User Slice of sddm.
Feb 17 09:53:29 ftglap systemd[1]: Stopping Login Service...
Feb 17 09:53:29 ftglap systemd[1]: Stopping Permit User Sessions...
Feb 17 09:53:29 ftglap systemd[1]: Stopped Permit User Sessions.
Feb 17 09:53:29 ftglap systemd[1]: Stopped target Remote File Systems.
Feb 17 09:53:29 ftglap systemd[1]: Unmounting /mnt/ftgme2.data...
Feb 17 09:53:29 ftglap systemd[1]: Unmounting /data/ftg/.thunderbird...
Feb 17 09:53:29 ftglap systemd[1]: Unmounting /mnt/ftgme2.data2...
Feb 17 09:53:29 ftglap systemd[1]: Unmounting /mnt/ftgme2...
Feb 17 09:53:29 ftglap systemd[1]: Unmounting /mnt/cauldron...
Feb 17 09:53:29 ftglap systemd[1]: Unmounting /mnt/ftgme2.usr.local...
Comment 3 Frank Griffin 2019-02-19 16:36:25 CET
Another change is that the oops happens without use of the magic key.
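
For comparing the two traces in this report, a small script can pull out which process triggered each warning and on which kernel. The following is only a sketch, assuming the journal lines have the same layout as the ones pasted above; the function name summarize_warnings and the two regular expressions are made up for illustration and are not part of any existing tool.

```python
import re

# Matches the "WARNING: ... at <file>:<line> <function>+..." line of a kernel warning.
WARN_RE = re.compile(r"kernel: WARNING: .* at (\S+):(\d+) (\w+)\+")
# Matches the "CPU: N PID: N Comm: <name> Tainted: ... <kernel version> #1" line.
COMM_RE = re.compile(r"kernel: CPU: (\d+) PID: (\d+) Comm: (\S+) .*?(\d+\.\d+\.\d+\S*) #")


def summarize_warnings(lines):
    """Return one dict per WARNING block found in journal-style lines."""
    results = []
    current = None
    for line in lines:
        m = WARN_RE.search(line)
        if m:
            # A new warning block starts here.
            current = {
                "source": m.group(1),
                "line": int(m.group(2)),
                "function": m.group(3),
            }
            results.append(current)
            continue
        m = COMM_RE.search(line)
        # Attach the first Comm line that follows a WARNING line to that block.
        if m and current is not None and "comm" not in current:
            current["pid"] = int(m.group(2))
            current["comm"] = m.group(3)
            current["kernel"] = m.group(4)
    return results


if __name__ == "__main__":
    import sys

    for entry in summarize_warnings(sys.stdin):
        print(entry)
```

Fed the two traces above (for example by piping a saved journal excerpt into the script), this should report net_applet on 4.18.8-desktop-1.mga7 for the first warning and plymouthd on 4.20.9-desktop-1.mga7 for the second, both in nvkm_pmu_reset. In the second trace the call chain runs from an open of the DRM device through a runtime resume of the nouveau-managed chip.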
