Picked up from https://bugs.mageia.org/show_bug.cgi?id=32537#c7

Discussed now and then in various kernel bugs and on the dev and QA mailing lists with Giuseppe Ghibò since I upgraded the system from mga8 to mga9 beta+.

The problem only appears when using the proprietary drivers (470, 525, 535, 545), and not with nouveau (painfully slow!) nor Xorg modesetting (decent, but proprietary is faster).

___Version-Release number of selected component (if applicable):

§ Any Mageia 9 desktop kernel
§ No problem with linus kernels, currently using 6.5.11-2
§ Mageia 8 desktop kernels were OK on the same GPU, main board, CPU

We have tried different DMs (e.g. SDDM 0.20.0 from Ghibò) and DEs, and all desktop kernels and nvidia versions so far, with no result.

Currently using Mageia sddm 0.19, Plasma on X11, Nvidia newfeature 545.29.02; also tested nvidia470 470.223.02-1.

___Hardware
GPU: GM107; NVIDIA GeForce GTX 750, VBIOS version 82.07.32.00.52
Chipset: Intel P55
CPU: Intel i7-870
Monitor: Philips PHL 436M6VBP, connected by DisplayPort

___How reproducible, Steps to Reproduce:
1. Suspend the system
2. Resume (hit a keyboard key)
3. Often the monitor wakes up and shows the lock screen, but sometimes the monitor only wakes up to report that there is no signal and goes back to sleep. You then have to power cycle the monitor, after which it shows the lock screen.

It seems to fail mostly when the system has been sleeping for a long time (maybe the monitor goes into a deeper sleep and does not respond to the card quickly enough? But why would desktop and linus kernels differ?)

---
___Test plan:
A) Retest with server kernel, 6.5.11
B) Retest with kernel-desktop-devel-5.10.191-2.lowlatency.ck.500hz.mga9-1-1.mga9.x86_64 that Giuseppe built, from https://copr-be.cloud.fedoraproject.org/results/ghibo/mageia9-bonus/mageia-9-x86_64/; it was working in August, is it still OK with nvidia470?
C) Giuseppe to suggest/build test kernels outside the Mageia repos.
(In reply to Morgan Leijström from comment #0)
> A) Retest with server kernel, 6.5.11

kernel-server-6.5.11-5.mga9.x86_64 works/fails like desktop: resume after a short sleep is OK; then I let it sleep for an hour, and I had to power cycle the monitor after resume. Using newfeature 545.29.02.

Now on to test B, using newfeature 545.29.02 - dkms built the module. OK after a short sleep, and now I go to bed myself, testing resume tomorrow.

Then maybe a later lowlatency kernel from the same repo.
Test part B in progress

§ OK: kernel-desktop-devel-5.10.191-2.lowlatency.ck.500hz.mga9-1-1.mga9.x86
§ Fail: kernel-desktop-6.4.16-9.lowlatency.mga9-1-1.mga9.x86_64

Now on to test kernel-desktop-6.4.11-16.lowlatency.mga9-1-1.mga9.x86_64, then kernel-desktop-6.1.47-2.lowlatency.mga9-1-1.mga9.x86_64, choosing blindly.
(In reply to Morgan Leijström from comment #2)
> Test part B in progress
>
> § OK: kernel-desktop-devel-5.10.191-2.lowlatency.ck.500hz.mga9-1-1.mga9.x86
>
> § Fail: kernel-desktop-6.4.16-9.lowlatency.mga9-1-1.mga9.x86_64
>
> Now on to test kernel-desktop-6.4.11-16.lowlatency.mga9-1-1.mga9.x86_64,
> then kernel-desktop-6.1.47-2.lowlatency.mga9-1-1.mga9.x86_64, choosing
> blindly

So you have narrowed the sleep problem down to about 1 hour of suspend. There are more kernels aimed at the problem here:

1) standard 6.5.12-0.1.mga9:
https://download.copr.fedorainfracloud.org/results/ghibo/mageia9-bonus/mageia-9-x86_64/06671422-kernel/

2) standard 6.5.12-0.2.mga9:
https://download.copr.fedorainfracloud.org/results/ghibo/mageia9-bonus/mageia-9-x86_64/06671423-kernel/

3) linus 6.5.12-0.1.mga9:
https://download.copr.fedorainfracloud.org/results/ghibo/mageia9-bonus/mageia-9-x86_64/06671424-kernel-linus/

4) LTS series, low latency, 6.1.63:
https://download.copr.fedorainfracloud.org/results/ghibo/mageia9-bonus/mageia-9-x86_64/06671886-kernel/

I also have a low latency 6.5.12-4.mga9, but it is not yet in copr; try 1) and 2) first.

Since the testing matrix is growing, keep one driver fixed first, e.g. 535.129.03, and vary only the kernels.
§ Fail: kernel-desktop-6.4.11-16.lowlatency.mga9-1-1.mga9.x86_64

Now running kernel-desktop-6.1.47-2.lowlatency.mga9-1-1.mga9.x86_64.

I will try your list in numeric order.

I am keeping to newfeature 545.29.02, as that is what I was also testing when starting the test series.
§ OK: kernel-desktop-6.1.47-2.lowlatency.mga9-1-1.mga9.x86_64

Now going for 1) standard: 6.5.12-0.1.mga9
Currently running your 1) (desktop kernel-linus-6.5.12-0.1.mga9.x86_64.rpm)

and have also installed 2) ( -0.2 )

In between I also tried to install 3), but
urpmi kernel-linus-6.5.12-0.1.mga9.x86_64.rpm
hangs forever using one core, with no output at all; I have to kill it with ctrl-C.

(I installed the -devel- file separately first)

Can you check?

( I also tried downloading it again, and also the -devel-; both are identical to the files downloaded at the first attempt. )
(In reply to Morgan Leijström from comment #6)
> Currently running your 1) (desktop kernel-linus-6.5.12-0.1.mga9.x86_64.rpm)
>
> and have also installed 2) ( -0.2 )
>
> In between i also tried to install 3), but
> urpmi kernel-linus-6.5.12-0.1.mga9.x86_64.rpm
> hangs forever using one core, no output at all, have to kill by ctrl-C
>
> (I installed the -devel- file separately first)
>
> Can you check?

To get the dkms modules built you need to install the -devel package at the same time, e.g.

urpmi ./kernel-xxx.x86_64.rpm ./kernel-devel-xxx.x86_64.rpm

or even

urpmi https://copr/..../kernel-xxx.x86_64.rpm https://copr/.../kernel-devel-xxx.x86_64.rpm

(or use dnf or dnfdragora after having added the copr repo).

For 6.5.12-linus, we'll see later.

> ( I also tried downloading it again, and also the -devel-, both identical as
> first attempt downloaded files. )

So kernel-desktop-6.5.12-0.2.mga9 fails too?
§ Fail: kernel-desktop-6.5.12-0.1.mga9-1-1.mga9.x86_64

Now running kernel-desktop-6.5.12-0.2, will see tomorrow if it wakes up.

---
For the problem installing linus 6.5.12-0.1.mga9:

I know the corresponding -devel- needs to be installed at the same time as (or before) the kernel. I usually put everything in one urpmi command, or run "urpmi --no-recommends *" in a folder containing only the files I want to install. I did so with linus too, but it hung, so to investigate I let it install only -devel- first, and that succeeded. Then when I tell urpmi to install kernel-linus-6.5.12-0.1.mga9.x86_64.rpm, it immediately consumes exactly one CPU core and does nothing. *Very* different from normal.

---
When installing a lower version, urpmi does not want to downgrade cpupower, kernel-userspace-headers, lib64bpf1.

So I guess it is OK to install and use the 6.1.63 kernel with the 6.5.12 versions of the above three packages?
- Or should I ideally force those three downgrades when I intend to fully test the 6.1.63 kernel?

$ ls
cpupower-6.1.63-2.lowlatency.mga9.x86_64.rpm
kernel-desktop-6.1.63-2.lowlatency.mga9-1-1.mga9.x86_64.rpm
kernel-desktop-devel-6.1.63-2.lowlatency.mga9-1-1.mga9.x86_64.rpm
kernel-userspace-headers-6.1.63-2.lowlatency.mga9.x86_64.rpm
lib64bpf1-6.1.63-2.lowlatency.mga9.x86_64.rpm

$ LC_ALL=C sudo urpmi --no-recommends *
Some requested packages cannot be installed:
cpupower-6.1.63-2.lowlatency.mga9.x86_64 (in order to keep cpupower-6.5.12-0.2.mga9.x86_64)
kernel-userspace-headers-6.1.63-2.lowlatency.mga9.x86_64 (in order to keep kernel-userspace-headers-6.5.12-0.2.mga9.x86_64)
lib64bpf1-6.1.63-2.lowlatency.mga9.x86_64 (in order to keep lib64bpf1-6.5.12-0.2.mga9.x86_64)

( I use --no-recommends because otherwise it wants to install kernel-desktop-latest 6.5.11-5 )
6.5.12-0.2 is expected to work at this point...

Yes, it's OK to use --no-recommends too. Note that for this test round you don't need to install all the other packages (lib64bpf, cpupower, userspace headers, etc.) to match the kernel. Keep them at the stock versions, so you don't need to downgrade later (the difficulty in downgrading is probably a side effect of the new naming scheme of the stock kernels). Using just the kernel-desktop + kernel-desktop-devel RPMs won't interfere, and since they use the old naming scheme they can easily be upgraded, downgraded or removed.
(In reply to Giuseppe Ghibò from comment #9)
> 6.5.12-0.2 is expected to work at this point...

Long sleep not tested yet - we will see tomorrow.

> Yes, it's ok to use --no-recommends too. Note that for this test round you
> don't need to install all the other libraries lib64bpf, cpupower, userspace

OK, proceeding.
§ Fail: kernel-desktop-6.5.12-0.2.mga9-1-1.mga9.x86_64

Now on to desktop-6.1.63-2.lowlatency.
(In reply to Morgan Leijström from comment #11)
> § Fail: kernel-desktop-6.5.12-0.2.mga9-1-1.mga9.x86_64

OK, which means the cause is not the patch "i2c_nvidia_gpu-change-err-into-info.patch" for bug https://bugzilla.kernel.org/show_bug.cgi?id=206653#c19 that we had in the stock kernel but not in kernel-linus.
(In reply to Morgan Leijström from comment #6)
> Currently running your 1) (desktop kernel-linus-6.5.12-0.1.mga9.x86_64.rpm)
>
> and have also installed 2) ( -0.2 )
>
> In between i also tried to install 3), but
> urpmi kernel-linus-6.5.12-0.1.mga9.x86_64.rpm
> hangs forever using one core, no output at all, have to kill by ctrl-C
>
> (I installed the -devel- file separately first)
>
> Can you check?
>
> ( I also tried downloading it again, and also the -devel-, both identical as
> first attempt downloaded files. )

kernel-linus-6.5.12-0.1.mga9 is version 6.5.12 plus the stable-queue as updated the day before yesterday. The kernel-linus RPM signatures are OK and the packages are intact, so the RPMs should be OK. However, kernel-linus uses the new naming scheme, so it is a multiple-version RPM, and that is where urpmi has problems with this kind of operation (upgrade/downgrade/remove). The "hangs forever" is probably a long timeout or slow-down (possibly more than half an hour). I haven't had time to also produce a kernel-linus in the old naming scheme as a conditional build.

A workaround is to not use urpmi at all and instead call "rpm -i" directly on the packages, e.g.:

rpm -ivh ./kernel-linus-6.5.12-0.1.mga9.x86_64.rpm ./kernel-linus-devel-6.5.12-0.1.mga9.x86_64.rpm

and they will install.

What exactly do you mean by "using one core"? I tried booting kernel-linus 6.5.12-0.1 using 1 core only, i.e. passing "maxcpus=1" (the parameter that enables one core only) on the kernel boot cmdline, and it boots OK.
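
The "rpm -ivh" workaround above could be wrapped roughly as follows. This is only a sketch: the function name install_kernel_pair and the DRY_RUN variable are made up for illustration and are not part of rpm, urpmi or any Mageia tool. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch: install a test kernel and its matching -devel package in a single
# rpm transaction, bypassing urpmi.
# install_kernel_pair and DRY_RUN are hypothetical names used for illustration.

install_kernel_pair() {
    kernel_rpm=$1
    devel_rpm=$2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show the command that would be run.
        echo "rpm -ivh $kernel_rpm $devel_rpm"
    else
        # Needs root. Both packages go into the same transaction.
        rpm -ivh "$kernel_rpm" "$devel_rpm"
    fi
}

install_kernel_pair ./kernel-linus-6.5.12-0.1.mga9.x86_64.rpm \
                    ./kernel-linus-devel-6.5.12-0.1.mga9.x86_64.rpm
```

Setting DRY_RUN=0 and running it as root would perform the actual installation.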
§ OK: desktop-6.1.63-2.lowlatency

(In reply to Giuseppe Ghibò from comment #13)
> kernel-linus-6.5.12-0.1.mga9
...
> rpm -ivh
> and they will install.

Yep, done.

> With "using one core" what do you mean exactly?

*urpmi* was using one core at 100% for several minutes until I hit ctrl-C.

Now rebooting to linus-6.5.12-0.1, expecting the dkms autorebuild to make the nvidia and vbox modules.
§ OK: kernel-linus-6.5.12-0.1.mga9.x86_64

Ready to test more :)
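
To keep an overview of the growing test matrix, the long-suspend results reported in this thread so far can be put in a small table and filtered. This is just a sketch: the column layout and the function names results and kernels_with_result are invented for illustration, and the rows only restate the OK/Fail lines from the comments above.

```shell
#!/bin/sh
# Long-suspend resume results as reported in this thread so far.
# Columns: flavour, version, result.
# results and kernels_with_result are hypothetical helper names.
results() {
    cat <<'EOF'
desktop 5.10.191-2.lowlatency.ck.500hz OK
desktop 6.1.47-2.lowlatency OK
desktop 6.1.63-2.lowlatency OK
desktop 6.4.11-16.lowlatency Fail
desktop 6.4.16-9.lowlatency Fail
server 6.5.11-5 Fail
desktop 6.5.12-0.1 Fail
desktop 6.5.12-0.2 Fail
linus 6.5.12-0.1 OK
EOF
}

# Print "flavour-version" for every kernel with the given result (OK or Fail).
kernels_with_result() {
    results | awk -v want="$1" '$3 == want { print $1 "-" $2 }'
}

echo "Fail:"; kernels_with_result Fail
echo "OK:";   kernels_with_result OK
```

Listed this way, the failing kernels are the desktop/server builds from 6.4 on, while the 5.10 and 6.1 lowlatency builds and linus 6.5.12 are OK.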
The following test seems to confirm that the problem appears because the monitor goes into a deeper sleep after a while:

When I power on the monitor shortly before resuming the system, the login screen is displayed without needing to power cycle the monitor, even with desktop kernels and a long suspend time.

--
Anyway, I have now switched back to kernel-linus-6.5.12-0.1, and it tests OK with nvidia 545.29.06.
There is also the problem - related or not - that after switching to a tty (e.g. using ctrl-alt-F4) and then back to the desktop, my screen is completely black except for a frozen mouse pointer.
Hard hang - not even REISUB works - I have to cut power.

These are the last journal lines from that run:

nov 30 23:39:52 svarten.tribun systemd[1]: Started getty@tty4.service.
nov 30 23:39:52 svarten.tribun acpid[8158]: client 12149[0:0] has disconnected
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "alsa_output.pci-0000_00_1b.0.analog-stereo"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "auto_null"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "auto_null.monitor"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "auto_null.monitor"
nov 30 23:39:52 svarten.tribun plasmashell[15710]: org.kde.plasma.pulseaudio: No object for name "auto_null.monitor"
nov 30 23:40:01 svarten.tribun systemd[1]: Started session-14.scope.
nov 30 23:40:01 svarten.tribun CROND[249650]: (morgan) CMD (/usr/bin/nice -n19 /usr/bin/ionice -c2 -n7 /usr/bin/backintime backup-job >/dev/null)
nov 30 23:40:01 svarten.tribun wireplumber[15500]: GetManagedObjects() failed: org.freedesktop.DBus.Error.NameHasNoOwner
nov 30 23:40:01 svarten.tribun CROND[249645]: (morgan) CMDEND (/usr/bin/nice -n19 /usr/bin/ionice -c2 -n7 /usr/bin/backintime backup-job >/dev/null)
nov 30 23:40:02 svarten.tribun python3[249658]: QSettings::value: Empty key passed
nov 30 23:40:02 svarten.tribun python3[249658]: QSettings::value: Empty key passed

------
This seems to have become worse since kernels 5.2 and/or nvidia driver updates.
- A month or so ago I could often switch back and forth a couple of times before the hang, and sometimes the mouse pointer was movable.

Now tested both kernel-desktop-6.5.11-5.mga9.x86_64 and kernel-linus-6.5.11-2.mga9.x86_64, with nvidia 535.129.03, and both failed on the first try.

I only tested each once, this is a production system...
(In reply to Morgan Leijström from comment #17)
> There is also the problem - related or not - that after switching to tty
> (i.e using ctrl-alt-F4) and then back to desktop, my screen is completely
> black minus a mouse pointer, frozen. Hard hang - Not even REISUB works -
> have to cut power.
>

There is kernel-desktop-6.5.13-2.mga9 in backports testing (just install -desktop and -devel, as it uses the old naming scheme), which should fix the VT problem. It does not address your missed monitor resume after a longer suspend, which is not tracked down yet.
(In reply to Giuseppe Ghibò from comment #18)
> (In reply to Morgan Leijström from comment #17)
>
> There is kernel-desktop-6.5.13-2.mga9 in backports testing
> which should fix the VT problem.

I confirm that the problem seems to be fixed; I tested a few iterations of switching between tty4, tty6, and 1 (desktop).
I will shout if I see it again.

Thank you.

I will keep running -desktop-6.5.13-2 for a while.
Using nvidia 470.223.02-1
Handling of this issue has apparently merged into Bug 31695.

As described there:
§ The VT hang problem has partially reappeared
§ Resume from suspend mostly works for -desktop, not always for -server

*** This bug has been marked as a duplicate of bug 31695 ***
Status: NEW => RESOLVED
Resolution: (none) => DUPLICATE