Mageia 6, x86_64.

Description of problem:
New Ryzen 7 processor / motherboard system. Since getting it, the system crashes at random times, but 99% of the time during the night (when it gets very light use). The crashes have so far left nothing in syslog, because all logging stops when the system locks up; whatever was displayed on the screen stays there, and no other processes run once it has locked up. The system is very stable during the daytime when I'm using it.

Version-Release number of selected component (if applicable):
kernel-desktop-4.14.30-3.mga6

How reproducible:
Just use the system as normal.

Processor: Ryzen 7 1800X
Motherboard: CROSSHAIR VI HERO

Syslog crash information: https://pastebin.com/A00Hxm2F
Hi DariuszSki,

Please attach your logs instead of giving a link to them :-)

Thanks!
Marja
Assignee: bugsquad => kernel
CC: (none) => marja11
Created attachment 10165
Crash text from syslog

The extract from syslog which indicates what happened to cause a system crash. 99% of the time the computer crashes with nothing in syslog to indicate what happened.
Just to add, the system is NOT overclocked, and I am using the latest BIOS for the motherboard (as the log text shows).
(In reply to DariuszSki from comment #0)
> whatever was displayed on the screen stays

So when you leave the system for the night, you could open a terminal window, become root in it, and run journalctl -f, so that it writes the log to the window until the system hangs.

> Version-Release number of selected component (if applicable):
> kernel-desktop-4.14.30-3.mga6

4.14.40 is in the updates_testing repo; you could try whether that works better.
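If the terminal output is hard to read back after a hang, one option is to mirror the output of journalctl -f into a file and force each line to disk as it arrives. The following is only a minimal Python sketch of that idea, not something tested on the affected machine. The function names and the log path in the usage note are hypothetical, and it assumes the script is run as root so that journalctl shows the system journal.

```python
import os
import subprocess


def append_durably(fh, line):
    """Append one line and push it to disk before returning."""
    fh.write(line)
    fh.flush()             # Python buffer -> kernel
    os.fsync(fh.fileno())  # kernel page cache -> storage device


def follow_journal(logfile):
    """Mirror `journalctl -f` into logfile until interrupted."""
    proc = subprocess.Popen(
        ["journalctl", "-f"], stdout=subprocess.PIPE, text=True
    )
    with open(logfile, "a") as fh:
        try:
            for line in proc.stdout:
                append_durably(fh, line)
        finally:
            proc.terminate()
```

Calling follow_journal("/root/overnight-journal.log") before leaving the machine would keep appending until the hang. Anything the kernel never managed to log will of course still be missing; the sketch only reduces the chance that already-logged lines are lost in a buffer.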
CC: (none) => fri
I have installed the latest kernel you suggested, which has just shown up. I will let the system run as normal and see if the new kernel does anything different.
After a number of reboots and trying to force things over a number of days, the newest kernel seems to have fixed the problem with the processor stall. If it happens again I'll re-open the bug.
Resolution: (none) => FIXED
Status: NEW => RESOLVED
Report was originally for: kernel-desktop-4.14.30-3.mga6
Latest kernel version still affected: kernel-desktop-4.14.44-2.mga6

I am re-opening this bug as the system lockup is still happening. After the original bug report there was a kernel update; that one did have one lockup / crash, but I didn't report it because the next day there was a new kernel (kernel-desktop-4.14.44-2.mga6), which is the latest.

After leaving the machine on 24/7, it managed to get to 3 days 15 hours before it fell over, and as always during the night when the PC is doing very little actual work.

Running the command you suggested, "journalctl -f", in konsole seems to give the processor just enough work to keep it from falling over during the night. But when I don't run it overnight (as last night), the processor locks up the machine, and there is nothing in /var/log/syslog to show what caused the lockup. I am assuming it was a processor stall, like the one I had managed to capture in comment #2.

It's impractical to keep using "journalctl -f" to keep a machine running during the night. Any other suggestions to see what's going on?
Source RPM: kernel-desktop-4.14.30-3.mga6 => kernel-desktop-4.14.44-2.mga6
Status: RESOLVED => REOPENED
Resolution: FIXED => (none)
Possibly none of the writes to disk or network get flushed when it hangs.

I do not know how to do it (never tried), but you could use another PC to log in and run "journalctl -f" there, so you get the output on the other machine directly. (Hoping that this does not prevent it from crashing, so we can log the problem.)

Not a solution, but a workaround may be to give it some work. Sleeping silicon is a waste ;) I let my workstation run BOINC practically always.
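The "log in from another PC" idea could look roughly like the Python sketch below, run on the second machine. It is untested against this setup and rests on assumptions: the host name ryzen-box and the local file name are made up, SSH access to the crashing machine is assumed to work, and logging in as root over SSH may well be disabled there (in which case another user with permission to read the journal would be needed).

```python
import subprocess


def build_ssh_command(host, user="root"):
    """Return the argv that follows the remote journal over SSH."""
    return ["ssh", f"{user}@{host}", "journalctl", "-f"]


def copy_lines(source, sink):
    """Copy lines from source to sink, flushing after each one.

    Returns the number of lines copied.
    """
    count = 0
    for line in source:
        sink.write(line)
        sink.flush()
        count += 1
    return count


def main(host, logfile):
    """Append the remote journal to a local file until the link drops."""
    proc = subprocess.Popen(
        build_ssh_command(host), stdout=subprocess.PIPE, text=True
    )
    with open(logfile, "a") as sink:
        try:
            copy_lines(proc.stdout, sink)
        finally:
            proc.terminate()
```

For example, main("ryzen-box", "ryzen-overnight.log") would keep the last lines received before the hang on the healthy machine, where a local flush is enough because that machine keeps running. As noted above, the extra activity might itself be enough to stop the lockup from happening.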
I see kernel-linus-4.14.48 got built a couple hours ago, mga6 updates_testing...
This bug was reported for kernel-desktop-4.14.30-3.mga6, but it has affected ALL kernel updates since, with multiple random crashes. I am currently on the latest kernel, kernel-desktop-4.14.70-2.mga6.
Source RPM: kernel-desktop-4.14.44-2.mga6 => kernel-desktop-4.14.70-2.mga6
Created attachment 10390
Edited syslog of latest crash

Most crashes left no information in syslog, or nothing useful. However, this log, taken from syslog this morning after another crash, seems the most detailed. It logged everything from the crash until I got up and rebooted the machine some hours later, repeating the same information over and over during this period. I've attached a shortened version.

The motherboard is using the latest BIOS, although there have been no updates to it in about three months.
CC: (none) => linuxstuff
I have a newer (Intel) machine that would crash overnight. I removed a (suspect) nvme card from the hot m.2 slot on the back of the motherboard and have not had that crash since. The partition on that drive was not even mounted. Long shot. ;)
CC: (none) => rolfpedersen
(In reply to Rolf Pedersen from comment #12)
> I have a newer (Intel) machine that would crash overnight. I removed a
> (suspect) nvme card from the hot m.2 slot on the back of the motherboard and
> have not had that crash since. The partition on that drive was not even
> mounted. Long shot. ;)

I haven't seen anything to indicate temperatures go silly. During the day when the system is in use, the graphics card is OK, and the CPU is water cooled and I don't feel any real heat from the system. Hard drives are in fan airflow, so they should never get hot.

Memory passes the error checks I've performed. SMART isn't showing any problems with the hard drives.
Created attachment 10404
Another crash that left some information behind

Syslog entries for another crash that left some information behind on what happened; syslog usually leaves nothing of what happened.
After many months, a new BIOS was released, and with the last two kernels (latest one 4.14.104-desktop-2.mga6) the computer has appeared very stable; it has not locked up at all. I will keep an eye on the system and will close this bug as "Fixed", but will re-open it if I need to. Thanks.
Resolution: (none) => FIXED
Status: REOPENED => RESOLVED