Bug 25930 - Graphic driver i915 crashed
Summary: Graphic driver i915 crashed
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 7
Hardware: All Linux
Priority: Normal major
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 26117 26149
Depends on: 26202
Blocks:
Reported: 2019-12-23 16:57 CET by papoteur
Modified: 2020-02-22 01:27 CET

See Also:
Source RPM:
CVE:
Status comment:


Attachments
Journal of the crash (86.99 KB, text/plain)
2019-12-23 16:58 CET, papoteur
Dump core of the crash (4.78 KB, text/plain)
2019-12-23 16:59 CET, papoteur
Journal of a session with experimental kernel (505.16 KB, application/x-xz)
2019-12-28 23:06 CET, papoteur
/sys/class/drm/card0/error after a GPU crash (4.49 KB, text/plain)
2020-01-06 20:36 CET, Alain Choucroot
GPU crash dump (4.77 KB, text/plain)
2020-01-13 21:05 CET, Frédéric "LpSolit" Buclin

Description papoteur 2019-12-23 16:57:43 CET
Description of problem:
When I tried to wake up the screens, it did not work, although the mouse cursor was still moving.
After a while, I was able to switch to tty3.
I saw in the journal that the graphics driver had crashed.
I had to reboot.

Environment: LXQt, using kwin
xscreensaver
Card:Intel 810 and later: Intel Corporation|HD Graphics 620 [DISPLAY_VGA] (vendor:8086 device:5916 subv:1043 subd:16e0) (rev: 02)

One internal monitor, one external monitor connected through HDMI.
Comment 1 papoteur 2019-12-23 16:58:26 CET
Created attachment 11430
Journal of the crash
Comment 2 papoteur 2019-12-23 16:59:39 CET
Created attachment 11431
Dump core of the crash
Comment 3 papoteur 2019-12-23 17:01:59 CET
uname -a
Linux YZenbook.local 5.4.2-desktop-1.mga7 #1 SMP Thu Dec 5 17:40:00 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

rpm -qa --last |grep intel
x11-driver-video-intel-2.99.917-57.mga7.x86_64 lun. 04 nov. 2019 16:58:55 CET
lib64drm_intel1-2.4.100-1.mga7.x86_64         lun. 04 nov. 2019 15:51:03 CET
vaapi-driver-intel-2.3.0-2.mga7.x86_64        lun. 08 juil. 2019 08:03:39 CEST
intel-gpu-tools-1.23-3.mga7.x86_64            lun. 08 juil. 2019 07:50:20 CEST
Comment 4 Lewis Smith 2019-12-23 20:36:43 CET
Thanks for this report, Yves; and the evidence attached. Can you give a bit of background:
- did this happen on first use of the system (hence always since) ?
- if not, is it frequent ? Or occasional ? Or one-off ?

Assigning to the kernel/drivers group.

Assignee: bugsquad => kernel

Comment 5 papoteur 2019-12-23 22:35:18 CET
Hello Lewis,
This started today, for no apparent reason (no new settings, no updates).
It seems to be linked to the screen saver, or to something that happens at the same time.
Comment 6 papoteur 2019-12-27 21:31:16 CET
Hello,
I found a recent bug report which seems similar.
https://gitlab.freedesktop.org/drm/intel/issues/673
It seems that a patch is needed against intel driver.
Alain Choucroot 2019-12-27 21:43:15 CET

CC: (none) => choucroot

Comment 7 Alain Choucroot 2019-12-27 21:54:12 CET
Hello,

Same issue here since kernel 5.4.2: frequent and unpredictable freezes.
Here are some kernel dmesg logs:

[12651.016186] i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
[12651.016189] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
...
...
[12651.016193] GPU crash dump saved to /sys/class/drm/card0/error


Unfortunately, the dump is empty

# more /sys/class/drm/card0/error
No error state collected
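
The kernel message points at /sys/class/drm/card0/error, but as shown above that file can simply read "No error state collected". Since sysfs content does not survive a reboot, copying the file to a regular file as soon as it holds real data keeps it available for a report. The following is only a minimal sketch: the function name save_gpu_error_state and the default destination under $HOME are illustrative choices, not part of any existing tool.

```shell
#!/bin/sh
# Copy an i915 error state to a regular file, unless the kernel reports
# that nothing was captured.
# Usage: save_gpu_error_state [source] [destination]
# (hypothetical helper; both default paths are assumptions)
save_gpu_error_state() {
    src="${1:-/sys/class/drm/card0/error}"
    dest="${2:-$HOME/gpu-error-$(date +%Y%m%d-%H%M%S).txt}"

    if [ ! -r "$src" ]; then
        echo "cannot read $src (root may be required)" >&2
        return 2
    fi
    # This is the text quoted above when no error state is available.
    if grep -q '^No error state collected' "$src"; then
        echo "no error state available in $src" >&2
        return 1
    fi
    cat "$src" > "$dest" && echo "saved to $dest"
}
```

Run as root right after a hang, "save_gpu_error_state" with no arguments would try the default path; the return code is 1 when the state is empty and 2 when the file cannot be read.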
Comment 8 Thomas Backlund 2019-12-27 22:19:47 CET
(In reply to papoteur from comment #6)
> Hello,
> I found a recent bug report which seems similar.
> https://gitlab.freedesktop.org/drm/intel/issues/673
> It seems that a patch is needed against intel driver.

Thanks for the pointer, I will add it to next kernel build

CC: (none) => tmb

Comment 9 Thomas Backlund 2019-12-28 15:03:32 CET
Does kernel-desktop-5.4.6-2.1.mga7 from:

http://ftp.free.fr/mirrors/mageia.org/people/tmb/mga7/bugs/25930/

work any better ?
Comment 10 Alain Choucroot 2019-12-28 17:40:07 CET
Hello.
On my machine with this kernel, I see the following messages before lightdm starts:

i915 :Failed to idle engines, declaring wedged!
i915 :Failed to initialize GPU, declaring it wedged!

After logging in to Xfce, it is a mess: the Xfce panel doesn't work properly, and the multi-workspace feature is reduced to a single desktop where windows no longer have a title bar (with the close button, maximize, ...).
Comment 11 papoteur 2019-12-28 23:06:33 CET
Created attachment 11435
Journal of a session with experimental kernel

On my side, the test is not successful either.
Using LXQt and kwin, kwin seems to restart several times until it reports that too many crashes occurred.
In one session, I saw a message suggesting adding drm.debug to the kernel command line. This is the journal from the session with drm.debug enabled.
Comment 12 Thomas Backlund 2020-01-01 17:50:19 CET
There is now a kernel-5.4.7-1.mga7 in updates_testing that has an updated version of the fix sent by the Intel devs, so please try and see if that works any better
Comment 13 Alain Choucroot 2020-01-01 20:37:01 CET
Hello,
very satisfying update. No hang for nearly 2 hours, a duration I could never reach with 5.4.6. Even though I don't have a precise, reproducible scenario, I did a lot of tasks that usually ended in a hang with 5.4.6. Here the system is stable.

[afb@localhost Bureau]$ uname -r
5.4.7-desktop-1.mga7
[afb@localhost Bureau]$ w
 20:31:10 up  1:56,  1 user,  load average: 0,29, 0,98, 0,85
UTIL.    TTY        LOGIN@   IDLE   JCPU   PCPU QUOI
afb      tty1      18:34    1:56m  3:13   1.01s xfce4-session

Bravo !
Comment 14 Alain Choucroot 2020-01-02 10:06:52 CET
Hello.

 I noticed a short freeze of about 10 ms while watching a video, then everything went back to normal. But no hang this time! Here is the log from dmesg:

[ 6410.379742] i915 0000:00:02.0: Resetting rcs0 for stuck wait on rcs0

Maybe it has nothing to do with the fix, but I am passing the information on.
Comment 15 Alain Choucroot 2020-01-06 20:36:13 CET
Created attachment 11446
/sys/class/drm/card0/error after a GPU crash
Comment 16 Alain Choucroot 2020-01-06 20:40:43 CET
Unfortunately, there were new crashes under 5.4.7-desktop-1.mga7.
This time, they only occurred while browsing with the "Falkon" browser.
The third time, oddly, the system came back to life after a few seconds, so the file /sys/class/drm/card0/error was accessible. I added it to the attachments.
Comment 17 Thomas Backlund 2020-01-13 17:59:52 CET
An update for this issue has been pushed to the Mageia Updates repository.

https://advisories.mageia.org/MGASA-2020-0036.html

Status: NEW => RESOLVED
Resolution: (none) => FIXED

Comment 18 Frédéric "LpSolit" Buclin 2020-01-13 21:00:30 CET
Reopening, because comment 11 and comment 16 say that the problem is still not fixed, and I have the exact same problem with kernels 5.4.6 and 5.4.10 (I cannot reproduce it with 5.3.13); see bug 26049 comment 20 and bug 26049 comment 22.

Status: RESOLVED => REOPENED
CC: (none) => LpSolit
Resolution: FIXED => (none)

Comment 19 Frédéric "LpSolit" Buclin 2020-01-13 21:05:26 CET
Created attachment 11456
GPU crash dump

I attached the output of /sys/class/drm/card0/error. Moreover, dmesg prints:

[ 3442.830245] SUPR0GipMap: fGetGipCpu=0xb
[ 3443.390048] vboxdrv: 000000003c425d44 VMMR0.r0
[ 3443.470930] vboxdrv: 00000000bf204d81 VBoxDDR0.r0
[ 3443.555065] vboxdrv: 0000000004369794 VBoxEhciR0.r0
[ 3915.972528] i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
[ 3915.972529] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 3915.972530] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 3915.972530] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 3915.972530] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
[ 3915.972531] GPU crash dump saved to /sys/class/drm/card0/error
[ 3915.973535] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ 3915.974263] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[ 3915.974402] i915 0000:00:02.0: Resetting chip for hang on rcs0
[...]
[ 3941.957842] i915 0000:00:02.0: Resetting rcs0 for stuck wait on rcs0
[ 3955.974046] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[ 3955.974841] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[ 3955.974941] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 3955.976782] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[ 3955.977564] [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
[ 4093.959834] i915 0000:00:02.0: Resetting rcs0 for stuck wait on rcs0
[...]
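
To compare excerpts like the one above between kernel builds, it can help to reduce them to a few counters. The sketch below counts three of the message types seen in this report (GPU HANG, "Resetting ..." and "reset request timed out"). The function name summarise_i915_log is made up for this example, and the patterns are taken only from the log lines quoted in this bug, so other kernel versions may word these messages differently.

```shell
#!/bin/sh
# Summarise i915 hang-related kernel messages read from standard input.
# (hypothetical helper; patterns are based on the dmesg lines quoted above)
summarise_i915_log() {
    awk '
        /i915/ && /GPU HANG/                { hangs++ }
        /i915/ && /Resetting/               { resets++ }
        /i915/ && /reset request timed out/ { timeouts++ }
        END {
            # Counters that never matched print as 0.
            printf "hangs=%d resets=%d timeouts=%d\n", hangs, resets, timeouts
        }
    '
}
```

Typical use would be "dmesg | summarise_i915_log", or "journalctl -k -b | summarise_i915_log" to cover the kernel messages of the current boot.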
Comment 20 Frédéric "LpSolit" Buclin 2020-01-13 21:41:51 CET
(In reply to Thomas Backlund from comment #12)
> There is now a kernel-5.4.7-1.mga7 in updates_testing that has an updated
> version of the fix sent by the Intel devs, so please try and see if that
> works any better

Per https://gitlab.freedesktop.org/drm/intel/issues/673#note_382292, no patch works with kernel 5.4 yet.
Comment 21 Frédéric "LpSolit" Buclin 2020-01-13 21:48:18 CET
Per https://bugs.freedesktop.org/show_bug.cgi?id=111970#c35, we need to use drm-tip to fix this problem: https://cgit.freedesktop.org/drm-tip Is this the case?
Comment 22 Frédéric "LpSolit" Buclin 2020-01-13 21:54:17 CET
(In reply to Frédéric "LpSolit" Buclin from comment #21)
> Per https://bugs.freedesktop.org/show_bug.cgi?id=111970#c35, we need to use
> drm-tip to fix this problem: https://cgit.freedesktop.org/drm-tip Is this
> the case?

It's not. Mageia 7 has lib64drm_intel1-2.4.100-1.mga7, which has been packaged on Oct 17.
Comment 23 Thomas Backlund 2020-01-13 22:31:35 CET
drm-tip is upstream kernel tip tree for drm subsystem, not libdrm
Comment 24 Thomas Backlund 2020-01-13 22:36:27 CET
And no, I won't pull in drm-tip in a stable release


(In reply to Frédéric "LpSolit" Buclin from comment #20)
> (In reply to Thomas Backlund from comment #12)
> > There is now a kernel-5.4.7-1.mga7 in updates_testing that has an updated
> > version of the fix sent by the Intel devs, so please try and see if that
> > works any better
> 
> Per https://gitlab.freedesktop.org/drm/intel/issues/673#note_382292, no
> patch works with kernel 5.4 yet.

Well, the v2 of the patch we have at least makes it crash/hang less often...

Let's hope the Intel guys can sort this out for stable trees too
Comment 25 Jim Beard 2020-01-20 16:23:00 CET
To start with my conclusion, I have had what I now think is this problem in Gnome 3 since the 5.3.13-2 kernel, intel i915 driver. With the 5.4.12-1 kernel, logging in as a test user and then on the command line using su - mylogin has given me access to everything set up in my $HOME but reduced the frequency and intensity of problems greatly.  Timeout on GPU recovery remains a problem for the i915 driver, which complains of it.

Just using a test user account had improved things so much I first thought the problem had gone away, but it recurred, though much diminished.  I have a lot of stuff set up (languages, multiple browsers for special purposes, privoxy, geneweb genealogy server, etc) which was one reason I upgraded to Mageia 7 rather than do a clean install, and rather than add all that to the test user account I tried the su - mylogin to see if that could provide at least limited functional use of my regular user account.

I have been surprised at how well things have worked.  It appears the GPU recovery and rcs0 hang may be intensified by cruft in $HOME configuration files, with the Brave web browser the prime example at the moment.  I have almost given up using Brave, but may try again using the testlogin to su - mylogin or possibly mylogin to su - testlogin and then to su - mylogin to get full access to my accounts capabilities.

CC: (none) => jim.beard

Comment 26 Dave Hodgins 2020-01-29 21:28:41 CET
*** Bug 26149 has been marked as a duplicate of this bug. ***

CC: (none) => antonin.roussel

Comment 27 Dave Hodgins 2020-01-29 21:30:09 CET
*** Bug 26117 has been marked as a duplicate of this bug. ***

CC: (none) => nicolas

Comment 28 Antonin Roussel 2020-01-31 16:52:21 CET
Hello,

On my server, a similar desktop freeze occurs about once a day: Bug 26149. I can still log in to it through an x2go session* or ssh. That may be a way to gather information (though I don't know what to collect!).
* I could not start Firefox in the x2go session, nor kill the first Firefox instance that was running in the frozen session (several defunct 'Web Content' processes remain).

By the way, I was wondering whether there is a simple way to kill the whole frozen session from this ssh connection.
claire robinson 2020-02-05 15:27:40 CET

CC: (none) => eeeemail

Comment 29 claire robinson 2020-02-05 16:03:29 CET
A bit late but you might try loginctl..
https://freedesktop.org/software/systemd/man/loginctl.html

eg.
# loginctl list-sessions
SESSION  UID USER    SEAT  TTY
     c2 1000 username seat0 

# loginctl terminate-session c2

Experiencing this issue too, relieved by using 
kernel-desktop-5.3.13-2.mga7-1-1.mga7
with matching -devel for now.
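
When doing this over ssh, the two loginctl steps above can be chained by picking the session IDs of one user out of the listing. This is only a sketch: sessions_for_user is a hypothetical helper name, and it assumes the column layout shown above, with the user name in the third column.

```shell
#!/bin/sh
# Print the session IDs that belong to one user, reading the output of
# "loginctl list-sessions" from standard input.
# Usage: loginctl list-sessions | sessions_for_user USERNAME
# (hypothetical helper; assumes USER is the third column, as shown above)
sessions_for_user() {
    awk -v user="$1" '$3 == user { print $1 }'
}
```

As root, something like "loginctl list-sessions | sessions_for_user username | xargs -r -n 1 loginctl terminate-session" would then end every session of that user (the -r option of xargs is a GNU extension). Note that this also ends the ssh session if it belongs to the same user.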
Comment 30 Jan Smout 2020-02-06 10:22:29 CET
Hi everyone,

this bug report refers to a number of serious issues with the i915 module.


Personally I have followed the kernels since beta 1 (4.19.10), but I didn't notice abnormal GPU hangs until the 5.4.x series. I did have some crashes before, but couldn't pinpoint the root cause, as I was also developing an OpenGL application which had bugs of its own.

A good summary of the issues involved can be found here:

https://linuxreviews.org/Kernel_5.4.1_And_5.3.14_Are_Released_Making_Linux_Users_With_Intel_iGPUs_Finally_Able_To_Use_5.3-Series_Kernels

Based on that I reverted to 5.0.7 and 5.3.13 for testing. They are currently running very stably on 2 systems (i5 and i7 Coffee Lake). I will probably not use 5.3.13 because of the nasty set_page_dirty issue (supposedly fixed in 5.3.14).

A good way to trigger the GPU hang is to use gdkgears (gdk3 has rather large performance problems which causes high CPU load, which in turn triggers the GPU hang more easily). Keep an ssh session open to kill it


The article also says:

"Going back to 5.0.21 or updating to 5.5 when it is released are viable solutions"

There is also this post:
https://linuxreviews.org/Linux_Kernel_5.5_Will_Not_Fix_The_Frequent_Intel_GPU_Hangs_In_Recent_Kernels

which basically says that the i915 will not be ready for the show in 5.5 either


That leaves us with 5.0

Unfortunately, the 5.0.7 was only available in beta 3. I still have a copy, but I don't seem to find it in the repo anymore.


So, until a final solution is released, hereby I formally request to add 5.0.7 again to the repository. Or perhaps even a new 5.0.21 build?

CC: (none) => smout.jan

Comment 31 Antonin Roussel 2020-02-06 12:41:58 CET
Hi,
killing the frozen session over remote ssh allowed me to open a new session from the usual login screen. But this new session was very, very slow, despite low CPU and memory use. So I ended up rebooting cleanly.
Next time, I will have a look at gdkgears.
(thank you for the advice on tools)
Comment 32 Jan Smout 2020-02-06 13:00:47 CET
(In reply to Antonin Roussel from comment #31)
> Hi,
> killing the frozen session over remote ssh allowed me to open a new session
> from the usual login screen. But this new session was very, very slow,
> despite low CPU and memory use. So I ended up rebooting cleanly.
> Next time, I will have a look at gdkgears.
> (thank you for the advice on tools)

In the beginning I was even using the alt-sysrq keys out of frustration :-O
The terminal is very slow indeed, but when you already know which process to kill it becomes easier.
If you don't know who is causing the hang then all there is left is a reboot :-/
ps : gdkgears is part of the gtk development suite. Checkout from the git repository and compile...
Comment 33 w unruh 2020-02-08 18:49:32 CET
I am getting the freezes and stuttering (when typing into the terminal, the screen freezes for 1 to 5 seconds) on a 5.1.4 kernel. A quick look at the logs does not really show anything.

As an additional possible symptom, the latest (day before yesterday) google chrome update stopped it being able to show movies from tv channels. Some (gem.cbc.ca, globeltv.ca) show a black screen with no sound, although the thumbnail images when I put the cursor on the timeline on the movie do show pictures. Reverting to an earlier version of chrome and things work again, so it is not clear if this is related to this bug.

CC: (none) => unruh

Comment 34 Jan Smout 2020-02-08 21:40:52 CET
(In reply to w unruh from comment #33)
> I am getting the freezes and stuttering (when typing into the terminal, the
> screen freezes for 1 to 5 seconds) on a 5.1.4 kernel. A quick look at the
> logs does not really show anything.
> 
> As an additional possible symptom, the latest (day before yesterday) google
> chrome update stopped it being able to show movies from tv channels. Some
> (gem.cbc.ca, globeltv.ca) show a black screen with no sound, although the
> thumbnail images when I put the cursor on the timeline on the movie do show
> pictures. Reverting to an earlier version of chrome and things work again,
> so it is not clear if this is related to this bug.


When the kernel is at a constant speed and it is experiencing sudden freezes, the acceleration causes black video radiation in chrome :-D
Sorry, couldn't resist the temptation. Not every day we get the occasion to greet a world class physicist. Pleased to meet you, even when it is online...

Getting back on topic:
If you are using intel graphics then the freezing might be related, but I wasn't aware of problems with video playback. You might be seeing 2 different things here. If reverting the chrome version helped then at least that one had nothing to do with the i915 module.
Did the freezing also go away?
Comment 35 w unruh 2020-02-08 23:01:22 CET
Good to meet you as well. 

Anyway, yes, I agree that the fact that only one version of Chrome is affected is certainly a hint that it has nothing to do with the bug in this thread. However, my systems use Intel graphics with the i915 module, and the Chrome bug must be relatively rare, since otherwise they would not have released the new version, and perhaps that rarity is associated with Intel graphics. That is, the bug in Chrome may be tickling the same bug in the Intel graphics driver. I reported it because of the association and the remote possibility that the bugs are related.

Anyway, this stuttering and temporary freezing are getting very annoying. The past couple of times it froze I did alt-ctrl-F3 and then Alt-ctl-F1 to get back to X and the freezing had been thawed. 
I could not see anything in the kernel messages (alt-ctrl-F12) which could explain the problem. The problem is that this bug  is making Mageia 7 on Intel graphics almost unusable. 

Note also my problems are on kernel 5.1.4 not the 5.3 or 5.4 that others are running.
Comment 36 w unruh 2020-02-09 02:56:54 CET
Sorry not to have said. I use Plasma as my DE.
Comment 37 Jan Smout 2020-02-09 22:19:15 CET
(In reply to w unruh from comment #35)
> I reported it because of the association and the remote
> possibility that the bugs are related.
I suspect a race condition is the culprit. From what I've seen, an application with relatively high CPU usage that uses the GPU for hardware-accelerated drawing will trigger a freeze.
I thought that Chrome was not using hardware-accelerated video playback on Linux, but maybe they had a change of heart in the last update. That could explain both the 'black boxes' and the freezes. But that's just a wild guess.

> Anyway, this stuttering and temporary freezing are getting very annoying.
> The past couple of times it froze I did alt-ctrl-F3 and then Alt-ctl-F1 to
> get back to X and the freezing had been thawed. 
> I could not see anything in the kernel messages (alt-ctrl-F12) which could
> explain the problem.
The tty loses its scrollback the moment you switch consoles (there is no Pg Up). The complete log can be retrieved from the systemd journal. Here is how to check for i915 messages:
   journalctl -a | grep i915
Add '-b' if you're only interested in messages from the last boot.

> The problem is that this bug  is making Mageia 7 on
> Intel graphics almost unusable. 
Couldn't agree more. I'm waiting for the mga kernel maintainer to upload a 5.0.x kernel for the intel users. I have one on 1 system, but I'm missing the rpm or src.rpm to install other machines.

> Note also my problems are on kernel 5.1.4 not the 5.3 or 5.4 that others are
> running.
Yeah I saw that. I did run a 5.1.4 in June last year, but I wasn't doing any video playback in chrome, nor was I using it as my main development machine. And my system logs don't go that far to do a post mortem...
Comment 38 claire robinson 2020-02-09 23:52:53 CET
When I encountered this, it was using a usb oscilloscope. I'd not encountered any issue with 5.4 kernels until that point. I very rarely play games but suppose that would have triggered it before now.

It's been absolutely fine with kernel 5.3.13, which was just before the changes were introduced upstream. It seems they've been wrestling with it ever since.
Comment 39 Jim Beard 2020-02-10 16:17:29 CET
On my main machine, opera browser has been particularly bad about crashing with associated rcs0 hanging, but I have found I can use ssh to log in from my backup machine and the problem seems not to occur.  The initial response when opera is launched via ssh may be of interest, due to the gpu-process multiple threads error message.

To repeat, this works.  Trying to use opera after logging in directly to the machine crashes frequently.

ssh -l me mainmachine
[me@mainmachine ~]$ opera &
[1] 31858
[me@mainmachine ~]$ ATTENTION: default value of option vblank_mode overridden by environment.
[31890:31890:0210/100529.768926:ERROR:sandbox_linux.cc(369)] InitializeSandbox() called with multiple threads in process gpu-process.
[32002:32004:0210/100530.222585:ERROR:nss_util.cc(750)] After loading Root Certs, loaded==false: NSS error code: -8018
[32069:1:0210/100530.727697:ERROR:child_thread_impl.cc(864)] Receiver for unknown Channel-associated interface: chrome.mojom.SearchBouncer
[32100:1:0210/100530.754528:ERROR:child_thread_impl.cc(864)] Receiver for unknown Channel-associated interface: chrome.mojom.SearchBouncer
[31858:31876:0210/100530.915830:ERROR:nss_util.cc(750)] After loading Root Certs, loaded==false: NSS error code: -8018
Comment 40 Thomas Backlund 2020-02-12 16:26:52 CET
Note that there is now a kernel-5.5.3-1 in updates_testing you can try...

Note that I havent fixed virtualbox for 5.5 series yet...
Comment 41 Jan Smout 2020-02-12 16:33:07 CET
(In reply to Thomas Backlund from comment #40)
> Note that there is now a kernel-5.5.3-1 in updates_testing you can try...
> 


Thank you Thomas. I will try that tomorrow.
But I am a bit pessimistic as indicated here: 
https://linuxreviews.org/Linux_Kernel_5.5_Will_Not_Fix_The_Frequent_Intel_GPU_Hangs_In_Recent_Kernels
Comment 42 Thomas Backlund 2020-02-12 19:32:39 CET
(In reply to Thomas Backlund from comment #40)
> Note that there is now a kernel-5.5.3-1 in updates_testing you can try...
> 
> Note that I havent fixed virtualbox for 5.5 series yet...

Actually, you might want to wait for the next build...

Upstream has identified 2 missing crash fixes in 5.5 (I already have some others added)
Comment 43 Thomas Backlund 2020-02-15 13:16:54 CET
There is now a  kernel-5.5.4-1.mga8 in testing
Comment 44 claire robinson 2020-02-15 15:10:35 CET
Mga7 too.


tmb <tmb> 5.5.4-1.mga7:
+ Revision: 1525734
- drm/i915: Serialise i915_active_acquire() with __active_retire()


Will give it a try.
Comment 45 Thomas Backlund 2020-02-15 15:45:26 CET
Yeah, I meant the .mga7 one :)
Comment 46 claire robinson 2020-02-15 21:05:13 CET
Still waiting for my mirror to sync. It's about 8 hours behind at the moment.

It might have to be tomorrow now.
Comment 47 Thomas Backlund 2020-02-15 21:06:50 CET
Yeah, the mga8 distro rebuild makes mirroring slow
Comment 48 claire robinson 2020-02-16 17:44:15 CET
I've been using the 5.5.4-1 kernel this afternoon, doing the same things I was before, when I ran in to the problem, and not had a recurrence.

I also tried suspend/resume and attempting the same again. So far so good.

I've watched some youtube at the same time, which should use the card for decoding IINM and played with glxgears and teapot. Also allowed the screen to dim and blank before reviving it, with no ill effects.

I've followed the journal throughout and so far not one mention of it.

$ uname -a
Linux localhost.localdomain 5.5.4-desktop-1.mga7 #1 SMP Sat Feb 15 08:41:16 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ head /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 142
model name      : Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
stepping        : 9
microcode       : 0xca
cpu MHz         : 2364.522
cache size      : 3072 KB
physical id     : 0

$ lspcidrake -vvv | grep ^Card
Card:Intel 810 and later: Intel Corporation|HD Graphics 620 [DISPLAY_VGA] (vendor:8086 device:5916 subv:103c subd:8215) (rev: 02)
Comment 49 Jan Smout 2020-02-18 14:06:40 CET
ok. Finally installed 5.5.4-1 and played around stress testing with gdkgears + other opengl app. Normally that would make this system hang in under a minute. Am now at 1.5 hours.

No hangs, no logs in the journal. So far so good ^_^

I'll let it run now for a couple of days and will report back on Thursday when I will be using an additional machine.

# uname -a
Linux temp7 5.5.4-desktop-1.mga7 #1 SMP Sat Feb 15 08:41:16 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

# head /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 158
model name      : Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz
stepping        : 10
microcode       : 0xca
cpu MHz         : 2100.038
cache size      : 9216 KB
physical id     : 0

# lspcidrake -vvv | grep ^Card
Card:Intel 810 and later: Intel Corporation|UHD Graphics 630 (Desktop) [DISPLAY_VGA] (vendor:8086 device:3e92 subv:1028 subd:085a)
Frédéric "LpSolit" Buclin 2020-02-19 00:47:58 CET

Depends on: (none) => 26202

Comment 50 Frédéric "LpSolit" Buclin 2020-02-19 00:50:06 CET
Upgrading the kernel to 5.5.4-desktop-1.mga7 also fixes the problem for me.
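
To check on several machines whether the running kernel is at least the 5.5.4 build that the comments above report as stable, the version reported by uname -r can be compared against a minimum. This is a rough sketch: kernel_at_least is a hypothetical helper name, it relies on the -V (version sort) option of GNU sort, and it only compares the numeric part before the first hyphen, ignoring the flavour and package release.

```shell
#!/bin/sh
# Succeed if the kernel version is at least the wanted one.
# Usage: kernel_at_least WANTED [CURRENT]   (CURRENT defaults to `uname -r`)
# (hypothetical helper; assumes GNU sort with -V)
kernel_at_least() {
    wanted="$1"
    current="${2:-$(uname -r)}"
    # Keep only the upstream version, e.g. 5.5.4-desktop-1.mga7 -> 5.5.4
    current="${current%%-*}"
    # With a version sort, the wanted version must come first (or be equal).
    lowest=$(printf '%s\n%s\n' "$wanted" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$wanted" ]
}
```

For example, "kernel_at_least 5.5.4 && echo ok" prints ok on a machine running 5.5.4-desktop-1.mga7 and prints nothing on one still running 5.4.7-desktop-1.mga7.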
Comment 51 Rubén Fernández 2020-02-19 20:44:09 CET
I also upgraded the kernel and so far in the last two days haven't got any hang. Videoconferencing with Chromium would always freeze Mageia 7, now it's working.

CC: (none) => ruben33en-mandriva

Comment 52 Jan Smout 2020-02-20 17:03:35 CET
also the other machine behaves correctly. Have been working with it all day without a glitch. The journal has no trace of troubles...

# uname -a
Linux escher 5.5.4-desktop-1.mga7 #1 SMP Sat Feb 15 08:41:16 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

# head /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 158
model name      : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
stepping        : 10
microcode       : 0xca
cpu MHz         : 4298.168
cache size      : 12288 KB
physical id     : 0

# lspcidrake -vvv | grep ^Card
Card:Intel 810 and later: Intel Corporation|UHD Graphics 630 (Desktop) [DISPLAY_VGA] (vendor:8086 device:3e92 subv:1043 subd:8694)


Note: i915 loads firmware i915/kbl_dmc_ver1_04.bin (v1.4) from kernel-firmware-nonfree. I have currently the released one : kernel-firmware-nonfree-20191220-1

But there is another one in updates_testing : kernel-firmware-nonfree-20200121

Unsure if this can have an influence. Will keep it in mind when it hits the updates release
Comment 53 Thomas Backlund 2020-02-22 00:11:26 CET
An update for this issue has been pushed to the Mageia Updates repository.

https://advisories.mageia.org/MGAA-2020-0059.html

Resolution: (none) => FIXED
Status: REOPENED => RESOLVED

Comment 54 claire robinson 2020-02-22 01:27:31 CET
Thank you Thomas