Bug 27833

Summary: Some AMD Radeon cards with kernel 5.10.1 can result in a non-working display
Product: Mageia
Reporter: Brian Rockwell <brtians1>
Component: RPM Packages
Assignee: Kernel and Drivers maintainers <kernel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: Normal
CC: fri, joequant, mageia, office, ouaurelien
Version: Cauldron
Target Milestone: ---
Hardware: All
OS: Linux
Whiteboard:
Source RPM: kernel-desktop-5.10.1-1.mga8-1-1.mga8.x86_64
CVE:
Status comment: fixed in 5.10.2 and updated nonfree firmware
Attachments: DMESG file
Journal File
screen shot
boot stage
boot log with nomodeset and vga=4 options
Journal file with amdgpu.dc=0 set

Description Brian Rockwell 2020-12-15 18:14:40 CET
Description of problem:  After the kernel-desktop-5.10.1-1.mga8-1-1.mga8.x86_64 push, the system no longer boots past the grub2 menu.


Version-Release number of selected component (if applicable): MGA8 Beta2 - 12/15 push


How reproducible:  Hardware  AMD A6 laptop with APU (R4 graphics).  Install requisite CPUPOWER and Kernel and reboot.  System will not get past grub2 menu.  Switched back to 5.9 kernel and it works


Steps to Reproduce:
1.  Install Kernel 5.10.1
2.  Reboot and choose default grub2 item
3.  Stare at blank screen
Comment 1 Brian Rockwell 2020-12-15 18:15:13 CET
Created attachment 12086 [details]
DMESG file
Comment 2 Brian Rockwell 2020-12-15 18:15:48 CET
Created attachment 12087 [details]
Journal File
Comment 3 Thomas Backlund 2020-12-15 18:26:39 CET
No references whatsoever to even booting 5.10.1 in the logs.

but I see it had some problems already with 5.9.12 with amdgpu

 RIP: 0010:dc_link_set_backlight_level+0x8a/0xf0 [amdgpu]


Can you remove "splash quiet" from the kernel command line when you boot 5.10, to maybe catch some output?
Comment 4 Brian Rockwell 2020-12-15 20:20:51 CET
Tried it.  No luck, it flashes by way too fast.

I also tried the parameter amdgpu.noretry=0; that didn't fix it either.
Comment 5 Thomas Backlund 2020-12-15 20:34:15 CET
OK, so then it does not seem to crash, at least on boot.

After it has booted to the black screen, can you try to ssh into the machine?
(Remember to install an ssh server and open the firewall port, or disable the firewall temporarily.)

Or, after it has reached the black screen, wait a bit, reboot into the working kernel and grab the previous boot's journal with:

journalctl -b -1 > bootlog 

and attach it here...
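
A minimal sketch of narrowing that journal down to the graphics-related lines before attaching it. The helper name gpu_lines and the search pattern are assumptions made for illustration, not something the maintainers asked for; the full bootlog is still the more useful attachment.

```shell
# Hypothetical helper: keep only GPU-related lines from a journal dump on stdin.
# The pattern is a guess at what is relevant for an amdgpu problem.
gpu_lines() {
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # being treated as an error by the caller.
    grep -i -E 'amdgpu|radeon|drm' || true
}

# Usage, after rebooting into the working kernel:
#   journalctl -b -1 --no-pager > bootlog
#   gpu_lines < bootlog
```

If the output is empty for the failed boot, that is worth mentioning too, since it may mean the journal never recorded that boot.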
Comment 6 Lewis Smith 2020-12-15 20:51:34 CET
Nothing I can try here, Grubless! My own EFI M8 system with 5.10.1-desktop-1.mga8 boots fine via rEFInd (kernel stub).

CC'ing Aurélien, who I think uses Grub2.

CC: (none) => ouaurelien

Comment 7 Aurelien Oudelet 2020-12-15 21:20:06 CET
@Brian, 

Please edit the kernel 5.10.1 command line: add "systemd.unit=multi-user.target" and remove "splash quiet". Press F10 to boot. Don't include the quotes.

This will boot your system to console mode, without a graphical session.
We need to know at which point the process blows up on your system.

Feel free to take a picture of your monitor if boot freezes.
Add here or send me by mail.
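
To illustrate what the edited line should end up looking like, here is a small sketch that applies the same change to a command line given as text. The function name debug_cmdline is hypothetical, and this only transforms a string; the actual edit is still done by hand in the grub2 editor.

```shell
# Hypothetical helper: rewrite a kernel command line for a text-mode debug boot.
# Drops "splash" and "quiet" and appends systemd.unit=multi-user.target.
# The body runs in a subshell so that "set -f" (no globbing) does not leak out.
debug_cmdline() (
    set -f
    out=""
    for word in $1; do
        case "$word" in
            splash|quiet) ;;        # options that hide boot messages
            systemd.unit=*) ;;      # any earlier target override
            *) out="${out:+$out }$word" ;;
        esac
    done
    printf '%s\n' "${out:+$out }systemd.unit=multi-user.target"
)

# Usage: show what the current command line would become
#   debug_cmdline "$(cat /proc/cmdline)"
```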
Comment 8 Brian Rockwell 2020-12-16 04:22:45 CET
Hi,
Tried this with 5.10.1 and it did not work, same crash out to blank screen.

I change the command line in 5.9.12 and get to terminal.  If I really really screw up the command line I get to a Kernel panic.  I'll attach the photo, but I think it is just me flummoxing up the system so bad it doesn't know what is left and right.

Looking up notes on Kernel 5.10.1, it does seem that AMD slipped in some APU/GPU changes into the kernel that might not be fully vetted, at least that's what I interpreted in it.

I'll try and attach the complete flummoxed screen print.
Comment 9 Brian Rockwell 2020-12-16 04:25:51 CET
Created attachment 12089 [details]
screen shot
Comment 10 Aurelien Oudelet 2020-12-16 07:54:26 CET
(In reply to Brian Rockwell from comment #8)
> Hi,
> Tried this with 5.10.1 and it did not work, same crash out to blank screen.
> 
> I change the command line in 5.9.12 and get to terminal.  If I really really
> screw up the command line I get to a Kernel panic.  I'll attach the photo,
> but I think it is just me flummoxing up the system so bad it doesn't know
> what is left and right.
> 
> Looking up notes on Kernel 5.10.1, it does seem that AMD slipped in some
> APU/GPU changes into the kernel that might not be fully vetted, at least
> that's what I interpreted in it.
> 
> I'll try and attach the complete flummoxed screen print.

All I see in this kernel panic is that it can't find the root device.
This is really strange, and I'd rather suspect a bad initrd image.
Comment 11 Aurelien Oudelet 2020-12-16 11:35:46 CET
*** Bug 27845 has been marked as a duplicate of this bug. ***

CC: (none) => joequant

Cristian Pîrîu 2020-12-16 13:02:41 CET

CC: (none) => office

PC LX 2020-12-16 13:06:58 CET

CC: (none) => mageia

Comment 12 Brian Rockwell 2020-12-16 14:22:46 CET
Aurelien,
As I said, the command you originally provided went to a blank screen as well.  So I messed with the initrd settings to see if I could convince it to run.

Probably not valuable, but noted it anyhow.
Comment 13 Thomas Backlund 2020-12-16 14:35:45 CET
 
OK, so at least one R4 and one R5 APU fail.
Comment 14 Cristian Pîrîu 2020-12-16 15:03:56 CET
Created attachment 12093 [details]
boot stage
Comment 15 Cristian Pîrîu 2020-12-16 15:04:22 CET
Same problem on my dual GPU laptop, both are R7 (Carrizo + M340). Using the "nomodeset" loading option brings me to this stage.
Comment 16 Thomas Backlund 2020-12-16 23:40:11 CET
There are now a few packages to test:

kernel-5.10.1-2.mga8
x11-driver-video-amdgpu-19.1.0-6.mga8
mesa-20.3.1-1.mga8 (currently building)
Comment 17 Thomas Backlund 2020-12-16 23:42:28 CET
(In reply to Cristian Pîrîu from comment #15)
> Same problem on my dual GPU laptop, both are R7 (Carrizo + M340). Using the
> "nomodeset" loading option brings me to this stage.

yeah, that happens because "nomodeset" blocks modesetting drivers, and amdgpu is a modeset-only driver, so it is prevented from loading properly
Comment 18 Cristian Pîrîu 2020-12-17 10:08:05 CET
(In reply to Thomas Backlund from comment #17)
> (In reply to Cristian Pîrîu from comment #15)
> > Same problem on my dual GPU laptop, both are R7 (Carrizo + M340). Using the
> > "nomodeset" loading option brings me to this stage.
> 
> yeah, that happens because "nomodeset" blocks modesetting drivers, and
> amdgpu is a modeset-only driver, so it is prevented from loading properly

I'm sorry, my message was not clear. Using the "nomodeset" option allows login as root in "safe mode"; without it, login is impossible. The problem persists with kernel-5.10.1-2.mga8. It seems to me that amdgpu chooses the wrong GPU/output in the dual GPU configuration.
Comment 19 Thomas Backlund 2020-12-17 13:50:11 CET
(In reply to Cristian Pîrîu from comment #18)

> I'm sorry, my message was not clear. Using the "nomodeset" option allows
> login as root in "safe mode"; without it, login is impossible. The problem
> persists with kernel-5.10.1-2.mga8. It seems to me that amdgpu chooses the
> wrong GPU/output in the dual GPU configuration.


Can you ssh into it to see if it otherwise seems to work?
Anything special in dmesg or the journal?
Comment 20 Cristian Pîrîu 2020-12-18 09:16:46 CET
Created attachment 12103 [details]
boot log with nomodeset and vga=4 options

Bootlog attached, I didn't see anything strange in it. In another system (desktop), with integrated gpu (disabled in the BIOS) and a separate RX550 GPU, everything works as expected.
Comment 21 Thomas Backlund 2020-12-18 10:46:05 CET
Can you set things up so you have ssh access to the system, and then boot without adding the "nomodeset" part? That way we can maybe see how far the system actually gets and grab the logs from it.
Comment 22 Cristian Pîrîu 2020-12-18 11:49:40 CET
Good news, everyone! Adding "amdgpu.dc=0" to kernel boot parameters is a temporary solution, everything works ok, for now. I have not been able to access my laptop using ssh.
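
For anyone else trying this workaround, a small sketch for confirming the parameter really reached the kernel. The helper name cmdline_has is hypothetical; /proc/cmdline is the standard place to read the booted command line, and the /sys path is assumed to be where the loaded amdgpu module exposes its dc parameter.

```shell
# Hypothetical helper: succeed if the kernel command line in $1 contains
# the exact space-separated word in $2 (no partial matches).
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the running system:
#   cmdline_has "$(cat /proc/cmdline)" amdgpu.dc=0 && echo "amdgpu.dc=0 is on the command line"
#   cat /sys/module/amdgpu/parameters/dc    # value the loaded driver reports (assumed path)
```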
Morgan Leijström 2020-12-18 12:05:39 CET

Assignee: bugsquad => kernel
CC: (none) => fri

Comment 23 Thomas Backlund 2020-12-18 18:05:48 CET
(In reply to Cristian Pîrîu from comment #22)
> Good news, everyone! Adding "amdgpu.dc=0" to kernel boot parameters is a
> temporary solution, everything works ok, for now. I have not been able to
> access my laptop using ssh.

Nice, that could help track down the changes causing this.
Comment 24 Brian Rockwell 2020-12-19 16:45:14 CET
Created attachment 12119 [details]
Journal file with amdgpu.dc=0 set

Hi all,
The kernel setting amdgpu.dc=0 worked on my APU as well.

I've attached the journal.  In it you'll see me boot 5.9.12, apply the latest updates, and try the new kernel again, which locked up.  I then went back to 5.9.12, looked up the parameter, and tried again with 5.10.1-2 with the setting applied.

It worked.

Let me know if you need anything else.
Comment 25 Thomas Backlund 2020-12-21 14:00:32 CET
There is now a kernel-5.10.2-1.mga8 building where I've backported another fix from upstream that might fix this...
Comment 26 Cristian Pîrîu 2020-12-22 07:24:22 CET
I can confirm that the kernel-5.10.2-1.mga8 works ok, as intended.
Comment 27 Aurelien Oudelet 2020-12-22 07:29:16 CET
(In reply to Thomas Backlund from comment #25)
> There is now a kernel-5.10.2-1.mga8 building where I've backported another
> fix from upstream that might fix this...

(In reply to Cristian Pîrîu from comment #26)
> I can confirm that the kernel-5.10.2-1.mga8 works ok, as intended.

It seems we have a good candidate fix for this issue.
I pinged the French forum and IRC to relay testing of this.
Comment 28 Brian Rockwell 2020-12-22 19:42:53 CET
Installed update and removed amdgpu.dc=0 from kernel parms.

System booted and is working fine.

$ uname -a
Linux linux.local 5.10.2-desktop-1.mga8 #1 SMP Mon Dec 21 13:01:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Need anything from me?
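
A rough sketch for checking the same thing from a script: whether the running kernel is at or past 5.10.2. The function name kernel_at_least is hypothetical, and it relies on "sort -V" from GNU coreutils. It only compares the kernel version, so it says nothing about the updated nonfree firmware.

```shell
# Hypothetical helper: succeed if kernel release $1 (as printed by "uname -r")
# is at least version $2. Everything after the first "-" in $1 is ignored.
kernel_at_least() {
    cur=${1%%-*}    # "5.10.2-desktop-1.mga8" -> "5.10.2"
    min=$2
    # With version sort, the minimum must come first (or be equal).
    lowest=$(printf '%s\n%s\n' "$min" "$cur" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

# Usage:
#   kernel_at_least "$(uname -r)" 5.10.2 && echo "running 5.10.2 or later"
```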
Comment 29 Aurelien Oudelet 2020-12-22 19:52:57 CET
(In reply to Brian Rockwell from comment #28)
> Installed update and removed amdgpu.dc=0 from kernel parms.
> 
> System booted and is working fine.
> 
> $ uname -a
> Linux linux.local 5.10.2-desktop-1.mga8 #1 SMP Mon Dec 21 13:01:59 UTC 2020
> x86_64 x86_64 x86_64 GNU/Linux
> 
> Need anything from me?

Thanks for testing this.

Closing this.

Status comment: (none) => fixed in 5.10.2 and updated nonfree firmware
Summary: Kernel 5.10.1 X86_64 (desktop) - unable to see login after update => Some AMD Radeon cards with kernel 5.10.1 can result in a non-working display

Comment 30 Aurelien Oudelet 2020-12-22 19:53:29 CET
Really.

Status: NEW => RESOLVED
Resolution: (none) => FIXED