Bug 28154 - Live media not booting with some Radeon HD GPU or system installed with Classic ISO
Summary: Live media not booting with some Radeon HD GPU or system installed with Class...
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: Release (media or process)
Version: Cauldron
Hardware: All Linux
Priority: release_blocker major
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords: IN_ERRATA8
Duplicates: 27878 28232 28234
Depends on:
Blocks:
 
Reported: 2021-01-18 03:05 CET by Thomas Andrews
Modified: 2022-01-19 15:54 CET
CC List: 9 users

See Also:
Source RPM:
CVE:
Status comment:


Attachments
dmesg from terminal before "startx" (69.51 KB, text/plain)
2021-01-23 16:04 CET, Thomas Andrews
journal from a locked-up sddm login (162.10 KB, text/plain)
2021-01-23 16:06 CET, Thomas Andrews
Journal from a boot using run level 3 and "startx." (177.45 KB, text/plain)
2021-01-23 16:14 CET, Thomas Andrews
dmesg after Plasma desktop is showing from a "startx" boot (69.77 KB, text/plain)
2021-01-23 16:38 CET, Thomas Andrews
as requested by Aurelien Oudelet (3.48 KB, text/plain)
2021-01-27 16:02 CET, peter winterflood
screen photo of failed startx (658.69 KB, image/jpeg)
2021-01-27 16:07 CET, Thomas Andrews
Boot up processes in rescue mode (54.07 KB, text/plain)
2021-01-27 19:20 CET, ian trump
System still hangs after the updates with one more ERROR included (490.70 KB, image/jpeg)
2021-01-28 01:34 CET, ian trump
No "nokmsboot" when entered grub screen and now System boots correctly. (903.69 KB, image/jpeg)
2021-01-28 09:06 CET, ian trump
Latest updates prevents system booting again (18.60 KB, image/jpeg)
2021-02-01 02:49 CET, ian trump
Latest updates prevents system booting again (18.60 KB, image/jpeg)
2021-02-01 02:51 CET, ian trump
Boot process hangs with ati driver (870.92 KB, image/jpeg)
2021-02-01 15:49 CET, ian trump
Screenshot of where system hangs with ATI radeon driver (870.92 KB, image/jpeg)
2021-02-01 15:51 CET, ian trump
cat /etc/X11/xorg.conf (8.02 KB, text/plain)
2021-02-01 20:03 CET, ian trump
lspcidrake -v file (8.09 KB, text/plain)
2021-02-03 22:34 CET, ian trump
dmesg.txt file (7.74 KB, text/plain)
2021-02-03 22:35 CET, ian trump
dmesg.txt (66.62 KB, text/plain)
2021-02-03 23:04 CET, ian trump
amd list in drakx11 from updated ldetect.lst (52.65 KB, image/jpeg)
2021-02-08 14:46 CET, Thomas Andrews

Description Thomas Andrews 2021-01-18 03:05:13 CET
Description of problem: The Live Mageia 8 RC isos dated 2021/1/13 do not boot correctly on a Dell Dimension e520 with a Radeon HD 8570 video card. Plasma and Xfce both boot to a black screen similar to that described in Bug 26829. Gnome will boot, but freezes up within a minute or two. The workaround from Bug 26829 does not help.

The CI isos will boot, and allow you to install. So far I have installed Plasma and Gnome DEs at the same time. But I have to use the "Radeon HD 5000 and later without free driver (vesa/fglrx)" option; otherwise the system either won't boot or freezes up shortly after booting.

Another system with a Radeon GPU, this one the HD 8490, is unaffected. It works just fine with the driver the system selects for it.


How reproducible: Every time.

For now, I'll label it as "Major," as only the one GPU seems to be affected, and installed systems can be worked around fairly easily if the user makes the effort to learn how. There may be a kernel option that would allow the Lives to work, but I don't know what it is.

Sorry I didn't see this sooner - this hardware has been in temporary storage for the last month or so. It worked with earlier isos.
Comment 1 Thomas Backlund 2021-01-18 08:13:36 CET
is radeon-firmware installed?

is x11-driver-video-amdgpu installed ?


if you remove /etc/X11/xorg.conf and reboot, does it work then ?
(take a backup of it first)

what's the output of lspcidrake -v |grep -i card
Comment 2 Thomas Andrews 2021-01-18 17:11:43 CET
(In reply to Thomas Backlund from comment #1)
> is radeon-firmware installed?
> 
Yes. 

> is x11-driver-video-amdgpu installed ?
> 
Yes. So is x11-driver-video-ati.
> 
> if you remove /etc/X11/xorg.conf and reboot, does it work then ?
> (take a backup of it first)
> 
No. I booted with "splash quiet" removed from the kernel options, and it went on for a while, finally stalling out. Just before the stall-out, I saw two messages that went "*ERROR* VGACON disables amdgpu kernel modesetting." Unknown if it actually means anything.

> what's the output of lspcidrake -v |grep -i card

Card:ATI Radeon HD 6400 and later (radeon/fglrx): Advanced Micro Devices, Inc. [AMD/ATI]|Oland [Radeon HD 8570 / R5 430 OEM / R7 240/340 / Radeon 520 OEM] [DISPLAY_VGA] (vendor:1002 device:6611 subv:1028 subd:210b)
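
The vendor and device IDs at the end of that lspcidrake line (1002:6611 here) identify the chip independently of the marketing name. A minimal sketch for pulling them out of such a line, assuming the "(vendor:... device:..." layout shown above; the function name is made up for illustration:

```shell
#!/bin/sh
# Read lspcidrake -v lines on stdin and print "vendor:device" for each
# line that carries a "(vendor:XXXX device:YYYY ..." suffix.
pci_id_from_lspcidrake() {
    sed -n 's/.*(vendor:\([0-9a-fA-F]*\) device:\([0-9a-fA-F]*\).*/\1:\2/p'
}

# Sample input copied from the output quoted in this comment.
line='Card:ATI Radeon HD 6400 and later (radeon/fglrx): Advanced Micro Devices, Inc. [AMD/ATI]|Oland [Radeon HD 8570 / R5 430 OEM / R7 240/340 / Radeon 520 OEM] [DISPLAY_VGA] (vendor:1002 device:6611 subv:1028 subd:210b)'
printf '%s\n' "$line" | pci_id_from_lspcidrake
```

On a live system the input would come from lspcidrake -v | grep -i card instead of the sample string.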
Comment 3 Thomas Andrews 2021-01-18 17:18:34 CET
FWIW, this card has also malfunctioned in Mageia 7, but it only seems to affect sddm. If I boot into run level 3 and run "startx" or if I set autologin, or if I use a different DM, everything was fine. The same was true of the Cauldron beta 1 isos.

See bug 26994.
Comment 4 Aurelien Oudelet 2021-01-18 22:12:21 CET
If you set x11-driver-video-ati, it should not load amdgpu kernel modesetting.
Seems a misdirection here?

Assigning Kernel and Drivers maintainers.

CC: (none) => ouaurelien
Assignee: bugsquad => kernel

Comment 5 Thomas Andrews 2021-01-22 20:15:19 CET
Moving this to release blocker, because we need to get to the bottom of it before we can know just how widespread it is, how many AMD gpus are affected. If not many, then the priority can be reconsidered.

I'm off now to see if I can gather more information.

Priority: Normal => release_blocker

Comment 6 Thomas Backlund 2021-01-22 21:44:30 CET
well, if only sddm gets into trouble, and other DMs work, it's not really a kernel issue...

but on this setup I'd like you to test the: sddm-0.19.0-6.mga8 that is in Cauldron Core Updates Testing.
Comment 7 Thomas Andrews 2021-01-22 22:11:51 CET
I just did. If sddm is the DM, and the ati driver is being used, the login here acts like the one in Mageia 7 (Bug 28170)

Updating to sddm-0.19.0-6.mga8 made no difference.
Comment 8 Thomas Andrews 2021-01-22 23:50:21 CET
OK, let me try to summarize what I am seeing better:

Plasma Live: Boots to a black screen with an arrow-shaped mouse cursor that moves with the mouse. No response to mouse clicks, right or left. No response to keyboard. (It's possible that a left-click would bring up a context menu if I wait long enough, but I didn't have the patience.)

Xfce Live: Boots to a desktop with a background that looks like so much noise. Mageia Welcome looks normal. Top and bottom panels look normal. Mouse cursor that moves with the mouse, but no response anywhere from keyboard or mouse.

Gnome Live: Boots to a working desktop at first, but after one to three minutes it freezes. Mouse cursor won't move, no response to clicks or keyboard.

64-bit CI: Seems to work, as far as it goes. I was able to install Plasma and Gnome.

The only way I could boot to a working desktop was to use the vesa driver option from the list of Radeon drivers. After using urpmi --auto-update to get the latest updates, this is the situation:

Drakx11 suggests that I use the "HD 6400 and later" driver, rather than the "HD 5000 to HD 6300" driver suggested by Mageia 7. But it doesn't matter which of those two is selected; the "ati" driver is the one that's used.

If sddm is used as the DM with the ati driver, the login screen is unresponsive. If runlevel 3 is used to log in, followed by "startx", the result is the black screen from earlier in this comment. Right-clicking will eventually bring up the Plasma logout screen (logout, shutdown, restart, etc.), and one can reboot or shut down.

If gdm is used to login to Plasma, the result is a light gray-blue screen with a movable cursor. A right-click will bring up the context menu, which then responds to clicks. Choosing "Enter edit mode" brings up the desktop, and it stays up after leaving edit mode. Things act more or less normally for the rest of the session, though I didn't stay for long.

If gdm is used to log into Gnome (Wayland? It just said "Gnome"), I get the same symptoms I saw with Gnome Live. Gnome Classic does the same, except that the cursor doesn't freeze. Same thing for "Gnome on Xorg."

So it isn't just sddm, it's everything. My next step? I'll try installing Gnome and Plasma on another machine, this one with an AMD HD 8490 GPU. That one also uses the ati driver. Unfortunately, that machine uses an AMD processor, where my problem child is Intel. But, we'll see. I'll report on that one tomorrow.
Comment 9 Thomas Backlund 2021-01-23 00:37:45 CET
when you get to terminal before using startx, what does:

systemctl status sddm say ?

also please provide dmesg and journal logs so we maybe can get some info where it fails...
Comment 10 Thomas Andrews 2021-01-23 02:24:18 CET
On other hardware, AMD Phenom II X4 910, with an HD 8490 video card, also using the ati driver:

All Lives boot to working desktops, and don't freeze. The Xfce background at first is that noise display with Mageia Welcome on top of it, but it switches to the Mageia background within a few seconds.

After the installation, both gdm and sddm boot to working desktops, both in Plasma and Gnome. No freezing whatsoever, and everything works.

So, the differences between the systems, non-working vs working:

The HD 8570 is an "oland" chip, while the HD 8490 is a "caicos" chip. 
The Core2Quad vs the AMD Phenom II X4.
Four GB of RAM vs eight.
Viewsonic monitor with a DVI connection vs a Dell monitor that's VGA-only, connected through a DVI-I to VGA adapter.

Thomas, I will get the info you requested - tomorrow. Right now, I need some food.
Comment 11 Thomas Andrews 2021-01-23 16:00:13 CET
(In reply to Thomas Backlund from comment #9)
> when you get to terminal before using startx, what does:
> 
> systemctl status sddm say ?
> 
● sddm.service - Simple Desktop Display Manager
     Loaded: loaded (/usr/lib/systemd/system/sddm.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:sddm(1)
             man:sddm.conf(5)

> also please provide dmesg and journal logs so we maybe can get some info
> where it fails...

On the way.
Comment 12 Thomas Andrews 2021-01-23 16:04:35 CET
Created attachment 12250 [details]
dmesg from terminal before "startx"
Comment 13 Thomas Andrews 2021-01-23 16:06:00 CET
Created attachment 12251 [details]
journal from a locked-up sddm login
Comment 14 Thomas Andrews 2021-01-23 16:14:51 CET
Created attachment 12252 [details]
Journal from a boot using run level 3 and "startx."

At first this boot showed a black screen with a cursor. LED indicated lots of hard drive activity. Once that stopped, I was able to right-click to get a context menu, selected "enter edit mode" and the desktop appeared, without the panel. Exited edit mode, still no panel, but when I moved the cursor over where it would be, the panel appeared. Once all that had happened, the desktop acted normally.
Comment 15 Thomas Andrews 2021-01-23 16:38:18 CET
Created attachment 12253 [details]
dmesg after Plasma desktop is showing from a "startx" boot
Comment 16 Aurelien Oudelet 2021-01-23 17:22:08 CET
@Thomas Andrews:
The only lines relevant for this bug are:
> kernel: radeon_dp_aux_transfer_native: 116 callbacks suppressed

For your hardware, the "radeon" module does not seem to be the right one. From the attachments in Comment 12 and Comment 13, it is loaded and gives you a radeondrmfb framebuffer console, which seems to prevent Xorg autoconfiguration.

(In reply to Thomas Andrews from comment #15)
> Created attachment 12253 [details]
> dmesg after Plasma desktop is showing from a "startx" boot
In this, I really can't see Plasma starting up...

For the following test, please remove "quiet" and "splash" from the kernel command line by deleting them from the GRUB_CMDLINE_LINUX_DEFAULT= key in /etc/default/grub.

For this test, also remove unnecessary xorg.conf settings for your device and let Xorg autoconfiguration run on its own. Don't add a specific value/module for your card.

We should find out which driver is suitable: ati? radeon? amdgpu?
The best way to test this is to create, as root, the following file:

/etc/modprobe.d/10-blacklist-amd.conf
blacklist radeon
blacklist radeondrmfb

and then run
# dracut -f

If it no longer boots, the initrd has been broken by blacklisting the module that is actually needed.

Test as far as you can by replacing "radeon" with "ati" or "amdgpu". I don't know what other ATI/Radeon/AMD-compatible module names there are...

As a later step, can you try adding "nokmsboot" to the kernel command line?
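
The blacklist test above can be scripted so that each candidate module gets the same treatment. A minimal sketch, assuming only the file name suggested in this comment; the helper name is hypothetical, and it prints the file contents rather than writing to /etc so it can be tried without root. Note that "ati" is the name of the Xorg driver rather than a kernel module, so blacklisting it should have no effect; the kernel modules in play are radeon and amdgpu.

```shell
#!/bin/sh
# Print one modprobe.d "blacklist" line per module name given as argument.
blacklist_conf() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}

# Example: the contents proposed for /etc/modprobe.d/10-blacklist-amd.conf
blacklist_conf radeon radeondrmfb
```

As root, the output would be redirected to /etc/modprobe.d/10-blacklist-amd.conf and followed by dracut -f, as described above.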
Comment 17 Aurelien Oudelet 2021-01-23 17:25:30 CET
Also, please remove "vga=791" from the kernel command line, as it can sometimes mess up DRM.
Comment 18 Thomas Andrews 2021-01-23 21:10:08 CET
I've been doing some investigating of my own...

I bought this card after the Mageia 7.1 isos were released, and I never did try the Live Plasma 7.1 iso with it. I did a little while ago. It didn't boot into Live mode, either. That means this problem has been around for a while.

According to the release notes for the latest AMD software, dated a month ago, at
https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-45
"AMDGPU, our all-open graphics driver for Linux, can be used with all our Radeon HD 7000 series or newer. " Other references dated earlier note that there is some sort of "switch" that has to be done when the kernel mods are built to activate support for cards like mine, but these release notes make no mention of it that I see.

Our drakx11 chooses the ati driver for the cards that are "HD 6400 and newer." Perhaps that should be changed, though it's probably too late to do it for Mageia 8. 

A Gentoo wiki page, https://wiki.gentoo.org/wiki/AMDGPU says that the amdgpu driver will work with my HD 8570 card, but that radeon needs to be blacklisted. (Just as you suggest)

The ati driver is the one that's set up now. I guess the easiest thing to try first is to blacklist radeon and see what happens. Be back in a while.
Comment 19 Thomas Backlund 2021-01-23 21:48:50 CET
(In reply to Thomas Andrews from comment #18)

> According to the release notes for the latest AMD software, dated a month
> ago, at
> https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-45
> "AMDGPU, our all-open graphics driver for Linux, can be used with all our
> Radeon HD 7000 series or newer. " Other references dated earlier note that

Yeah, I have fixes queued to switch all gpus supported by amdgpu to use that, as that's where all the development / enhancement happens.

the radeon/ati bits don't really get any real attention for fixes anymore...
Comment 20 Thomas Andrews 2021-01-24 01:08:01 CET
I'm about ready to rip this stupid card out of this system and go back to the on-board Intel video...

"amdgpu" won't work at all. Doesn't matter what is or isn't blacklisted. After what little reading I've done, my guess is that before it can work with this gpu it has to be configured differently when it's built. 

"ati" won't work at all if "radeon" is blacklisted. If "radeondrmfb" is blacklisted but not radeon, or if nothing "radeon" is blacklisted, you get an unresponsive sddm login screen.

If xorg.conf is set to use "radeon" (and of course no blacklisted radeonstuff) you also get an unresponsive sddm login screen.
Comment 21 Aurelien Oudelet 2021-01-24 22:11:12 CET
Have you tried this in Kernel command line:

radeon.modeset=0

?
Comment 22 Giuseppe Ghibò 2021-01-24 22:37:09 CET
You said that it only happens in sddm, which actually seems to have other problems too (and on chipsets other than AMD), and not in any other DMs. Maybe draw attention to sddm and report the problem to upstream sddm?

CC: (none) => ghibomgx

Comment 23 Thomas Andrews 2021-01-25 02:34:19 CET
(In reply to Aurelien Oudelet from comment #21)
> Have you tried this in Kernel command line:
> 
> radeon.modeset=0
> 
> ?

Didn't work. It stalls out with the screen indicating it happened shortly after doing some stuff with the rtl8192cu wifi adapter.
Comment 24 Aurelien Oudelet 2021-01-26 10:10:09 CET
*** Bug 27878 has been marked as a duplicate of this bug. ***

CC: (none) => r0bur

Comment 25 Aurelien Oudelet 2021-01-26 10:11:49 CET
(In reply to Ihar Areshchankau from Bug 27878)
> Graphics:
>   Device-1: AMD RS690 [Radeon X1200] driver: radeon v: kernel 
>   Display: server: Mageia X.org 1.20.10 driver: ati,radeon,v4l 
>   resolution: 1920x1080~60Hz 
>   OpenGL: renderer: ATI RS690 v: 2.1 Mesa 20.2.3 

It seems this machine is also affected by the errors here.
I asked for the output of lspcidrake -v

Note that M7 works for him...
Comment 26 Thomas Backlund 2021-01-26 11:42:52 CET
(In reply to Thomas Andrews from comment #2)

> Card:ATI Radeon HD 6400 and later (radeon/fglrx): Advanced Micro Devices,
> Inc. [AMD/ATI]|Oland [Radeon HD 8570 / R5 430 OEM / R7 240/340 / Radeon 520
> OEM] [DISPLAY_VGA] (vendor:1002 device:6611 subv:1028 subd:210b)

With ldetect-lst-0.6.23-2.mga8 I've switched over only this card from radeon to amdgpu so we can see if it helps.

So please test with that to re-configure your system, or wait for next round of RC isos to test with...
Comment 27 Thomas Andrews 2021-01-26 14:47:35 CET
(In reply to Thomas Backlund from comment #26)
> (In reply to Thomas Andrews from comment #2)
> 
> > Card:ATI Radeon HD 6400 and later (radeon/fglrx): Advanced Micro Devices,
> > Inc. [AMD/ATI]|Oland [Radeon HD 8570 / R5 430 OEM / R7 240/340 / Radeon 520
> > OEM] [DISPLAY_VGA] (vendor:1002 device:6611 subv:1028 subd:210b)
> 
> With ldetect-lst-0.6.23-2.mga8 I've switched over only this card from radeon
> to amdgpu so we can see if it helps.
> 
> So please test with that to re-configure your system, or wait for next round
> of RC isos to test with...

I'll see what I can do. I may have to try both. I've played with this install so much without really knowing what I'm doing that I don't know how close it is to an original M8 (with official updates) anymore.

I suspect this problem may affect the entire "Oland" family of gpus, but of course without more hardware I have no way of verifying that. Unfortunate.
Comment 28 Thomas Andrews 2021-01-26 23:27:48 CET
I used runlevel 3/startx to bring up the system as it was, much as I described in Comment 14. Desktop looked good, worked as it should. So, I used urpmi --auto-update to get all updates, including ldetect.lst and the new kernel. Then I used drakx11 through MCC to reconfigure the graphics server, where it indicated I should use the Volcanic Islands (amdgpu) option, which I did. 

Then I rebooted, with the kernel options still set as Aurelien wanted in Comment 16, except that if radeon is blacklisted it was done by drakx11. It acted exactly as it did in Comment 23. It never gets near the point where sddm would try to run. 

Not mentioned before, whenever radeon is blacklisted, or when "radeon.modeset=0" is in the kernel options, the resolution of the text on the screen does not change the way it does when the radeon driver is, um, active (? probably the wrong term). I have not tried it, but past experience tells me that if "splash quiet" were put back, the infamous three question marks would be showing during the boot, rather than the plymouth animation sequence.
Comment 29 Aurelien Oudelet 2021-01-27 11:47:43 CET
*** Bug 28232 has been marked as a duplicate of this bug. ***

CC: (none) => chrisv11c

Comment 30 Aurelien Oudelet 2021-01-27 11:49:35 CET
Reported today:

This card:

Card:ATI Radeon HD 6400 and later (radeon/fglrx): Advanced Micro Devices, Inc. [AMD/ATI]|Tahiti XT [Radeon HD 7970/8970 OEM / R9 280X] [DISPLAY_VGA] (vendor:1002 device:6798 subv:174b subd:3001)

can no longer boot with a fully updated Cauldron.

Message is:
[drm:radeon_init [radeon]] *ERROR* No UMS support in radeon module!

The reporter was added to CC here in Comment 29.
Comment 31 Thomas Backlund 2021-01-27 12:21:04 CET
(In reply to Aurelien Oudelet from comment #30)

> Message is:
> [drm:radeon_init [radeon]] *ERROR* No UMS support in radeon module!
> 

this means it tries to use radeon, but either "nomodeset", "nokmsboot" or "radeon.modeset=0" is used on kernel command line.
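
A quick way to check a kernel command line for the three options named above is sketched below. This is only an illustration, assuming nothing beyond those three option names; the function name and the sample command line (including the root=UUID=xxxx placeholder) are made up.

```shell
#!/bin/sh
# Print every token of a kernel command line ($1) that disables radeon
# kernel modesetting, one per line. Prints nothing if none is present.
kms_blockers() {
    for opt in $1; do
        case "$opt" in
            nomodeset|nokmsboot|radeon.modeset=0) printf '%s\n' "$opt" ;;
        esac
    done
}

# Hypothetical command line for illustration only.
kms_blockers "BOOT_IMAGE=/boot/vmlinuz root=UUID=xxxx ro nokmsboot splash quiet"
```

On an affected machine the argument would be "$(cat /proc/cmdline)"; any output means radeon is being asked to run without KMS, which matches the "No UMS support" error.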
Comment 32 Thomas Andrews 2021-01-27 15:38:16 CET
(In reply to Thomas Backlund from comment #31)
> (In reply to Aurelien Oudelet from comment #30)
> 
> > Message is:
> > [drm:radeon_init [radeon]] *ERROR* No UMS support in radeon module!
> > 
> 
> this means it tries to use radeon, but either "nomodeset", "nokmsboot" or
> "radeon.modeset=0" is used on kernel command line.

I just checked my problem system as it was booted in Comment 28, and discovered that "nokmsboot" had been added to the kernel command line. Must have been drakx11 that did it - *I* didn't.

So I tried removing it, and continued the boot. This time the resolution changed, and things seemed to move along more quickly, but when it got to about the same spot as before, up popped the notice that the selected driver requires "nokmsboot."

Visions of the movie "Catch-22" come to mind.
Comment 33 Aurelien Oudelet 2021-01-27 15:49:52 CET
*** Bug 28234 has been marked as a duplicate of this bug. ***

CC: (none) => peter.winterflood

Aurelien Oudelet 2021-01-27 15:50:41 CET

Summary: Live media not booting with Radeon HD 8570 GPU; also radeon driver not working with HD 8570 GPU if installed from the CI => Live media not booting with some Radeon HD GPU or system installed with Classic ISO

Comment 34 Aurelien Oudelet 2021-01-27 15:53:35 CET
See also attachments of Bug 28234 from peter winterflood
attachment 12268 [details]
attachment 12269 [details]
attachment 12270 [details]
Comment 35 peter winterflood 2021-01-27 16:02:24 CET
Created attachment 12271 [details]
as requested by Aurelien Oudelet

lspcidrake from asus N68C with hd2400 and Phenom II 955
Comment 36 peter winterflood 2021-01-27 16:05:42 CET
correction, not an asus, its an asrock, N68C
Comment 37 Thomas Andrews 2021-01-27 16:07:08 CET
Created attachment 12272 [details]
screen photo of failed startx
Comment 38 Thomas Andrews 2021-01-27 16:11:21 CET
Tried rebooting into runlevel 3 with and without "nokmsboot." Then tried "startx." Both failed. 

The screen photo is from without nokmsboot. Note that among the messages is one that says the DRM version is 2.5.0 but the driver is only compatible with 3.0.0.
Comment 39 Thomas Backlund 2021-01-27 16:23:42 CET
(In reply to Thomas Andrews from comment #38)
> Tried rebooting into runlevel 3 with and without "nokmsboot." Then tried
> "startx." Both failed. 
> 
> The screen photo is from without nokmsboot. Note that among the messages is
> one that says the DRM version is 2.5.0 but the driver is only compatible
> with 3.0.0.

ok, so there are some issues in the amdgpu stack, at least for that card.

I've reverted the test-switch to amdgpu for your card in ldetect-lst-0.6.24-2.mga8

After that is installed you should be able to reconfigure it for a radeon/ati setup.
Comment 40 Thomas Backlund 2021-01-27 16:42:23 CET
and with drakx-kbd-mouse-x11-1.32-2.mga8 installed, it will stop adding nokmsboot so we get back to normal behaviour...
Comment 41 ian trump 2021-01-27 18:27:26 CET
I would like to say that my installation of Mageia 8 beta 2 was performed using the classic installation ISO and not with the live ISO.

As far as I can tell my system will boot and work normally only if I select the rescue mode prior to booting the system; otherwise it just hangs as shown in the screenshot I supplied earlier today.
Comment 42 ian trump 2021-01-27 19:20:27 CET
Created attachment 12274 [details]
Boot up processes in rescue mode

This is the dmesg file which shows the start-up processes of my system when the rescue mode is selected to allow the system to fully boot up into the GUI.
Comment 43 Thomas Andrews 2021-01-27 21:35:40 CET
SUCCESS!

Well, with my chip, anyway. I was able to use the procedure I found at 
https://inside-out.xyz/technology/making-darktable-to-use-the-amd-radeon-gpu.html 
to get amdgpu to work with my HD 8570 gpu. The sddm login now works, and the Plasma DE now comes up normally.

The site's author was trying to get Darktable to work with an AMD gpu from a family similar to mine, and this was his first step. He went on from there to install AMDGPU PRO, but I stopped short of doing that.

The key seems to have been an amd.conf file in /etc/modprobe.d that contains the following:

blacklist radeon
options amdgpu cik_support=1
options amdgpu si_support=1
options radeon cik_support=0
options radeon si_support=0

The web page also shows a line in xorg.conf under the graphic device that says Option "TearFree" "true" at the end. I don't know if it's necessary or not, but it doesn't seem to be doing any harm.
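
For testing from the grub screen or on Live media, where editing /etc/modprobe.d is awkward, the same module options can be passed on the kernel command line in module.parameter=value form. A minimal sketch of that conversion follows; the function name is made up, and the blacklist line is deliberately left out because it is not a module option.

```shell
#!/bin/sh
# Read modprobe.d lines on stdin and turn every
#   options <module> <param>=<value> ...
# line into "<module>.<param>=<value>" kernel command line tokens,
# one per line. All other lines (e.g. "blacklist ...") are ignored.
modprobe_options_to_cmdline() {
    awk '$1 == "options" { for (i = 3; i <= NF; i++) print $2 "." $i }'
}

# The amd.conf contents quoted in this comment.
modprobe_options_to_cmdline <<'EOF'
blacklist radeon
options amdgpu cik_support=1
options amdgpu si_support=1
options radeon cik_support=0
options radeon si_support=0
EOF
```

The printed tokens can be appended to the kernel line after pressing "e" at the grub screen, which only affects that one boot.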
Comment 44 ian trump 2021-01-28 01:34:03 CET
Created attachment 12275 [details]
System still hangs after the updates with one more ERROR included

This screenshot was taken after updating the kernel to 5.10.11.1 as well as drakx-kbd-mouse-x11-1.32-2.mga8 including various other components.

The boot up process still hangs in the same place with the same ERROR message including one more and both have been duplicated.

I then restarted the PC and selected  the recovery mode and the system was then able to boot up properly and enter the GUI where everything appears to work as usual.
Comment 45 Thomas Backlund 2021-01-28 01:36:03 CET
remove "nokmsboot" from kernel command line
Comment 46 ian trump 2021-01-28 01:57:25 CET
If I knew how to perform that task I would certainly have a go, but I do not.
Comment 47 Thomas Andrews 2021-01-28 02:38:33 CET
Simplest way for a trial is to press "e" at the grub screen. Then use the arrow keys to maneuver to where it says "nokmsboot" and delete that. When finished, press Ctrl-x to continue the boot.

That change only lasts for that boot. To change it more permanently, go to MCC/Boot/Set up the boot system. Follow the prompts, and when the kernel command line appears, remove nokmsboot from it.
Comment 48 ian trump 2021-01-28 09:06:25 CET
Created attachment 12277 [details]
No "nokmsboot" when entered grub screen and now System boots correctly.

I pressed 'e' at the grub screen, and this screenshot shows that "nokmsboot" is not present.

I then pressed Ctrl-x, as there was nothing to delete, and the system booted as normal.

I thought OK, so I rebooted the system and this time I just let it run as normal without selecting the rescue mode, and all of a sudden the system booted as normal without hanging until it finally entered the GUI and everything was normal again.

I repeated the reboot and once again the system booted as normal.

Not sure what happened but I did not change or delete anything and everything seems to work now. Strange.

I would be grateful if you can explain why it now works as I really did not do anything.
Comment 49 ian trump 2021-01-28 09:53:47 CET
Can someone tell me how long it generally takes to boot the system as I have two different types of drives I use.

Here I am using an old  Western Digital 640 GB ATA drive to test Mageia 8  and it would appear to take just over one minute to fully boot.

I would like to know if this time is acceptable for this type of drive considering the advancements linux has made.
Comment 50 Thomas Andrews 2021-01-28 18:05:29 CET
(In reply to ian trump from comment #49)
> Can someone tell me how long it generally takes to boot the system as I have
> two different types of drives I use.
> 
> Here I am using an old  Western Digital 640 GB ATA drive to test Mageia 8 
> and it would appear to take just over one minute to fully boot.
> 
> I would like to know if this time is acceptable for this type of drive
> considering the advancements linux has made.

This is not the place to ask this question, unless it pertains to the bug, which in this case it doesn't. A better place would be https://forums.mageia.org/en/ or on the "discuss" mailing list. Another place to ask is on Usenet at alt.os.linux.mageia.

That said, I don't think a minute is bad at all, but boot time depends on more than just drive speed.
Comment 51 Thomas Andrews 2021-01-28 18:29:11 CET
Because of the way AMD chose to label their GPUs, the description we use on ldetect.lst is confusing to many users. I know it confused me when I first looked at it. 

One would think that the HD 8490 belongs in the "HD 6400 and newer" category, when because of its architecture it actually doesn't. One would think that both the HD 8490 and the HD 8570 are newer than an HD 7790, but they aren't.

Taking a cue from the "Volcanic Islands and newer" description, if we switch to using the amdgpu driver with cards like mine, that ldetect.lst description should maybe be changed to something like "Southern Islands and Sea Islands." 

In the above example, the HD 8490 is an "Evergreen" chip, and uses the ati driver. The HD 8570 is a "Southern Islands" chip, and the HD 7790 is a "Sea Islands" chip.

Just a thought...
Comment 52 Thomas Backlund 2021-01-28 20:18:06 CET

yeah, we have a lot of "old" naming splits dating back to when we had fglrx and it first supported almost all gpus, then it switched to HD2000 and newer, then it supported HD5000 and newer, some worked without firmware, others needed SSE and so on... then some special cards that were only supported by fglrx or vesa, then newer cards got added with only amdgpu support, and after that the nice folks at amd started adding support for older cards to the new amdgpu, which is what you see...

Then you have some HD8xxx mobile cards that actually are re-branded HD7xxx and so on...

So all in all a nice mess..

I've been thinking of simply calling them something like AMD Legacy driver (radeon), and AMD ??? driver (amdgpu)... or something like that...
even if it's nice to show something like "HD6000 and newer", that target changes depending on what AMD wants to support with the new amdgpu... and using "Evergreen" or "Southern Islands" does not help users who don't know anything about their hw other than "my computer has a graphics card"...

but as this hooks into several places in the installer, I won't touch the mess for Mageia 8
Comment 53 peter winterflood 2021-01-28 20:19:32 CET
I agree, a good reference site for this that I've found recently is

https://www.techpowerup.com/gpu-specs/radeon-hd-2400.c2087

I've also bought two 7470's to expand on my testing.

regards peter
Comment 54 peter winterflood 2021-01-29 20:51:12 CET
Confirmed that this bug, as filed by me in Bug 28234 (whether a duplicate or not), has been fixed with the 29/01/2021 classic x86_64 ISO on the same hardware it appeared on. There are some differences between this and Bug 28234, so I won't change the status.
regards peter
Comment 55 Thomas Andrews 2021-01-29 22:23:11 CET
As expected, not fixed for me. The Live Plasma iso stops short of going into Live mode. You hear the tones, then have a black screen with a cursor.

Interesting though, is that it's not necessarily a completely failed boot. I tried a ctrl-alt-backspace several times, until the arrow blinked, went away for a second, then came back as something resembling a capital "I". If I moved the cursor around, it changed back into an arrow, indicating there's some invisible text in the middle of the screen. I clicked on that, nothing. Then I tried pressing "enter" and a seemingly fully-functional Live Plasma desktop came up.

So I shut it down and tried again, but this time I only got the background and a Mageia Welcome screen. I could toggle the "don't show this" box, but nothing else. 

So, that procedure would seem to be imperfect. Oh, well.
Comment 56 Thomas Backlund 2021-01-29 23:10:03 CET
yeah, I forgot about how amd handles the drivers when both radeon and amdgpu are built....


So in your case with a Southern Islands card, boot with:
radeon.si_support=0 amdgpu.si_support=1


For a user with a Sea Islands GPU, boot with:
radeon.cik_support=0 amdgpu.cik_support=1
Comment 57 Thomas Andrews 2021-01-29 23:52:50 CET
Yes, that takes care of it.

Questions: What if Martin were just to add those options to the kernel line on the Lives by default? Would it affect gpus in other AMD families?
Comment 58 Thomas Backlund 2021-01-30 00:11:13 CET
well,

SI bit is for:  Tahiti, Pitcairn, Oland, Verde, Hainan
CIK bit is for: Kaveri, Bonaire, Hawaii, Kabini, Mullins
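
A minimal sketch of how the two chip lists above could be turned into boot options of the kind suggested in comment 56. The function name amdgpu_boot_params is made up for illustration. The chip names are the ones listed above, and si_support / cik_support are the radeon and amdgpu module parameters that decide which driver claims those two families.

```shell
# Hypothetical helper: print the boot options that hand a chip to amdgpu.
# Chip names are matched case-insensitively against the SI and CIK lists.
amdgpu_boot_params() {
    chip=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$chip" in
        tahiti|pitcairn|oland|verde|hainan)
            echo "radeon.si_support=0 amdgpu.si_support=1" ;;
        kaveri|bonaire|hawaii|kabini|mullins)
            echo "radeon.cik_support=0 amdgpu.cik_support=1" ;;
        *)
            # Not an SI or CIK chip: these options do not apply to it.
            return 1 ;;
    esac
}
```

For example, "amdgpu_boot_params Oland" prints the Southern Islands pair, which can then be appended to the kernel line at the boot menu.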
Comment 59 Thomas Andrews 2021-01-30 00:40:07 CET
I meant would it affect other than those two families. Well, I do have a HD 8490 (Caicos chip, Evergreen family, radeon driver). Guess I'll give it a try and see what happens...
Comment 60 Thomas Backlund 2021-01-30 00:42:44 CET
nope, it only affects SI and CIK as those are the only ones supported by both radeon and amdgpu
Comment 61 Thomas Andrews 2021-01-31 22:58:45 CET
So far, so good. All four Round 5 RC isos boot to a working desktop in Live mode.

Installing Plasma from the CI now...
Comment 62 Thomas Andrews 2021-01-31 23:57:12 CET
Also working with Plasma installed from the CI, so for my card at least, it's now working.

It would be nice if we had some reports from those with other AMD cards, but, lacking that, this bug appears to be fixed.
Comment 63 ian trump 2021-02-01 02:49:53 CET
Created attachment 12295 [details]
Latest updates prevents system booting again

After rebooting from latest updates the system fails to boot properly again and gives a message this time stating, TAHITI not supported in kfd, which I have attached in the screenshot.
Comment 64 ian trump 2021-02-01 02:51:57 CET
Created attachment 12296 [details]
Latest updates prevents system booting again

After rebooting from latest updates the system fails to boot properly again and gives a message this time stating, TAHITI not supported in kfd, which I have attached in the screenshot.
Comment 65 Thomas Andrews 2021-02-01 04:59:03 CET
I get the same message with my Oland card. It shouldn't prevent booting.

I believe your install is still trying to use the ati (radeon) driver, and the latest kernel switches Southern Islands support to the amdgpu driver. (Oland and Tahiti are both Southern Islands cards) So try this:

Boot into recovery mode, and log in as root. Run "drakx11" and use the arrow and enter keys to select the "Volcanic Islands" driver for your card. Even if it looks like that's the one that is in place, you still have to actually choose it. Again use the arrow and enter keys to exit back to the terminal prompt. If you were successful in choosing the new driver, somewhere along the line you'll see a message that you have to reboot for the changes to take effect. If you don't see that, run drakx11 again. You shouldn't need this, but at that point issue a "dracut -f" command, just to be sure. Once that has finished, use the "reboot" command to, well, reboot.

I'm thinking that "should" do the trick.
Comment 66 ian trump 2021-02-01 13:59:53 CET
I followed your instructions and yes I was able to boot into the GUI but it was very slow to do.

I then changed the resolution that was within the specified range of the monitor and rebooted again.

The system would once again not boot into the GUI.

I then tried to change the resolution parameter whilst in root of the rescue mode and there was no joy again so I re-installed a fresh system and applied the updates. This time when I rebooted it could not find the kernel. 

I do not know what has happened here. The display driver previously used by Mageia, and by PClinux which I also use, was the Radeon 6400 and later driver. Now suddenly Mageia wants to use the Volcanic Islands driver, which was just recently the reason the system did not boot.

I do not want to keep re-installing the system with a fresh install as I have already done this quite a number of times so far. 

I think I will keep using mageia 7.1 for now as I have not encountered these types of problems that in my opinion are quite severe  as compared to previous beta releases. I will wait until mageia 8 arrives. I hope you iron out these problems between now and then.
Comment 67 Thomas Backlund 2021-02-01 14:27:11 CET
(In reply to ian trump from comment #66)
> I followed your instructions and yes I was able to boot into the GUI but it
> was very slow to do.
> 
> I then changed the resolution that was within the specified range of the
> monitor and rebooted again.
> 
> The system would once again not boot into the GUI.

Sorry about the trouble.

We are trying to switch to the newer amdgpu for all hw that it supports, since that's where the development focus is upstream.

But that requires feedback from testers...

Now, in your:
/etc/X11/xorg.conf

do you have a line that states:

Driver "ati"

if so, can you change that to:

Driver "amdgpu"

and reboot, does that help?

if not, change back to:
Driver "ati"

and boot with this on kernel command line:
radeon.si_support=1 amdgpu.si_support=0
Comment 68 ian trump 2021-02-01 15:49:06 CET
Created attachment 12297 [details]
Boot process hangs with ati driver

This is the screenshot of where I selected the ati driver to boot, because the Volcanic Islands driver failed to log in at the GUI login screen. Now that the ati driver has been chosen, the system just hangs here.
Comment 69 ian trump 2021-02-01 15:51:04 CET
Created attachment 12298 [details]
Screensot of where system hangs with AIT radeon driver

This is the screenshot of where I selected the ati driver to boot, because the Volcanic Islands driver failed to log in at the GUI login screen. Now that the ati driver has been chosen, the system just hangs here.
Comment 70 Thomas Backlund 2021-02-01 15:54:54 CET
did you add the:

radeon.si_support=1 amdgpu.si_support=0

part when you tried to boot with ati?
Comment 71 ian trump 2021-02-01 16:50:30 CET
I am at a loss here of what to do and help you as I am not sure of where I am going or what to do now. 

I sincerely would like to help but I am at a cross roads as to what and how to proceed as I am not really qualified to go any further and I fear it would take up so much of your time and resources for you to navigate me around the system to try and help you find the answers you are looking for.
Comment 72 ian trump 2021-02-01 18:44:01 CET
I have installed the system on another hard drive (ssd) and the system on this drive seems to function ok.

I went into a terminal and typed "cat /etc/X11/xorg.conf", and here is a snippet of the information displayed:


Section "Device"
    Identifier "device1"
    VendorName "Advanced Micro Devices, Inc. [AMD/ATI]"
    BoardName "ATI Radeon HD 6400 and later (radeon/fglrx)"
    Driver "amdgpu"
    Option "DPMS"
    Option "AccelMethod" "EXA"
EndSection

The other hard drive that was not booting the system is a SATA 3.5 inch hard drive. I do not know if different drives are the issue here but this is how things stand now here.
Comment 73 ian trump 2021-02-01 20:03:41 CET
Created attachment 12299 [details]
cat /etc/X11/xorg.conf

This is the complete contents of the "cat /etc/X11/xorg.conf" file
Comment 74 Thomas Andrews 2021-02-03 14:15:31 CET
Thomas, I think that Ian's problems will probably be resolved with a clean install from the RC, once the other issues are fixed and it's ready to be released. I know that the current RC test candidate works better with my system than the updated Beta 2 system I was using in earlier comments.

Ian, that's an opinion, not a promise. With hardware issues like this, we can't really know until that particular hardware is tested.

But his situation brings up a point that I hadn't considered before. What will happen to users with these cards that attempt an upgrade install from Mageia 7? Will it break their systems, or can the installer scripts take care of it for them?
Comment 75 Thomas Backlund 2021-02-03 14:26:01 CET
(In reply to ian trump from comment #72)
> I have installed the system on another hard drive (ssd) and the system
> on this drive seems to function ok.
> 

thanks, so it means your hw does work as intended with the newer driver.

> 
> The other hard drive that was not booting the system is a SATA 3.5 inch hard
> drive. I do not know if different drives are the issue here but this is how
> things stand now here.

Question is on that setup... do you have any references to radeon or amdgpu in any files in /etc/modprobe.d/

if you do, please remove it and update the initrd with the command "dracut -f"
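
A rough sketch of how that check could be done from a terminal. The function name find_gpu_modprobe_refs is hypothetical, and the directory is passed as an argument only so the function can be tried on a scratch directory first.

```shell
# Hypothetical helper: list files under a modprobe.d directory that mention
# the radeon or amdgpu modules. Defaults to /etc/modprobe.d.
find_gpu_modprobe_refs() {
    dir=${1:-/etc/modprobe.d}
    # -r: recurse, -l: print file names only, -w: whole words, -E: alternation
    grep -rlwE 'radeon|amdgpu' "$dir"
}
```

If it prints any file names, those are the files to clean up before running "dracut -f". If it prints nothing, there are no such references.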
Comment 76 Thomas Backlund 2021-02-03 14:28:24 CET
(In reply to Thomas Andrews from comment #74)
> Thomas, I think that Ian's problems will probably be resolved with a clean
> install from the RC, once the other issues are fixed and it's ready to be
> released. I know that the current RC test candidate works better with my
> system than the updated Beta 2 system I was using in earlier comments.
> 

it will work as he already tested on a different harddisk


> Ian, that's an opinion, not a promise. With hardware issues like this, we
> can't really know until that particular hardware is tested.
> 
> But his situation brings up a point that I hadn't considered before. What
> will happen to users with these cards that attempt an upgrade install from
> Mageia 7? Will it break their systems, or can the installer scripts take
> care of it for them?

there is still a bug in display_driver_helper that I hope to get resolved before final release that should take care of the driver auto-switching just as it does for switching from nvidia to nouveau and so on...
Comment 77 Thomas Andrews 2021-02-03 15:15:12 CET
Oh, good. Just trying to cover all bases as well as we can.

I still have a 32-bit M7 Plasma install on this hardware that was affected, though the only symptom there seemed to be an unresponsive sddm login. I have switched the drivers, and the problem went away. Let me know when the upgrade script is ready to test, and the best way to do it, and I will revert that install to the old driver and attempt an upgrade. I can upgrade through a CI iso or the netinstall iso (which Martin has updated to work with WPA2 wifi encryption).
Comment 78 ian trump 2021-02-03 20:47:31 CET
Although I have said my system works on the ssd drive it would seem to take about two minutes to boot now. Before it would take about thirty seconds.

My other linux system (PClinux) is suffering the same problem now because the latest updates from that system include the amdgpu and now the x server has failed to start.

I do not know how many other people are having the same problem here, but there must be others with the same hardware as myself. Now that my other system is affected too, surely this amdgpu module cannot be stable enough to be used.

I can send screenshots of the PClinux x server failure if that would help you, especially as it would appear to affect other linux systems as well now.
Comment 79 Thomas Backlund 2021-02-03 21:13:38 CET
(In reply to ian trump from comment #78)
> Although I have said my system works on the ssd drive it would seem to take
> about two minutes to boot now. Before it would take about thirty seconds.
> 
> My other linux system (PClinux) is suffering the same problem now because
> the latest updates from that system include the amdgpu and now the x server
> has failed to start.
> 
>

is that with an ssd too ?
if so you may be affected by the "random entropy generation bug"... meaning as long as you had a rotating disc it got more random data > faster boot.

you can try at boot to move the mouse or type a lot on the keyboard and see if that speeds up the boot
Comment 80 Thomas Backlund 2021-02-03 21:17:04 CET
and if so you can install the haveged package, then enable and start the haveged service

that should help generating the needed random data
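
A minimal sketch of those steps, assuming a root shell. The function names are made up for illustration. urpmi is Mageia's package installer, and /proc/sys/kernel/random/entropy_avail is where the kernel reports its entropy estimate (how to read that number depends on the kernel version).

```shell
# Hypothetical helper: report the kernel's available entropy estimate.
# The path is an argument only so the function can be tried on a test file.
entropy_avail() {
    cat "${1:-/proc/sys/kernel/random/entropy_avail}"
}

# Hypothetical helper: install haveged, then enable and start its service.
# Must be run as root.
setup_haveged() {
    urpmi haveged && systemctl enable --now haveged
}
```

Comparing the output of entropy_avail before and after setup_haveged gives a rough idea of whether the service is feeding the pool.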
Comment 81 ian trump 2021-02-03 21:43:34 CET
Mageia and PClinux reside on the same SATA 2.5-inch SSD.

The other drive is a SATA 3.5-inch spinning hard disk drive, and on it Mageia refuses to boot into the GUI. PClinux is doing the same now, even though it is on the SSD (solid state drive).

I must admit PC linux only takes between ten and twelve seconds to fully boot, and I can only presume it is that quick because they do not use systemd, but that is just a presumption. After their updates today the x server has failed, because it looks like they are now including the amdgpu driver.

Looking at the screen dialogue from the PC linux x server failure, I can see it states the amdgpu driver is only compatible with 3 xxx. From that statement, do I take it that my hardware is not compatible with it?
Comment 82 ian trump 2021-02-03 22:18:14 CET
Have installed the haveged package and enabled to start from services.

The time is still about two minutes.
Comment 83 Thomas Backlund 2021-02-03 22:23:01 CET
ok, can you provide dmesg and lspcidrake -v


dmesg >dmesg.txt

lspcidrake -v >lspcidrake.txt

and attach  them to this report
Comment 84 ian trump 2021-02-03 22:34:00 CET
Created attachment 12307 [details]
lspcidrake -v file
Comment 85 ian trump 2021-02-03 22:35:01 CET
Created attachment 12308 [details]
dmesg.txt file
Comment 86 Thomas Backlund 2021-02-03 22:38:40 CET
(In reply to ian trump from comment #85)
> Created attachment 12308 [details]
> dmesg.txt file

that's the xorg.conf
Comment 87 ian trump 2021-02-03 23:04:21 CET
Created attachment 12309 [details]
dmesg.txt
Comment 88 Thomas Backlund 2021-02-03 23:30:08 CET
ok, so we lose 90 sec here:

[    6.557150] EXT4-fs (sda7): mounted filesystem with ordered data mode. Opts: acl
[   96.099684] ACPI: \: failed to evaluate _DSM (0x1001)
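
A stall like this can also be located mechanically. The following is only a sketch, with a made-up function name, that reads dmesg output on standard input and reports the largest jump between consecutive timestamps.

```shell
# Hypothetical helper: find the biggest time gap between consecutive
# kernel log lines. Usage: dmesg | dmesg_largest_gap
dmesg_largest_gap() {
    awk '
        # Timestamped lines look like "[   96.099684] message"
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            ts = substr($0, 2, RLENGTH - 2) + 0
            if (seen && ts - prev > gap) { gap = ts - prev; line = $0 }
            prev = ts
            seen = 1
        }
        # Print the gap in seconds and the line that followed it
        END { if (line != "") printf "%.1f s before: %s\n", gap, line }
    '
}
```

It works equally well on a saved file, for example "dmesg_largest_gap < dmesg.txt".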
Comment 89 ian trump 2021-02-03 23:50:55 CET
I am interested to find out if it is a software or hardware problem.
Comment 90 ian trump 2021-02-05 01:10:19 CET
I have researched the amdgpu driver and have found out what graphic cards are supported by it.

The cards nearest to my own are the AMD Radeon™ R9 285/290/290X Graphics that are supported.
 
My card is the Radeon R9 280X. I can only presume this is why I am having problems with my system if it is trying to use this driver.

In fact my other Linux system PClinux that had the same problem yesterday where the x server also failed when booting after the updates would appear to have been fixed today because when I reinstalled the system and applied the updates the system rebooted successfully. When I went into the hardware section of the control center it was giving the Radeon 6400 driver as it would normally use. 

This is the link where I have found this information. As my card is not part of this group of cards would it not be better that my system not use the amdgpu driver.

https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-45
Comment 91 ian trump 2021-02-05 01:27:08 CET
What I meant to say was the graphic card to use would be the Radeon HD 6400 and later (radeon/fglrx) with the Xorg : ati driver.
Comment 92 Thomas Andrews 2021-02-05 02:09:43 CET
Ian, your Tahiti card and my Oland card are both part of the "Southern Islands" family of gpus, built using AMD's Graphics Core Next (GCN) 1.x architecture. As such, they have been supported by both the ati and the amdgpu drivers, with the ati driver being the usual Linux default.

But AMD has stopped supporting that older driver, while they continue to develop GCN 1.x support in the amdgpu driver. If you haven't, please read the article at 
https://www.phoronix.com/scan.php?page=news_item&px=GCN-1.0-GPU-Reset-AMDGPU

This is why we are switching the Southern Islands and Sea Islands GPUs to the amdgpu driver. We are already seeing some of those gpus stop working with the ati driver and the latest kernels. It will only get worse with the future kernels, and there will be no fixes for the older ati driver coming from upstream. 

I believe that if you install Mageia 8 (Cauldron) from the soon-to-be-released RC (it should be out in less than a week), you will find that it will work better for you than what you have tried so far.
Comment 93 Thomas Backlund 2021-02-05 08:14:51 CET
(In reply to ian trump from comment #89)
> I am interested to find out if it is a software or hardware problem.

It's at least a broken bios issue, but it would be nice to not get stuck on that for ~90 sec.

A few years ago a similar issue was worked around upstream, so I'll check if that has been broken by later updates or if there is another codepath that needs the same fixups




(In reply to ian trump from comment #90)
> I have researched the amdgpu driver and have found out what graphic cards are
> supported by it.
> 
> The cards nearest to my own are the AMD Radeon™ R9 285/290/290X Graphics
> that are supported.
>  
> My card is the Radeon R9 280X. I can only presume this is why I am having
> problems with my system if it is trying to use this driver.

Don't read too much into marketing names. There are way too many of them that get branded differently, but they might still be the same family.

 
> This is the link where I have found this information. As my card is not part
> of this group of cards would it not be better that my system not use the
> amdgpu driver.
> 
> https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-45

Yes, but this is AMD's out-of-tree driver, which usually contains different stuff compared to what is merged in the upstream linux kernel that we rely on.


Anyway, a few more things to test:

do a:

echo "options amdgpu dc=0" >/etc/modprobe.d/amdgpu.conf
dracut -f


and reboot, does that help anything ?

if not, and you want to switch over to older radeon driver:
edit /etc/X11/xorg.conf
and change:
    Driver "amdgpu"
to:
    Driver "ati"

then do:
echo "options amdgpu si_support=0" >/etc/modprobe.d/amdgpu.conf
echo "options radeon si_support=1" >/etc/modprobe.d/radeon.conf
dracut -f

and reboot.

now it should boot up with older radeon/ati combo
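
The xorg.conf edit can also be done from a rescue shell without an editor. This is only a sketch: the function name set_xorg_driver is made up, and it assumes GNU sed (for -i and -E) and that the video driver line currently names ati, radeon or amdgpu.

```shell
# Hypothetical helper: point the video Device section of an xorg.conf at a
# different driver. Usage: set_xorg_driver /etc/X11/xorg.conf ati
set_xorg_driver() {
    conf=$1
    driver=$2
    cp -p "$conf" "$conf.bak" || return 1   # keep a backup of the original
    # Only touch Driver lines that currently name an AMD driver, so input
    # device sections (keyboard, mouse) are left alone.
    sed -i -E "s/^([[:space:]]*Driver[[:space:]]+)\"(ati|radeon|amdgpu)\"/\1\"$driver\"/" "$conf"
}
```

The original file is kept as xorg.conf.bak, so a failed attempt can be undone by copying it back. The modprobe.d options and "dracut -f" are still needed as described above.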
Comment 94 Giuseppe Ghibò 2021-02-05 10:27:47 CET
Just wondering, could booting with "acpi_osi=" (https://www.kernel.org/doc/html/v5.10/firmware-guide/acpi/osi.html) be helpful for the acpi timeouts due to the (buggy?) BIOS?
Comment 95 Dave Hodgins 2021-02-06 02:18:26 CET
For ACPI: \: failed to evaluate _DSM (0x1001)
https://github.com/torvalds/linux/commit/75fc70e07314347465c7df6d6b79535cf3db0e2a
appears to have a kernel fix.

CC: (none) => davidwhodgins

Comment 96 ian trump 2021-02-07 16:16:12 CET
Have installed and tested the rc and have some good and bad news.

Good news is the system is using the Module: Card:ATI Volcanic Islands and later (amdgpu/fglrx) and it appears to be working.

The bad news is that when configuring the graphical interface during installation, I can see the Volcanic Islands module has been selected automatically and I can set the resolution, but the monitor stays on classic mode even when I select plug and play. When I go to test these parameters I am confronted with an error message telling me of a fatal server error and to try changing some parameters. I try changing the parameters but this does not help. When then greeted with the screen to accept the changes, I can see that the parameters for various hardware, including the HorizSync and VertRefresh for the monitor, are all empty.

I carry on installing the system and then when the system reboots I can enter the GUI from the login screen. I reboot the system several times to make sure it is functioning OK and it is.

Next I go to the set up of the graphical server in the MCC. When I select the classic mode for the monitor and choose plug and play, this time my correct monitor designation is selected.

I then go to test the configuration, and the screen shows vertical coloured lines and asks me if this is the correct setting, to which I reply yes.

I then went to reboot the system as I thought the graphical server was now set up, but when I entered my password and pressed login, the screen froze and the system was not able to log in. I could not even reboot from that GUI login screen, so I did a Ctrl+Alt+Delete and the screen was active again, but obviously not able to log into the system, so I rebooted and still encountered the same problem.

I reinstalled the system again several times, changing many parameters, even selecting no when asked in the MCC if the settings were correct, but still no joy.

What I can see here is probably one of many problems: when I select the test function while setting the graphical server up in the MCC and select yes to confirm the settings are correct, something changes the parameters, so when I next boot into the system the screen freezes on the login page.

There would also appear to be a problem setting the graphical server up in the installation process where I could not select plug and play for my monitor and the test function declared a fatal server error.
Comment 97 Morgan Leijström 2021-02-07 17:05:11 CET
I see this is entered at
https://wiki.mageia.org/en/Mageia_8_Errata#AMD.2FATI

CC: (none) => fri
Keywords: (none) => IN_ERRATA8

Comment 98 Giuseppe Ghibò 2021-02-07 17:17:03 CET
Why do you have to deal with VertRefresh? What kind of output are you using? Digital? Analog? DP, HDMI, DVI, VGA?
Comment 99 Thomas Backlund 2021-02-07 17:37:31 CET
(In reply to ian trump from comment #96)
> Have installed and tested the rc and have some good and bad news.
> 
> Good news is the system is using the Module: Card:ATI Volcanic Islands and
> later (amdgpu/fglrx) and it appears to be working.

Great.



> From what I can see here is probably one of many problems when I select the
> test function when setting the graphical server up in the MCC and select yes
> to confirm the setting are correct. Something has then changed the
> parameters so when I next boot into the system the screen freezes on the
> login page. 

If you skip the testing part, and only save the configuration, does it work then?

sometimes the testing part is known to not work correctly..

> 
> There would also appear to be a problem setting the graphical server up in
> the installation process where I could not select plug and play for my
> monitor and the test function declared a fatal server error.


Another question... 
Do you actually need to go and configure the display at all ?

many times the kernel and xserver autodetection should work oob without additional changes...


So is there something missing that makes you have to change configuration ?
Comment 100 ian trump 2021-02-07 19:09:49 CET
To be honest I have not had these problems in previous iterations of mageia when I have performed these procedures. I am under the impression these functions have been included in the MCC to fine tune the display configuration for the hardware.

The one thing I was changing mainly was the monitor designation, because having the correct designation would tell the system what parameters that monitor was able to achieve, considering the capabilities of every monitor vary according to what types of technology they use.

You now say the kernel and x server use autodetection and should work oob. If these functions are built into the MCC, can you explain to me under what circumstances they should be used?


For the testing part, I am not using that function now because the system is functional and able to reboot correctly. I was unaware this function is flaky and sometimes not able to work correctly.

Thank you for pointing this anomaly out to me. Had I known this I would not have used it.

I am using HDMI for the input, but I can use DP as well.
Comment 101 Thomas Backlund 2021-02-07 19:22:37 CET
(In reply to ian trump from comment #100)
> To be honest I have not had these problems in previous iterations of mageia
> when I have performed these procedures. I am under the impression these
> functions have been included in the MCC to fine tune the display
> configuration for the hardware.
> 

well, they are more in place nowadays in case auto-detection does not work and you have to override it :)

> The one thing I was changing mainly was the monitor designation because
> having the correct designation would tell the system what parameters that
> monitor was able to achieve, considering the capabilities of every monitor
> varies according to what types of technology they use.
> 

Yeah, that used to be needed with older monitors, but most displays now happily automatically provide the info about what they support, so kernel/gpu/display can select the best mode by itself, especially when connected with hdmi or displayport


> You now say the  kernel and x server is done under autodetection and should
> work oob. If these functions are built into the MCC can you explain to me
> under what circumstances they should be used.
> 

they are a "fallback solution" in case autodetection does not set up what the user wants.

> 
> For the testing part I am now not using that function now because the system
> is now functional and able to reboot correctly . I was unaware this function
> is flaky and sometimes not able to work correctly. 
> 

Yeah, it has started showing up more and more now, so we should probably think about disabling/dropping the test function...

> Thank you for pointing this anomaly out to me. Had I known this I would not
> have used it.
> 

No worries.

> I am using HDMI for the input, but I can use DP as well.

good, so oob autodetection should be good.
HDMI and DP are way better than vga implementations on providing supported connection data
Comment 102 Morgan Leijström 2021-02-07 19:52:39 CET
For the graphics tool I have experienced it myself.

I now issued
Bug 28314 - Maybe remove the graphics testing function, it sometimes hurts...
Comment 103 ian trump 2021-02-07 20:38:48 CET
If this test function can be potentially disruptive as in my case here, then I can see no viable way it can be objectively used and fit for purpose.
Comment 104 ian trump 2021-02-07 20:53:43 CET
AMD/ATI southern and sea island cards apparently have issues.

My card is recognised as a volcanic island.

I am not at all familiar with AMD/ATI card configurations.  My card has recently had issues with pre-production mageia 8 and I would like to know if my card is somehow related to the cards with the said issues.
Comment 105 Thomas Backlund 2021-02-07 20:57:40 CET
(In reply to ian trump from comment #104)
> AMD/ATI southern and sea island cards apparently have issues.
> 
> My card is recognised as a volcanic island.
> 
> I am not at all familiar with AMD/ATI card configurations.  My card has
> recently had issues with pre-production mageia 8 and I would like to know if
> my card is somehow related to the cards with the said issues.


yeah, it's a naming issue in the tools...
as we now have changed to amdgpu, by default the text should probably state:

"Southern Islands and later"

So just keep the auto-selected "volcanic islands" for now
Comment 106 Thomas Backlund 2021-02-07 20:59:34 CET
I need to review the code to see if I can do the naming change without breaking installer / drakx this late in the release process
Comment 107 Morgan Leijström 2021-02-07 21:05:19 CET
Mid air

I spawned Bug 28315 - AMD GPU naming need update
Comment 108 Thomas Backlund 2021-02-07 22:42:31 CET
I've pushed:
ldetect-lst-0.6.26-1.mga8
drakx-kbd-mouse-x11-1.34-1.mga8


to Core Updates Testing

These are updated to identify the cards as:

AMD Southern Islands and later (amdgpu)


please test.
Comment 109 Thomas Andrews 2021-02-08 01:03:18 CET
(In reply to Thomas Backlund from comment #101)
> (In reply to ian trump from comment #100)

> 
> > The one thing I was changing mainly was the monitor designation because
> > having the correct designation would tell the system what parameters that
> > monitor was able to achieve, considering the capabilities of every monitor
> > varies according to what types of technology they use.
> > 
> 
> Yeah, that used to be needed with older monitors, but most displays now
> happily automatically provide the info about what they support so
> kernel/gpu/display can select best mode by itself. especially when connected
> with hdmi or displayport
> 
> 
Six monitors under my control here, three DVI or VGA(using DVI), one VGA-only, two laptops(probably VGA). Three are wide-screen (16:9 aspect), while the other three are the older 4:3 aspect. 

I always use "Plug & Play" on each, and they "just work." Even if I have occasion to switch monitors, even if that involves changing aspect ratios, Mageia detects the change and switches to the correct settings with no intervention from me.

And that's just the way I like it.
Comment 110 Dave Hodgins 2021-02-08 02:26:18 CET
Before a lightning strike destroyed my old Mitsubishi Diamond Scan 20 crt
monitor, manual selection was required as it did not provide any edid info.

Otherwise it would use 640x480 while it supported 1280x1024 when properly
selected.
Comment 111 Thomas Andrews 2021-02-08 14:46:23 CET
Created attachment 12316 [details]
amd list in drakx11 from updated ldetect.lst

Possibly not a valid test, but I updated the requested packages using QA Repo with my HP Probook which doesn't use AMD graphics, because it was handy. I would think that even if the hardware doesn't include an AMD gpu, that drakx11 should show the expected text.

It doesn't. As you can see from the screenshot of the drakx11 window, the amdgpu choice still shows "Volcanic Islands and later."

I can't do it right now, but I will try the packages on the machine with the affected AMD card in a little while.
Comment 112 Lewis Smith 2021-02-08 16:08:45 CET
(In reply to Morgan Leijström from comment #102)
> For the graphics tool I have experienced it myself.
> I now issued
> Bug 28314 - Maybe remove the graphics testing function, it sometimes hurts...

(In reply to ian trump from comment #103)
> If this test function can be potentially disruptive as in my case here, then
> I can see no viable way it can be objectively used and fit for purpose.

Of course the video test may not work - that is normal within its context: 'might not'. It mostly does, of course. If greater wisdom deems to drop it, so be it.

Copying this comment to the new more specific bug
Comment 113 Thomas Andrews 2021-02-08 16:42:43 CET
(In reply to Thomas Andrews from comment #111)

> Possibly not a valid test, but I updated the requested packages using QA
> Repo with my HP Probook which doesn't use AMD graphics, because it was
> handy. I would think that even if the hardware doesn't include an AMD gpu,
> that drakx11 should show the expected text.
> 
> It doesn't. As you can see from the screenshot of the drakx11 window, the
> amdgpu choice still shows "Volcanic Islands and later."
> 
Ah, I see what happened, now that I've tried it with the machine with the affected card. With the Probook(Intel graphics), I scrolled to the "ATI" category, without going far enough to see there is also an "AMD" category. Sorry about that.

It works with both computers. On the AMD machine I went on through the process, and wound up with the amdgpu driver, with the chip identified as "Southern Islands and newer."
Comment 114 ian trump 2021-02-09 00:57:31 CET
I have updated and tested the new parameters tonight. In the set up for the graphical server within the MCC I can now see the "Southern Islands and newer (amdgpu)" designation under the AMD heading for the vendor.

The system would appear to function normally and reboots without any problems.

Good job. 

Thank you.
Comment 115 Thomas Backlund 2021-02-12 13:22:13 CET
ok, seems to work as intended.

Closing as fixed

Resolution: (none) => FIXED
Status: NEW => RESOLVED

Comment 116 Morgan Leijström 2021-02-12 13:29:44 CET
Marking in errata as fixed
Does this fix also work on Live with persistence after update?
Comment 117 Thomas Andrews 2021-02-18 19:08:55 CET
(In reply to Thomas Andrews from comment #77)
> 
> I still have a 32-bit M7 Plasma install on this hardware that was affected,
> though the only symptom there seemed to be an unresponsive sddm login. I
> have switched the drivers, and the problem went away. Let me know when the
> upgrade script is ready to test, and the best way to do it, and I will
> revert that install to the old driver and attempt an upgrade. I can upgrade
> through a CI iso or the netinstall iso (which Martin has updated to work
> with WPA2 wifi encryption).

Tested an upgrade with the final (hopefully) test 32-bit CI iso. It worked as I believe is expected.

During the upgrade, when I came to the post-install configuration step, I checked for what graphics driver was to be used. It said "Custom" and I backed out with no changes, as I wanted to test what would happen in an upgrade where the user didn't check.

Once the upgrade was complete, I booted to a working desktop. A check of xorg.conf revealed that the card was identified as "radeon HD 6400 and newer" but the amdgpu driver was being used. After doing a bit of exploring, with no ill graphics effects, I checked drakx11. It still said "Custom" was being used, but when I clicked on that, "Southern Islands and newer" was recommended. Once again I backed out with no changes, continuing to simulate a user who is trusting Mageia to "just work."

If it stops working, I'll open a new bug.
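
For anyone who wants to repeat the xorg.conf check above without reading the whole file, here is a minimal sketch. It assumes the file uses the conventional capitalization (Section "Device", BoardName, Driver) and the usual /etc/X11/xorg.conf location; the function name is made up for illustration and is not part of any Mageia tool.

```shell
# Print the BoardName and Driver entries from the "Device" sections of an
# xorg.conf-style file read on stdin. Driver lines in other sections
# (for example InputDevice) are skipped.
xorg_device_info() {
    awk '
        /^[ \t]*Section[ \t]+"Device"/ { in_dev = 1; next }
        /^[ \t]*EndSection/            { in_dev = 0; next }
        in_dev && ($1 == "Driver" || $1 == "BoardName") {
            key = $1
            sub(/^[ \t]*[A-Za-z]+[ \t]+/, "")   # drop the keyword
            gsub(/"/, "")                        # drop the quotes
            print key ": " $0
        }
    '
}

# Usage (path is an assumption, adjust if yours differs):
#   xorg_device_info < /etc/X11/xorg.conf
```

A mismatch between the two lines (an old board name with Driver amdgpu) is what was seen here after the upgrade, and it did not cause any problem.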
Comment 118 Morgan Leijström 2021-02-23 11:03:42 CET
For Errata, is this issue fixed well enough that we can remove the whole section https://wiki.mageia.org/en/Mageia_8_Errata#AMD.2FATI at release?

Or should it remain in case some users need a workaround? Or should it be rewritten - if so, how?
Comment 119 Thomas Andrews 2021-02-23 13:32:25 CET
No one has reported a problem here since it was declared fixed, and I haven't heard of any problems from anyone else, so I'd say it could be removed from errata.
Comment 120 Morgan Leijström 2021-02-23 13:51:25 CET
I was thinking that we may still want feedback, or want to help users in any case where it does not work; tmb wrote on Feb 5 in errata:

" ... want feedback if it works properly or not on your hardware.
If it does not work for you, you can fall back to the older driver by ... "

@Thomas B?
Comment 121 Morgan Leijström 2021-02-24 23:54:20 CET
For now leaving it in Errata, prepended with "FIXED(?)"
Comment 122 Morgan Leijström 2021-05-10 21:03:16 CEST
Actually, I have now removed FIXED.
Some systems still get hit in various ways, exemplified in errata by links to bugs.
Comment 123 gilles d 2022-01-19 09:38:56 CET
Hello,

Here is some recent feedback:

I switched to Mageia 8 last week, with a clean install from the live media, Plasma desktop, 64-bit. All updates since then have been applied.

I have a Bonaire graphics card on my old HP ZBook:
$ lspcidrake -v
Card:AMD Southern Islands and later (amdgpu): Advanced Micro Devices, Inc. [AMD/ATI]|Bonaire XT [Radeon R9 M280X] [DISPLAY_VGA] (vendor:1002 device:6646 subv:103c subd:2256)

Installation went smoothly and "everything" works so far, but the graphic card is "not supported":

$ dmesg|grep -i bonaire
[    6.110667] [drm] initializing kernel modesetting (BONAIRE 0x1002:0x6646 0x103C:0x2256 0x00).
[    6.110699] kfd kfd: amdgpu: BONAIRE  not supported in kfd

drakx11 shows the card as "AMD Southern Islands" which is correct. driver is amdgpu.

CC: (none) => gilles.duvert

Comment 124 gilles d 2022-01-19 09:54:54 CET
Sorry about last comment, my card (Bonaire) is definitely a "Sea Islands" and not a "Southern Island" and drakX11 has it wrong.
Graphic cards are just like washing machines or tvs, they just need a serial number and not those silly confusing ridiculous names.
Comment 125 gilles d 2022-01-19 10:26:19 CET
Additionally, I've tried to reboot using "radeon.cik_support=1 amdgpu.cik_support=0" in the kernel command line.

I get the blank screen initially described by comment 1.
Bonaire is supported...
$ dmesg|grep -i bonaire
[    7.592366] [drm] initializing kernel modesetting (BONAIRE 0x1002:0x6646 0x103C:0x2256 0x00).
[    7.592509] [drm] Loading bonaire Microcode

But the server fails to start, blank screen as described initially in this bug report.
Comment 126 gilles d 2022-01-19 10:46:07 CET
OK, concerning comment above, the procedure is not well described in the errata, that's all:
- kernel command line must favor old radeon:
"radeon.cik_support=1 amdgpu.cik_support=0"
- drakx11 must have been set to ATI radeon HD 4870 and earlier (at least this is the good choice for my old Bonaire)

Then everything's OK:
$ dmesg|grep -i bonaire
[    7.576670] [drm] initializing kernel modesetting (BONAIRE 0x1002:0x6646 0x103C:0x2256 0x00).
[    7.576810] [drm] Loading bonaire Microcode

$ cat /var/log/Xorg.0.log
(...)
[   144.018] (II) RADEON(0): glamor X acceleration enabled on AMD BONAIRE (DRM 2.50.0, 5.15.15-desktop-1.mga8, LLVM 11.0.1)
(...)

Still wondering why my Sea Islands card is not recognised automatically as the driver is there:

$ locate amdgpu/bonaire_ce.bin
/usr/lib/firmware/amdgpu/bonaire_ce.bin
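
To confirm after a reboot that the two switches from the procedure above really reached the kernel, something like the following could be used. It is only a sketch: the function names are invented for illustration, and it checks the command line only. It cannot tell whether drakx11 was also switched to the matching driver, which was the missing half here.

```shell
# Print the value of one kernel command-line parameter read from stdin.
# If the parameter appears more than once, the last occurrence is printed.
cmdline_param() {
    tr ' ' '\n' | awk -F= -v key="$1" '
        $1 == key { val = $2; found = 1 }
        END { if (found) print val }
    '
}

# Report which driver the cik_support switches on stdin ask for.
cik_driver_requested() {
    line=$(cat)
    r=$(printf '%s\n' "$line" | cmdline_param radeon.cik_support)
    a=$(printf '%s\n' "$line" | cmdline_param amdgpu.cik_support)
    if [ "$r" = 1 ] && [ "$a" = 0 ]; then
        echo radeon
    elif [ "$r" = 0 ] && [ "$a" = 1 ]; then
        echo amdgpu
    else
        echo unspecified   # neither combination given, kernel default applies
    fi
}

# Usage on a running system:
#   cik_driver_requested < /proc/cmdline
```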
Comment 127 Thomas Backlund 2022-01-19 12:09:39 CET
(In reply to gilles d from comment #124)
> Sorry about last comment, my card (Bonaire) is definitely a "Sea Islands"
> and not a "Southern Island" and drakX11 has it wrong.

Nope... it says Southern Islands *and later*,
and Sea Islands is a later model (released 2013) than Southern Islands (released 2012).

Unfortunately AMD has made a mess of their "GPU families", so some GPUs work and others do not, despite being in the "same family".

That's why we made sure people can switch back to legacy drivers if the improved amdgpu driver does not work...

There is no nice "one solution fits all" at this point unless we start patching the kernel and splitting up amdgpu and radeon with "family based config options" but I'd rather not as that would be a pita to maintain.
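
As a rough illustration of the family split, the sketch below maps a chip code name (as printed by the kernel in dmesg, e.g. BONAIRE) to the kernel command line switches that hand it back to the legacy radeon driver. The cik_support pair is the one used in this report; the si_support pair is the equivalent for Southern Islands as documented for the upstream radeon and amdgpu modules. The chip lists are limited to the commonly cited members of each family and the function name is hypothetical, so treat this as a lookup aid rather than a complete table.

```shell
# Print the kernel command-line switches that select the legacy radeon
# driver for a given chip code name. Returns 1 for names it does not know.
legacy_radeon_params() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        tahiti|pitcairn|verde|oland|hainan)       # Southern Islands (SI)
            echo "radeon.si_support=1 amdgpu.si_support=0" ;;
        bonaire|hawaii|kaveri|kabini|mullins)     # Sea Islands (CIK)
            echo "radeon.cik_support=1 amdgpu.cik_support=0" ;;
        *)
            echo "unknown chip: $1" >&2
            return 1 ;;
    esac
}

# Usage:
#   legacy_radeon_params BONAIRE
```

As comment 126 shows, the X driver selected in drakx11 has to be switched as well; the kernel switches alone gave a blank screen.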
Comment 128 Thomas Backlund 2022-01-19 12:23:58 CET
(In reply to gilles d from comment #123)
>
> [    6.110699] kfd kfd: amdgpu: BONAIRE  not supported in kfd
> 

And this does not matter for normal GPU usage.

AMDKFD is the AMD Kernel Fusion Driver (dating back to the days of AMD "Fusion"), which is basically the AMD HSA compute driver within the kernel.

So most users can ignore it.
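
If the kfd notice gets in the way when looking through dmesg for display problems, it can simply be filtered out. A small sketch, where the function name is invented and the matched text is taken from the log line quoted in comment 123:

```shell
# Drop the amdkfd "not supported in kfd" notices from kernel log lines
# read on stdin, leaving the display-related messages.
drop_kfd_notices() {
    grep -v 'not supported in kfd'
}

# Usage:
#   dmesg | grep -i bonaire | drop_kfd_notices
```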
Comment 129 Thomas Backlund 2022-01-19 12:29:32 CET
In this case it's probably that the ZBook internally routes the display through legacy analog VGA, something that amdgpu does not yet support... afaik there is some potential work in progress upstream but no ETA...
Comment 130 Thomas Andrews 2022-01-19 14:20:27 CET
(In reply to gilles d from comment #123)
> Installation went smoothly and "everything" works so far, but the graphic
> card is "not supported":
> 
> $ dmesg|grep -i bonaire
> [    6.110667] [drm] initializing kernel modesetting (BONAIRE 0x1002:0x6646
> 0x103C:0x2256 0x00).
> [    6.110699] kfd kfd: amdgpu: BONAIRE  not supported in kfd
> 
> drakx11 shows the card as "AMD Southern Islands" which is correct. driver is
> amdgpu.

I get the same message about the "Oland" card (Southern Islands HD 8570) that prompted this bug in the first place. I've been using it with Mageia 8 for months, mostly for testing potential updates, and it hasn't failed - yet.

That's not to say it won't fail with anything, only that I haven't run into that kind of situation.
Comment 131 Morgan Leijström 2022-01-19 14:55:17 CET
(In reply to gilles d from comment #126)
> OK, concerning comment above, the procedure is not well described in the
> errata, that's all:
> - kernel command line must favor old radeon:
> "radeon.cik_support=1 amdgpu.cik_support=0"
> - drakx11 must have been set to ATI radeon HD 4870 and earlier (at least
> this is the good choice for my old Bonaire)

Thanks for the feedback

Added a last line to the section
https://wiki.mageia.org/en/Mageia_8_Errata#AMD.2FATI

Is that OK?
Or (when) should user not select that driver?

/mga8 errata monkey
Comment 132 gilles d 2022-01-19 15:42:04 CET
(In reply to Morgan Leijström from comment #131)
> Added a last line in chapter
> https://wiki.mageia.org/en/Mageia_8_Errata#AMD.2FATI
> 
> Is that OK?
> Or (when) should user not select that driver?
First of all, thanks for your feedback!

It seems difficult to give precise and complete advice for all those cards, so I guess your change is OK.
One can always test a new setting using drakx11, in graphical and also in console mode. Perhaps the errata should remind users that, if X11 fails to start, CTRL+ALT+F2 opens a console from which drakx11 can be run and drivers "tested".
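
From that same console it can also help to check which kernel driver actually got bound to the card before and after changing the setting. A sketch, assuming the usual "lspci -k" output format with a "Kernel driver in use:" line under each device; the function name is made up for illustration:

```shell
# Read "lspci -k" output on stdin and print the kernel driver in use
# for each graphics device (VGA, display or 3D controller).
gpu_kernel_driver() {
    awk '
        # Device header lines start with the bus address (hex digits).
        /^[0-9a-f]/ {
            is_gpu = ($0 ~ /VGA compatible controller|Display controller|3D controller/)
        }
        is_gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}

# Usage:
#   lspci -k | gpu_kernel_driver
```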
Comment 133 Morgan Leijström 2022-01-19 15:54:47 CET
Ctrl+Alt+F2 tip implemented.
