Description of problem:
After updating the kernel to 5.15.4, the fans of my AMD RX570 graphics card don't rotate anymore. They do with 5.10.78.
I was in fact alerted by psensor, when I saw the fan speed was stuck at 1791rpm. I restarted with kernel 5.10.78 and the speed was around 930rpm.
Then, to make sure psensor was not reporting wrong information, I opened the side of the case to confirm the fans were spinning like crazy. But, in fact, the fans were not rotating at all!
I will attach psensor graphs + pictures of the fan.
My DE is Plasma, but I am not sure it has an impact.

Version-Release number of selected component (if applicable):
kernel-userspace-headers-5.15.4-1

How reproducible:
always

Steps to Reproduce:
1. Update kernel to latest 5.15.4
2. Log in and look at the fan

Information on my hardware:
Machine: Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
  Mobo: ASUSTeK model: TUF GAMING B550M-PLUS v: Rev X.0x serial: <superuser required>
  UEFI: American Megatrends v: 2423 date: 08/10/2021
CPU: Info: 12-Core AMD Ryzen 9 5900X [MT MCP] speed: 4529 MHz min/max: 2200/3700 MHz
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] driver: amdgpu v: kernel
  Display: x11 server: Mageia X.org 1.20.12 driver: amdgpu,v4l resolution: 2560x1440~60Hz
  OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.40.0 5.10.78-desktop-1.mga8 LLVM 11.0.1) v: 4.6 Mesa 21.2.4

I know this motherboard is impacted by a kernel bug: https://bugzilla.kernel.org/show_bug.cgi?id=204807
Maybe it could give some hints?

Grub options, per CCM: splash quiet noiswmd resume=UUID=5da5b07f-0b10-431c-9012-1562f1bb3dfb audit=0
Created attachment 13006
GPU fan speed and temperature with kernel 5.15.4
Same load on the processor. Same applications active. It looks like a speed demand is sent to the graphics card but the fans don't react?

Created attachment 13007
video of the fan not working with kernel 5.15.4
Clearly, the fan doesn't rotate.

Created attachment 13008
GPU fan speed and temperature with kernel 5.10.78
GPU fan speed (purple curve) is at its usual value of around 930rpm when not heavily loaded.

Created attachment 13009
video of the fan spinning with kernel 5.10.78
With kernel 5.10.78, under the same load, the fan is clearly spinning, as intended.
Do you see anything in the system journal that may be relevant?
FWIW I see kernel 5.15.5 in the testing repo.
Assigning to maintainers.
Assignee: bugsquad => kernel
CC: (none) => fri
Created attachment 13010
journalctl of the boot and login with kernel 5.15
Nothing strikes me but I might not be knowledgeable enough on what to look for.
Source RPM: kernel-userspace-headers-5.15.4-1.src.mga8 => kernel-desktop-latest-5.15.4-1.src.mga8
Hi, I just tested the 5.15.5 kernel from the Testing repo; same issue, sadly.
Please provide the journal from a boot with the 5.10 series too.
Created attachment 13011
5.10.78 boot journal
journalctl output while booting with 5.10.78, upon Thomas' request
Are you sure this is actually a problem?

One difference between 5.10 and 5.15 is that 5.15 does runtime power management, so technically the fans don't need to spin unless the load and temperature rise high enough.

Does the fan spin if you add amdgpu.runpm=0 on the kernel command line?

As for the referenced motherboard "issue", I will backport the support that has landed in 5.16-rc2 to the next kernel build.
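One way to double-check that such an option was actually picked up at boot is to look at the running kernel's command line. The following is only a minimal Python sketch, assuming the standard `/proc/cmdline` file; the helper name `cmdline_option` is hypothetical, and the simple whitespace split does not handle quoted values containing spaces.

```python
from pathlib import Path


def cmdline_option(cmdline: str, name: str):
    """Return the value of `name=value` in a kernel command line, or None.

    A bare flag without '=' (e.g. 'quiet') is returned as an empty string.
    If the option appears several times, this sketch keeps the last one.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


if __name__ == "__main__":
    proc = Path("/proc/cmdline")
    if proc.exists():  # only present on Linux
        print("amdgpu.runpm =", cmdline_option(proc.read_text(), "amdgpu.runpm"))
```

If the result is `None`, the option was not passed to the running kernel. When the amdgpu module is loaded, the same setting should also be readable from `/sys/module/amdgpu/parameters/runpm`.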
(In reply to Thomas Backlund from comment #10)
> are you sure this is actually a problem ?
>
> one difference between 5.10 and 5.15 is that 5.15 does runtime power
> management so technically the fans don't need to spin unless the load and
> temperature rises high enough
>
> does the fan spin if you add amdgpu.runpm=0 on kernel command line ?
>
> as for the referenced motherboard "issue", I will backport the support that
> has landed in 5.16-rc2 to next kernel build..

* amdgpu.runpm=0 doesn't change anything. The fan spins with 5.10 and doesn't with 5.15.
* Regarding the power management, what I can't explain then is why the fan speed shows ~1800rpm and nothing happens.
The psensor snapshot shows the GPU temperature as well. It starts at 35C and keeps increasing up to 40C (I took the snapshot at that time). Still no reaction from the fan. Where could I find out at which temperature the fan should really start, to determine whether this could be a power management thing?
Is it addressing the wrong fan?
(In reply to christian barranco from comment #11)
> * amdgpu.runpm=0 doesn't change anything. Fan spins with 5.10 and doesn't
> with 5.15
>

ok...

> * regarding the power management, what I don't explain then is why the fan
> speed shows ~1800rpm and nothing happens.

That might simply be the sensor not being able to read the correct value, so it shows some default value...

> The psensor snapshot shows as well the GPU temperature. It starts at 35C and
> keeps increasing till 40C (I took the snapshot at that time). Still no
> reaction from the fan; where could I find when the fan should really then
> start cranking up to find out whether it could be a power management thing?

For example, on my MSI GPU, by design the fans don't start until the GPU hits 60C, in order to provide a "silent mode / experience".
(In reply to Morgan Leijström from comment #12)
> Is it addressing the wrong fan?

(In reply to Thomas Backlund from comment #13)
> That might simply be the sensor not being able to read correct value, so it shows some default value...

It is the same sensor code between 5.10 and 5.15.
In that case, something would have changed with 5.15 in the "directory" connecting the fan to an entry.
I re-ran sensors-detect with 5.15. So, I assume the fan should be the right one, and I don't have any other proposal anyway.

> For example on my MSI GPU, by design the fans dont start until the gpu hits 60C in order to provide a "silent mode / experience"

In that case, does it mean kernel 5.10 would be overruling the hardware logic and starting the fan?
I just did another test, stressing my GPU.
Actually, the fans started to spin when the GPU temperature reached about 52C. They stopped again when the temperature went down to about 43C. I did the test twice, as you will see on the picture I will attach.
However, the reported fan speed is completely off and remains incoherent. Could it be connected to the bug you will patch?
Note: by the way, thank you so much Thomas for applying this patch at the next kernel release! :)
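The start/stop behavior observed here looks like a simple hysteresis: the fan stays off until an upper temperature is reached, then stays on until the temperature falls back to a lower one. Below is a minimal Python sketch of that logic, assuming the roughly 52C and 43C values observed in this test as thresholds. The real thresholds are set by the card and are not known here, and `fan_states` is a hypothetical helper, not driver code.

```python
def fan_states(temps_c, start_c=52.0, stop_c=43.0, running=False):
    """Yield the fan on/off state for each temperature sample.

    Assumed thresholds (taken from the observation in this report):
    the fan starts at >= start_c and stops at <= stop_c.
    Between the two, the previous state is kept (hysteresis).
    """
    for temp in temps_c:
        if not running and temp >= start_c:
            running = True
        elif running and temp <= stop_c:
            running = False
        yield running


# Temperature ramp up and back down, in degrees Celsius.
samples = [35, 40, 48, 52, 55, 47, 44, 43, 40]
print(list(fan_states(samples)))
```

Under these assumed thresholds, a fan at rest between 35C and 40C, as in the first psensor graph, would be the expected outcome.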
Created attachment 13012
fan starts at some point when the GPU temp increases
I would guess that the fan runs autonomously in case the software does not work.
Interestingly, the GPU fan rpm reading in the diagram reacts when the fan goes on/off. It may show the correct speed when the fan is on, but some fake value when the fan is off.
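To check this guess, the reported RPM could be logged next to the temperature and the PWM duty the driver reports. The following is a minimal Python sketch, assuming the amdgpu hwmon attributes `fan1_input` (RPM), `temp1_input` (millidegrees Celsius) and `pwm1` (0 to 255) are exposed under `/sys/class/drm/card0/device/hwmon/hwmon*/`. The card index `card0` and the helper name `read_gpu_hwmon` are assumptions.

```python
import glob
from pathlib import Path


def read_gpu_hwmon(hwmon_dir):
    """Read fan speed, temperature and PWM from one hwmon directory.

    Missing or unreadable attributes are returned as None.
    """
    def read_int(name):
        try:
            return int((Path(hwmon_dir) / name).read_text().strip())
        except (OSError, ValueError):
            return None

    temp_milli = read_int("temp1_input")
    return {
        "fan_rpm": read_int("fan1_input"),
        "temp_c": None if temp_milli is None else temp_milli / 1000.0,
        "pwm": read_int("pwm1"),
    }


if __name__ == "__main__":
    # card0 is an assumption; the index may differ on multi-GPU systems.
    for path in glob.glob("/sys/class/drm/card0/device/hwmon/hwmon*"):
        print(path, read_gpu_hwmon(path))
```

Sampling these values while the fan is visibly stopped would show whether a non-zero `fan_rpm` is still being reported in that state.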
(In reply to christian barranco from comment #14)
> (In reply to Morgan Leijström from comment #12)
> > Is it addressing the wrong fan?
>
> (In reply to Thomas Backlund from comment #13)
> > That might simply be the sensor not being able to read correct value, so it shows some default value...
> It is the same sensor code between 5.10 and 5.15
> In that case, something would have changed with 5.15 in the "directory"
> connecting the fan to an entry.
> I re-ran sensors-detect with 5.15. So, I assume the fan should be the right
> one and I don't have any other proposal, anyway.
>

It's probably the fact that ACPI is getting stricter, so the sensor code can't access it unless booting with "acpi_enforce_resources=lax".

> > For example on my MSI GPU, by design the fans dont start until the gpu hits 60C in order to provide a "silent mode / experience"
> In that case, does it mean kernel 5.10 would be overruling the hardware
> logic and be starting the fan?

It basically means that amdgpu in 5.10 did not fully support your hw regarding runtime pm, and in that case it runs the fans all the time, as otherwise it would fry the hw if the fans never started when really needed...
(In reply to Thomas Backlund from comment #18)
> It basically means the amdgpu in 5.10 did not fully support your hw
> regarding runtime pm, and in that case it runs the fans all the time, as
> otherwise it would fry the hw if fans would never start when really needed...

This is backed by the following information, that kernel 5.15 provides better support for several AMD products:
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.15-AMD

Having the fans run all the time on a GPU that supports switching the fans off below a defined temperature looks like poor driver/kernel support. That means that with kernel 5.15, the fan control on this graphics card now works as it should.
The control seems OK, but not the rpm readout.
(In reply to Morgan Leijström from comment #20)
> The control seem OK, but not the rpm readout.

Yes, I agree. I will monitor the release of the kernel with the patch.
Should this report be renamed « wrong sensor value read out with kernel 5.15 »?
There is now a kernel-5.15.5-2.mga8 in updates testing with the added nct6775 patches
(In reply to Thomas Backlund from comment #22)
> There is now a kernel-5.15.5-2.mga8 in updates testing with the added
> nct6775 patches

Thanks, you rock! I now have the system fan speeds and other motherboard temperatures I was missing with 5.10.x.
Unfortunately, the AMD GPU fan speed still exhibits the same awkward behavior…
Severity: major => normal
I guess this report can be closed now?
Normal behavior. The inaccurate fan speed is another story.
Resolution: (none) => WONTFIX
Status: NEW => RESOLVED