Bug 26599 - Each kernel update breaks Nvidia drivers, making the machine freeze on next boot
Status: RESOLVED OLD
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 7
Hardware: All Linux
Priority: Normal critical
Target Milestone: ---
Assignee: Kernel and Drivers maintainers
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-07 21:43 CEST by Augier
Modified: 2021-09-07 14:10 CEST
CC List: 1 user

See Also:
Source RPM: bumblebee-nvidia
CVE:
Status comment:


Attachments
Log of installation of bumblebee-nvidia (12.96 KB, text/plain)
2020-05-07 22:06 CEST, Augier

Description Augier 2020-05-07 21:43:30 CEST
Description of problem:

After **every single** kernel update, the Nvidia drivers seem to break, so that the machine can't start at all on the next boot. The whole system freezes just before displaying the login screen.

The only way to fix the problem is to boot into recovery mode and uninstall bumblebee-nvidia with its dependencies (urpme --auto-orphans) so that the machine boots normally again. This is insanely annoying (to stay polite), as it happens after each and every kernel update.

Version-Release number of selected component (if applicable):

Mageia 7. It happened just this afternoon after an update of the whole machine (kernel-desktop-5.6.6-1.mga7 to kernel-server-5.6.8-1.mga7).
Comment 1 Augier 2020-05-07 22:05:37 CEST
Addendum:

As of now, completely uninstalling Bumblebee and its dependencies and then reinstalling doesn't work on kernel-server-5.6.8-1.mga7. I have attached the installation log. I noticed compilation errors during the install:

```
make mrproper....(bad exit status: 2)
using /proc/config.gz
make oldconfig....(bad exit status: 2)
make prepare....(bad exit status: 2)
```

The only solution I have for now is to downgrade to kernel-desktop-5.6.6-1.mga7.

I would gladly use the Nvidia drivers **without** Bumblebee and Optimus (my machine is a laptop with an Nvidia GT940M) and bypass the Intel graphics chipset, but I can't find clear documentation on how to do so on Mageia...
Comment 2 Augier 2020-05-07 22:06:10 CEST
Created attachment 11621
Log of installation of bumblebee-nvidia
Comment 3 Augier 2020-05-07 22:20:20 CEST
OK, now I realize that I already ran into this error 3 years ago. I even documented it at the time: https://wiki.mageia.org/en/Bumblebee#Freeze_during_boot.

After editing the kernel options, everything seems to work fine.
Comment 4 Morgan Leijström 2020-05-08 00:44:10 CEST
Nice find, and interesting read at the link you gave there:
https://github.com/Bumblebee-Project/Bumblebee/issues/764

CC: (none) => fri

Comment 5 Lewis Smith 2020-05-08 20:55:52 CEST
Apologies for your angst.

(In reply to Augier from comment #3)
> Ok, now I realize that I already met this error 3 years ago. I even
> documented it at the time:
>  https://wiki.mageia.org/en/Bumblebee#Freeze_during_boot.
> After editing the kernel options, everything seems to work fine.
Thanks for this pointer. You originally reported this in November 2016 as Bug 19873 for Mageia 5; it was closed 2 years later because nobody had confirmed that it remained a problem. So it seems to have disappeared at some point in between. It would be theoretically possible to re-open that old bug and make this one a duplicate of it, but things have advanced a lot since then, so I am leaving this new one active.

At that time, the problem could be worked around with certain kernel options. Your earlier note cited `acpi_osi=! acpi_osi="Windows 2009"`; other possible ones are mentioned in the linked bug. Have you tried any kernel options this time?

>  from comment 0:
> After **every single** kernel update,
>  from comment 1:
> The only solution I have for now is to downgrade to
> kernel-desktop-5.6.6-1.mga7
Or choose that from the Grub2 boot menu 'advanced' line. If a new kernel introduces a problem, you are not obliged to use it or uninstall it - so long as you leave the earlier one in place.
The problem seems to have re-surfaced after a kernel update. You imply that 5.6.6-1 was/is OK; yet the only update since seems to have been 5.6.8-1. Just one, which is at odds with "every single kernel update" (unless I have misunderstood something).

Assigning this to the kernel team.

CC: (none) => lewyssmith
Assignee: bugsquad => kernel

Comment 6 Augier 2020-05-08 21:55:35 CEST
> Apologies for your angst.

No, no, it's my fault, really. I assumed it was a problem of untested update releases when it was just me being unable to read the documentation correctly and rediscovering, 3 years later, a bug I had documented myself. I should know better, as I'm a programmer myself and a former packager. So, my apologies again.

> You originally reported this in November 2016 Bug 19873 for Mageia 5, closed 2y later as unconfirmed that it remained a problem.

What happened is that using the `acpi_osi=! acpi_osi="Windows 2009"` kernel option solved the problem back then. Then I moved to Mageia 6 and didn't reinstall the Nvidia drivers (I don't really play video games much). So I didn't confirm whether the bug was still present on Mageia 6 before the ticket was closed.

It's only recently that I got back to playing video games on that machine and, thus, tried again to make the Nvidia card work. I first tried the mageia-prime package, which the Mageia 7 release notes describe as experimental.

With that tool, I faced the same bug I describe here with bumblebee-nvidia: the Nvidia drivers work correctly at first, but then break with the next kernel update.

There is one main difference from the bug of 3 years ago, though: back then, it didn't work at all without the acpi_osi kernel option, whereas now it seems to work on a fresh Mageia 7 install until the next kernel update.

Using `acpi_osi=! acpi_osi="Windows 2009"` still solves the problem with both the bumblebee-nvidia package and mageia-prime (which is what I use now).

> Have you tried any kernel options this time?

Still using `acpi_osi=! acpi_osi="Windows 2009"` and it still works. The machine is the same: Clevo N550RN i7-6700HQ HD Graphics 530 + NVidia GeForce 940MX.

Since this kernel option fixes the problem, I didn't try to reinstall kernel-desktop-5.6.6-1.mga7.

> The problem seems to have re-surfaced after a kernel update. You imply that 5.6.6-1 was/is OK; yet the only update since seems to have been 5.6.8-1. Just one, which is at odds with "every single kernel update" (unless I have misunderstood something).

Yes. That's because I tried to install the Nvidia drivers at least twice in the last few months. I first tried mageia-prime. It worked fine at first, until the following kernel update. Then I went back to recovery mode, uninstalled it completely (with dependencies, using urpme --auto-orphans), and reinstalled it at least once. Then it worked again.

After that, I don't remember clearly. There was a second kernel update and it stopped working again (same bug: freeze on boot). I don't remember whether I tried to reinstall a third time and got it working again. But eventually, I couldn't get mageia-prime to work at all.

Then I completely erased / and reinstalled Mageia 7 from scratch. That was 2 weeks ago. I installed bumblebee-nvidia just last week, and the same thing happened again: it worked correctly on kernel-desktop-5.6.6-1 and started freezing on boot after the update to kernel-server-5.6.8-1. Multiple full uninstalls and reinstalls later, I still couldn't make it work.

I posted that very angry message, completely removed bumblebee-nvidia and then rediscovered the https://wiki.mageia.org/en/Bumblebee#Freeze_during_boot section.

After that, I added the `acpi_osi=! acpi_osi="Windows 2009"` kernel option, reinstalled mageia-prime, and now it works again.
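
For anyone hitting the same freeze, it may help to confirm that the option really reached the running kernel after editing the boot configuration. Below is a minimal sketch in Python, not an official tool: the function names are made up for illustration, and it assumes the two `acpi_osi` options quoted in this report are the ones wanted and that `/proc/cmdline` shows them with ordinary shell-style quoting.

```python
import shlex

# Options quoted in this bug report (assumption: these are the ones you want).
EXPECTED_OPTIONS = ["acpi_osi=!", "acpi_osi=Windows 2009"]


def missing_options(cmdline, expected=EXPECTED_OPTIONS):
    """Return the expected options that are absent from a kernel command line."""
    # shlex.split removes the double quotes, so acpi_osi="Windows 2009"
    # becomes the single token 'acpi_osi=Windows 2009'.
    tokens = shlex.split(cmdline)
    return [opt for opt in expected if opt not in tokens]


def running_kernel_missing_options():
    """Check the command line of the currently running kernel (Linux only)."""
    with open("/proc/cmdline") as f:
        return missing_options(f.read())
```

Calling `running_kernel_missing_options()` on the affected machine returns an empty list when both options are active, and otherwise lists the ones that did not make it onto the kernel command line.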
Comment 7 Lewis Smith 2020-05-09 21:03:07 CEST
Thank you for the clarification. It looks like on a fresh install with whatever kernel, it works; the first kernel update to arrive subsequently breaks it. This suggests to me (unless I have got something wrong) that it is the kernel update process that breaks it, not the actual kernel version.

Anyhow, for the moment you are back with your own fix. Good.
Hope tmb will look at this. I am signing off, can do no more.

> I'm a programmer myself and former packager.
What about coming back? We are hard pressed.

CC: lewyssmith => (none)

Comment 8 Aurelien Oudelet 2021-07-06 13:15:57 CEST
Mageia 7 has been EOL since July 1st 2021.
There will be no further bugfixes for this release.

You are encouraged to upgrade to Mageia 8 as soon as possible.

@reporter, if this bug still applies to Mageia 8, please let us know.

@packager, if you work on the Mageia 7 version of your package, please check whether the issue is also present in the Mageia 8 package. If so, please fix the Mageia 8 version instead.

This bug report will be closed as OLD if there is no further notice by September 1st 2021.
Comment 9 Marja Van Waes 2021-09-07 14:10:07 CEST
Hi bug reporter and hi assignee and others involved,

Please reopen this bug report if it is still valid for Mageia 8 or 9(cauldron), and change "Version:" in the upper left of this report accordingly.

This report is being closed as OLD because it was filed against Mageia 7, for which support ended on June 30th 2021.

Thanks,
Marja

Status: NEW => RESOLVED
Resolution: (none) => OLD

