Bug 32565 - Update request: nvidia-newfeature-545.29.06-2.mga9.nonfree
Summary: Update request: nvidia-newfeature-545.29.06-2.mga9.nonfree
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 9
Hardware: All Linux
Priority: High major
Target Milestone: ---
Assignee: QA Team
QA Contact:
URL:
Whiteboard: MGA9-64-OK
Keywords: advisory, validated_update
Depends on:
Blocks:
 
Reported: 2023-11-24 20:09 CET by Marja Van Waes
Modified: 2024-03-06 05:23 CET

See Also:
Source RPM:
CVE:
Status comment: RPMs in #40


Attachments
journalctl about nvidia (26.26 KB, text/plain)
2023-12-18 17:03 CET, Brian Rockwell

Description Marja Van Waes 2023-11-24 20:09:08 CET Comment hidden (obsolete)
Comment 1 Marja Van Waes 2023-11-24 20:23:52 CET
(In reply to Marja Van Waes from comment #0)

> Is this the correct link?:
> https://www.nvidia.com/Download/driverResults.aspx/214100/en-us/

I think it is, but there is a newer version available now: 545.29.06
https://www.nvidia.com/Download/driverResults.aspx/216530/en-us/
Comment 2 Giuseppe Ghibò 2023-11-24 20:25:30 CET
Yes, 545.29.06 is the newer version; better to report the highlights of both.
Comment 3 Marja Van Waes 2023-11-24 21:58:19 CET
Thx, Joeghi,

New advisory and package list:

This update provides the 545.29.06 upstream version.

RPMs:

dkms-nvidia-newfeature-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-all-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-cuda-opencl-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-devel-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-doc-html-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-lib32-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-utils-545.29.06-1.mga9.nonfree.x86_64.rpm
x11-driver-video-nvidia-newfeature-545.29.06-1.mga9.nonfree.x86_64.rpm

From SRPM:

nvidia-newfeature-545.29.06-1.mga9.nonfree

References:
https://www.nvidia.com/Download/driverResults.aspx/214100/en-us/
https://www.nvidia.com/Download/driverResults.aspx/216530/en-us/

Summary: Update request: nvidia-newfeature-545.29.02-1.mga9.nonfree => Update request: nvidia-newfeature-545.29.06-1.mga9.nonfree

Comment 4 Marja Van Waes 2023-11-24 22:07:01 CET
Advisory from comment 3 added to SVN. Please remove the "advisory" keyword if it needs to be changed. It also helps when obsolete advisories are tagged as "obsolete".

Keywords: (none) => advisory

Comment 5 Morgan Leijström 2023-11-26 21:45:35 CET
GPU= nvidia GTX750

Runs fine with kernel-desktop 6.5.11-5, but when booting kernel-linus 6.5.11-2 I just get a black screen instead of the DM login.

Tried to uninstall and reinstall; same result.

After the failed session with the linus kernel, I booted the desktop kernel and looked in the journal. There are lines with "failed" that are not in the working boot:

$ sudo journalctl -b-1 | grep failed
nov 26 20:30:13 svarten.tribun (udev-worker)[5244]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \  -f 4); do /usr/bin/test -c /dev/nvidia${i} || /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done'' failed with exit code 1.
nov 26 20:30:13 svarten.tribun (udev-worker)[5088]: nvidia: Process '/sbin/modprobe nvidia-modeset' failed with exit code 1.
nov 26 20:30:13 svarten.tribun (udev-worker)[5088]: nvidia: Process '/sbin/modprobe nvidia-drm' failed with exit code 1.
nov 26 20:30:16 svarten.tribun wireplumber[5621]: GetManagedObjects() failed: org.freedesktop.DBus.Error.NameHasNoOwner
nov 26 20:31:05 svarten.tribun xdg-desktop-portal-kde[11772]: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
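
A minimal sketch of narrowing journal output like the above down to the nvidia-related failures only. The function name `nvidia_failures` is a placeholder, and the two-word match is an assumption based on the lines shown here, not an exhaustive filter:

```shell
# Print only the journal lines that mention both "nvidia" and "failed".
# Reads journal text on stdin, for example:
#   sudo journalctl -b -1 | nvidia_failures
nvidia_failures() {
    grep -i 'nvidia' | grep -i 'failed'
}
```

On the output above this would keep the three udev-worker lines and drop the wireplumber and xdg-desktop-portal-kde lines.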


The previous newfeature version in testing worked with these kernels, tested in updates, plus some older and some experimental ones from Giuseppe.

CC: (none) => fri

Comment 6 Giuseppe Ghibò 2023-11-26 22:17:52 CET
Usually, when an nvidia module can't be loaded, it means either that it was not built correctly, or that it was built for an older or different kernel version (that shouldn't happen, but it might), or that the nouveau module is already loaded at that point and interferes with nvidia. Another possible cause is that, for some reason, the update-alternatives scripts somewhere revert the file /etc/modprobe.d/display-driver.conf (which is a link to /etc/nvidia-current|newfeature/modprobe.conf), so the modules can't be loaded.

To set that update-alternatives manually, one can run "update-alternatives --set gl_conf /etc/nvidia-newfeature/ld.so.conf" or "update-alternatives --set gl_conf /etc/nvidia-current/ld.so.conf".

Also, to test the modules, boot into a non-graphical target (runlevel 3 or 1), then locate the nvidia modules (nvidia-newfeature|current.ko.xz|gz) and run modinfo on them to check that they were built for the correct kernel version. If so, try to modprobe the nvidia module before going into the graphical target.
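
A minimal sketch of that modinfo check, assuming the kernel release is the first field of the module's "vermagic" string. The function names are placeholders and the module path in the usage comment is left as a placeholder too:

```shell
# Extract the kernel release from a "vermagic" string as printed by
# `modinfo -F vermagic <module>`; the release is assumed to be the first field.
vermagic_release() {
    printf '%s\n' "$1" | awk '{print $1}'
}

# Succeed when the module (given by its vermagic string in $1)
# was built for the kernel release in $2.
built_for_kernel() {
    [ "$(vermagic_release "$1")" = "$2" ]
}

# Usage on a real system (the module path is a placeholder):
#   built_for_kernel "$(modinfo -F vermagic /path/to/nvidia-module.ko.xz)" "$(uname -r)" \
#       && echo "module matches the running kernel"
```

If the check fails, the module was built for another kernel and a modprobe of it would be expected to fail as well.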

To further ensure that the nouveau module is not automatically preloaded, try adding nouveau.modeset=0 to the boot cmdline.

Also, an nvidia boot should have 'nokmsboot' in the kernel boot cmdline, so one might check this too.
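
A minimal sketch of checking a kernel command line for those two parameters. The function names are placeholders, and whether both parameters are strictly required on a given setup is an assumption taken from the advice in this comment:

```shell
# Succeed when the kernel command line in $1 contains the exact parameter $2.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print each of the two parameters that is missing from the command line in $1.
missing_nvidia_params() {
    for param in nokmsboot nouveau.modeset=0; do
        cmdline_has "$1" "$param" || echo "$param"
    done
}

# Usage on a running system:
#   missing_nvidia_params "$(cat /proc/cmdline)"
```

An empty result means both parameters are present for the booted kernel.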

I wish we had something much more robust and automatic, but it's not an easy task to achieve.
Comment 7 Morgan Leijström 2023-11-27 12:18:38 CET
I'll see later.

I have now attempted to install an older kernel: why does urpmi refuse to install the matching -devel- package?

$ LC_ALL=C sudo urpmi kernel-linus-devel-6.4.9-1.mga9
A requested package cannot be installed:
kernel-linus-devel-6.4.9-1.mga9.x86_64 (in order to keep kernel-linus-devel-6.5.11-2.mga9.x86_64)
Comment 8 Giuseppe Ghibò 2023-11-27 12:50:43 CET
(In reply to Morgan Leijström from comment #7)

> I'll see later.
> 
> I attempted now install elder kernel: why do urpmi refuse to install the
> suitable -devel- package?

From what I can see, that's probably a side effect of the newer multi-versioned scheme for the same RPM: urpmi finds an already installed package with the same name but a newer version, and it only works forward, not backward. An alternative is to bypass urpmi and install with something like "rpm -i ./package-6.4.9...x86_64.rpm".
Comment 9 Morgan Leijström 2023-11-27 12:58:37 CET
Confirming.

It seems urpmi should be updated if we are to keep the new scheme.

Should I open a bug?
Comment 10 Morgan Leijström 2023-11-27 13:08:42 CET
BTW, same problem for the kernel packages

$ LC_ALL=C sudo urpmi kernel-desktop-6.4.9-4.mga9
A requested package cannot be installed:
kernel-desktop-6.4.9-4.mga9.x86_64 (in order to keep kernel-desktop-6.5.11-5.mga9.x86_64)
Comment 11 Morgan Leijström 2023-11-27 15:22:11 CET
After uninstalling linus 6.5.11-2 and then installing both kernel-linus-6.4.9-1 and kernel-linus-6.5.11-2 with their respective devel packages, all done using urpm{e,i}, both of those kernels work with this 545.29.06 too.

I did not touch (nor check, sorry) anything else.

So:

1) newfeature-545.29.06-1 is OK for my system <<-------------- OK Here

2) installing a kernel with driver already installed is OK

3) then rebooting into a kernel where driver module was not built yet, makes dkms-autorebuild do its work OK.

4) There seems to be some problem with drakx11 when switching drivers for the running kernel. I do not remember now whether I was switching from modesetting or from another proprietary driver, but I think it was 535. I could open a bug, but... time for me and you... Additionally, this is a bit irritating: Bug 32571 - Graphics driver fallback switching failures
Comment 12 Morgan Leijström 2023-11-27 15:32:34 CET
We need more testers...

@Brian, i see you have nvidia 1050;
Can you test the three testing proprietary drivers?
All three should work on that GPU too.

CC: (none) => brtians1

Comment 13 Giuseppe Ghibò 2023-11-27 15:40:25 CET
(In reply to Morgan Leijström from comment #11)

> 
> 4) seem to be some problem with drakx11 when switching driver for the
> running kernel. Now i do not remember if it was switching from modesetting
> or another proprietary, but i think it was 535.  I Could open a bug, but...
> time for me and you...  Additionally, this is a bit irritating: Bug 32571 -
> Graphics driver fallback switching failures

IMHO that's a weakness that has been there for a long time. It is not related to the kernel or nvidia, but rather to more than one package, including drakx11 and dkms, or maybe all of them together. It was mitigated only recently, when we were able to get nvidia working with the 5 series of drivers, but it is not completely foolproof. It seems to arise when changing many drivers and many kernels in quick succession. It has probably been there since we stopped providing prebuilt kmod packages due to license limits (which happened around mga5 or mga6).

As general advice, the only way to make sure the problem is not due to the nvidia driver itself is to ensure that the nouveau kernel module is not loaded, that the nvidia modules are correctly built for the running kernel before the nvidia drivers are probed, and that drakx11 hasn't automatically switched update-alternatives and xorg.conf back to something else (/usr/sbin/update-alternatives --display gl_conf should show the current links).
Comment 14 Morgan Leijström 2023-11-27 16:49:29 CET
(In reply to Giuseppe Ghibò from comment #13)
> (In reply to Morgan Leijström from comment #11)
> > 4) seem to be some problem with drakx11 when switching driver

I agree this is a problem of neither the kernel nor nvidia.

Thanks for the tips, maybe I or someone else will check deeper next time.
Comment 15 Morgan Leijström 2023-11-28 01:20:56 CET
(In reply to Giuseppe Ghibò from comment #13)
> (In reply to Morgan Leijström from comment #11)
> > 
> > 4) seem to be some problem with drakx11 when switching driver for the
> > running kernel.

> Seems it arises when changing many drivers and many
> kernels very often in a row.

No, it seems very easy to trigger:

This time I had only three kernels installed, with nvidia545 in use in the running kernel. Then I used drakx11 to change to nvidia470.

Reboot: fail.

Rebooted to runlevel 3 and grabbed output from
 /usr/sbin/update-alternatives --display gl_conf :

gl_conf - status is manual.
 link currently points to /etc/nvidia470/ld.so.conf
/etc/ld.so.conf.d/GL/standard.conf - priority 500
 follower nvidia-settings.xinit: (null)
 follower display-driver.conf: (null)
/etc/nvidia470/ld.so.conf - priority 9700
 follower nvidia-settings.xinit: /etc/nvidia470/nvidia-settings.xinit
 follower display-driver.conf: /etc/nvidia470/modprobe.conf
Current `best' version is /etc/nvidia470/ld.so.conf.
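
A minimal sketch of pulling the active driver branch out of output like the above, for example to compare it with the driver that was just selected. The function names are placeholders, and deriving the branch from the directory name under /etc is an assumption based on the paths shown here:

```shell
# Print the target of the alternative link, given the output of
# `update-alternatives --display gl_conf` on stdin.
current_alternative() {
    sed -n 's/^ *link currently points to //p'
}

# Map that target to its parent directory name, e.g. nvidia470.
alternative_branch() {
    basename "$(dirname "$(current_alternative)")"
}

# Usage on a running system:
#   /usr/sbin/update-alternatives --display gl_conf | alternative_branch
```

For the output above this yields nvidia470, so the alternatives themselves were switched; whatever failed on that boot would have to be looked for elsewhere.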



Rebooting to another kernel, module got built and it works.


I think we need to show users, in the errata, how to fix it most easily when it hits. And of course fix it... in a separate bug.
Comment 16 Giuseppe Ghibò 2023-11-28 01:38:53 CET
(In reply to Morgan Leijström from comment #15)
> (In reply to Giuseppe Ghibò from comment #13)
> > (In reply to Morgan Leijström from comment #11)
> > > 
> > > 4) seem to be some problem with drakx11 when switching driver for the
> > > running kernel.
> 
> > Seems it arises when changing many drivers and many
> > kernels very often in a row.
> 
> No, it seem very easy to trig:
> 
> Now i had only three kernels installed, and nvidia545 in use in the running
> kernel. Then used drakx11 to change to nvidia470
> 
> Reboot: fail.
> 
> Rebooted to runlevel 3 and grabbed output from
>  /usr/sbin/update-alternatives --display gl_conf :
> 
> gl_conf - status is manual.
>  link currently points to /etc/nvidia470/ld.so.conf
> /etc/ld.so.conf.d/GL/standard.conf - priority 500
>  follower nvidia-settings.xinit: (null)
>  follower display-driver.conf: (null)
> /etc/nvidia470/ld.so.conf - priority 9700
>  follower nvidia-settings.xinit: /etc/nvidia470/nvidia-settings.xinit
>  follower display-driver.conf: /etc/nvidia470/modprobe.conf
> Current `best' version is /etc/nvidia470/ld.so.conf.
> 
> 
> 
> Rebooting to another kernel, module got built and it works.
> 
> 
> I think we need to present users how to most easily fix i when it hit, in
> errata.  And of course fix it... in a separate bug.

I guess the -devel packages were all previously installed.

So it fails because at next boot the dkms modules were not (yet) built at the stage where it switches to the graphical target?
Comment 17 Morgan Leijström 2023-11-28 16:32:54 CET
Continued in separate bug:

Bug 32579 - drakx11 switching nvidia driver makes boot command line wrong for running kernel, boots to black screen.
Morgan Leijström 2023-11-28 18:01:18 CET

Depends on: (none) => 32579

Comment 18 Morgan Leijström 2023-11-29 14:34:20 CET
> (In reply to Morgan Leijström from comment #7)

> > I attempted now install elder kernel: why do urpmi refuse

-> Bug 32582 - urpmi refuse to install elder kernel, drakrpm do not even show it (only -devel-)
Comment 19 Brian Rockwell 2023-12-18 00:17:55 CET
I'm trying to install nvidia newfeature, but am blocked by dependencies of the already installed packages.

I'm getting this, which prevents the other pieces from being installed.

- nvidia-newfeature-utils-545.29.06-1.mga9.nonfree.x86_64 (due to conflicts with nvidia-current-utils-535.146.02-1.mga9.nonfree.x86_64, due to conflicts with nvidia-current-utils-535.146.02-1.mga9.nonfree.x86_64)

This is one example.
Comment 20 Morgan Leijström 2023-12-18 12:12:45 CET
That is OK; you can only have one nvidia branch installed at a time:

nvidia470 / nvidia-current / nvidia-newfeature


$ LC_ALL=C sudo urpmi --test x11-driver-video-nvidia-newfeature
The following packages have to be removed for others to be upgraded:
dkms-nvidia-current-535.146.02-1.mga9.nonfree.x86_64
 (due to conflicts with dkms-nvidia-newfeature)
nvidia-current-doc-html-535.146.02-1.mga9.nonfree.x86_64
 (due to conflicts with nvidia-newfeature-doc-html)
nvidia-current-utils-535.146.02-1.mga9.nonfree.x86_64
 (due to conflicts with nvidia-newfeature-utils)
x11-driver-video-nvidia-current-535.146.02-1.mga9.nonfree.x86_64
 (due to unsatisfied nvidia-current-utils == 535.146.02-1.mga9.nonfree)
(test only, removal will not be actually done) (y/N)
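
A minimal sketch of listing just the packages that such a --test run would remove. The function name is a placeholder, and the pattern assumes the layout shown above: package names unindented and ending in an architecture suffix, reasons indented:

```shell
# Print the package names from `urpmi --test` removal output on stdin.
# Reason lines are indented and the header/footer lines do not end in an
# architecture suffix, so they are filtered out.
packages_to_remove() {
    grep -E '^[^ (].*\.(x86_64|i586|noarch)$'
}

# Usage (LC_ALL=C keeps the messages in English, as in the run above):
#   LC_ALL=C sudo urpmi --test x11-driver-video-nvidia-newfeature </dev/null | packages_to_remove
```

On the output above this lists the four nvidia-current packages.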

Switching is supposed to be automatic using drakx11, but it sometimes fails for the running kernel (bug 32579); the reason has not been found yet.
Comment 21 Morgan Leijström 2023-12-18 15:24:59 CET
Updated in bug 32579: it seems not to be a quirk in drakx11, as even

urpmi x11-driver-video-nvidia-newfeature

fails to get all details correct for the running kernel; see bug 32579 comment 5.
Comment 22 Giuseppe Ghibò 2023-12-18 16:09:53 CET
Try to run this command (as root) to see if there are multiple modules:

find /var /usr/lib/modules -type f -name '*modeset*ko*' -exec sh -c "modinfo {} | head -9; echo \"=========\"" \;
Comment 23 Brian Rockwell 2023-12-18 16:49:18 CET
MGA9-64, Gnome, Nvidia 1050GT, Ryzen 5600

Tried the basics for this and it failed to start a graphical login screen.  Able to go to terminal and switch back to nouveau.

I tried to command line install nvidia-current-utils - that failed with dependency.

Tried restarting with 6.4 kernel.  It tried to build the driver, then went to same blank screen

I think this needs some additional dependency check
Comment 24 Brian Rockwell 2023-12-18 17:03:24 CET
Created attachment 14226
journalctl about nvidia
Comment 25 Morgan Leijström 2023-12-18 17:20:10 CET
(In reply to Giuseppe Ghibò from comment #22)
> Try to run this command (as root) to see if there are multiple modules:

See bug 32579 comment 8 (and the run-up to it).

As you can see, it happens when switching to nvidia470 too,
- probably between any nvidia drivers.

---

@Brian
The install and switching between nvidia "branches" works perfectly here
(except that unknown error for the running kernel)
by installing the x11-driver, like in comment 20 but without "--test" flag and answering Y:
 urpmi x11-driver-video-nvidia-newfeature

replace newfeature with other variants to try switching to the other branches
Does that not work for you?
Comment 26 Brian Rockwell 2023-12-18 17:22:16 CET
yes - I uninstalled 545 and went back to 535 without any issues.

But to recover, a reminder for folks: you can get back to a working system by running # XFdrake (as root) and pointing it back to nouveau to get graphics back.
Comment 27 Morgan Leijström 2023-12-18 17:27:02 CET
(In reply to Brian Rockwell from comment #26)
> yes - I uninstalled 545 and went back to 535 without any issues.
> 
> But to recover, reminding folks you can get back to working by using #
> XFdrake and pointing back to Nouveau to get graphics back.

Or Xorg modesetting, which works better sometimes.

Or switch back to the previous kernel, remove the latest kernel, then install the latest kernel again, because then it seems to work.
Comment 28 katnatek 2024-01-24 01:52:46 CET
Status of this bug
Comment 29 Morgan Leijström 2024-01-24 08:34:55 CET
The nvidia-newfeature driver is not on the mga9 release media, so when we release it to updates, users wanting to use it will probably use drakx11 to switch to it.

But that sometimes fails to configure everything correctly for the running kernel (bug 32579).

So IMO that bug should be resolved first.

And then a later nvidia-newfeature should be built.


But that said, maybe we are too few testing it.
TJ, you wrote before you have a nvidia now, could you also test bug 32579?

CC: (none) => andrewsfarm

Comment 30 Thomas Andrews 2024-01-24 15:15:31 CET
I'll see what I can do, but there are limitations:

I won't risk my main production install on it. That one is happily running nvidia-current, and I need for it to remain happy. Farm business to take care of, and all that.

My secondary install on this hardware was created not long ago, a clean install using the netinstall ISO, and has but one kernel installed, 6.5.13-6, so I do not have the option to boot into another kernel if the switch fails. It too is happily running nvidia-current, and at the moment, nothing else has been used.
Comment 31 Thomas Andrews 2024-01-24 16:42:33 CET
Complete failure, trashed the system. Fortunately, the production install remains intact. See bug 32579 for details.
Comment 32 Len Lawrence 2024-01-24 21:28:45 CET
OK.  The list which works for me is copied from comment 4 on bug 32579.

dkms-nvidia-newfeature-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-cuda-toolkit-12.2.2-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-doc-html-545.29.06-1.mga9.nonfree.x86_64.rpm
nvidia-newfeature-utils-545.29.06-1.mga9.nonfree.x86_64.rpm
x11-driver-video-nvidia-newfeature-545.29.06-1.mga9.nonfree.x86_64.rpm

MageiaUpdate however cannot deal with it.  It comes up blank after qarepo.  I had to resort to forcing installation by using urpmi in the qarepo directory.
It all worked well after that, running dkms and dracut.
There was a recommendation to use drakx11 to select the X11 driver but that does not see the newfeature driver so you end up back where you started, with nvidia-current.  So the whole procedure had to be repeated.  The first stanza in urpmi.cfg points, correctly, to the local repository.

Logged out and rebooted without issues.
glmark2 shows that NVIDIA 545.29.06 is being used.

$ inxi -G
  Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nvidia v: 545.29.06
  Display: x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X:
    loaded: modesetting,nvidia,v4l gpu: nvidia resolution: 2560x1440~60Hz
  API: OpenGL v: 4.6.0 NVIDIA 545.29.06 renderer: NVIDIA GeForce GTX 1080

No problems on the Mate desktop.
I shall leave this for others to judge if it is OK.  It looks good to me.

CC: (none) => tarazed25

Comment 33 Morgan Leijström 2024-01-24 22:09:58 CET
Good you tested it :)

However, see comment 29 for why I don't think we should ship this version, and not before bug 32579 is sorted.

Additionally, as Giuseppe said, kernel 6.6.14 is due in a few days, so I think we should test all nvidia drivers with it before releasing them. Maybe mesa too, and definitely VirtualBox.
Comment 34 Len Lawrence 2024-01-25 16:19:18 CET
Additional testing:
Started again with nvidia-current and installed the newfeature driver using Morgan's list as detailed in comment 32.  Used drakx11 to select a new driver (wrong before - it does list the newfeature driver).  Everything ran smoothly, with a kernel mod rebuild at reboot.  The new driver was installed and sddm accepted the login with the desktop kernel.
$ glmark2 -b refract
=======================================================
    glmark2 2023.01
=======================================================
    OpenGL Information
    GL_VENDOR:      NVIDIA Corporation
    GL_RENDERER:    NVIDIA GeForce GTX 1080 Ti/PCIe/SSE2
    GL_VERSION:     4.6.0 NVIDIA 545.29.06
    Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
    Surface Size:   800x600 windowed
=======================================================
[refract] <default>: FPS: 5738 FrameTime: 0.174 ms
=======================================================
                                  glmark2 Score: 5737 
=======================================================

$ grep -i nvidia /etc/X11/xorg.conf
    BoardName "NVIDIA Driver: New Feature"
    Driver "nvidia"
$ sudo lsmod | grep nouveau
$ inxi -G
Graphics:
  Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nvidia v: 545.29.06
  Display: x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X:
    loaded: modesetting,nvidia,v4l gpu: nvidia resolution: 2560x1440~60Hz
  API: OpenGL v: 4.6.0 NVIDIA 545.29.06 renderer: NVIDIA GeForce GTX 1080
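
A minimal sketch of automating the "is the expected driver really in use" check from output like the above. The function name is a placeholder, and the pattern assumes the "driver: nvidia v: <version>" wording that inxi -G prints in this report:

```shell
# Print the nvidia driver version found in `inxi -G` output on stdin,
# or nothing if the nvidia driver is not reported.
nvidia_driver_version() {
    sed -n 's/.*driver: nvidia v: \([0-9.]*\).*/\1/p' | head -n 1
}

# Usage on a running system:
#   [ "$(inxi -G | nvidia_driver_version)" = "545.29.06" ] && echo "newfeature driver in use"
```

An empty result would suggest that another driver, such as nouveau or modesetting alone, is driving the card.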

Tried the linus kernel with good results.
Switched to the 6.5.13-server-6.mga9 kernel, saw the newfeature driver being built and installed.  Straight to the desktop; glmark2 => 5803.  Desktop running normally.

This bug has been processed in parallel with bug 32579.  It seems to validate the driver for this particular machine, if all the steps for 32579 are correct.
Comment 35 Morgan Leijström 2024-01-25 17:51:42 CET
(In reply to Len Lawrence from comment #34)
> Started again with nvidia-current and installed the newfeature driver using
> Morgan's list as detailed in comment 32.

For a complete test incl drakx11 - Provided nonfree updates_testing is enabled, you should not need to do anything else than directly go to drakx11 and select the other driver, and it should do all its magic.
Comment 36 Thomas Andrews 2024-01-25 18:10:28 CET
After several tests, it is my belief that the situation that triggers bug 32579 would be eliminated for most users if we just push this bug. See bug 32579 comment 20. The users that might still be affected would be those who do not have the nonfree repos enabled. 

@Morgan: drakx11 will "do its magic" on the new feature driver, but in order for it to do so the new feature rpms have to be available in an enabled repo.
Comment 37 Giuseppe Ghibò 2024-01-25 18:48:23 CET
Apparently it sometimes ends up unconfigured during the swap newfeature <-> current, with higher chances when the machine is faster.

I have an idea; see the upcoming nvidia-newfeature-545.29.06-2.mga9.
Comment 38 Marja Van Waes 2024-01-25 21:25:12 CET
Advisory in SVN updated for nvidia-newfeature-545.29.06-2.mga9.nonfree

Added "Additionally, it avoids using specific (sub)module names" to the description, because that was in the log message when the package was pushed to testing.
Marja Van Waes 2024-01-25 21:26:32 CET

Summary: Update request: nvidia-newfeature-545.29.06-1.mga9.nonfree => Update request: nvidia-newfeature-545.29.06-2.mga9.nonfree

Comment 39 Thomas Andrews 2024-01-25 21:40:41 CET
Changing the priority to high.

Severity: normal => critical
Priority: Normal => High

Comment 40 Marja Van Waes 2024-01-25 21:49:28 CET
New list of RPMs:

dkms-nvidia-newfeature-545.29.06-2.mga9.nonfree.x86_64.rpm
nvidia-newfeature-all-545.29.06-2.mga9.nonfree.x86_64.rpm
nvidia-newfeature-cuda-opencl-545.29.06-2.mga9.nonfree.x86_64.rpm
nvidia-newfeature-devel-545.29.06-2.mga9.nonfree.x86_64.rpm
nvidia-newfeature-doc-html-545.29.06-2.mga9.nonfree.x86_64.rpm
nvidia-newfeature-lib32-545.29.06-2.mga9.nonfree.x86_64.rpm
nvidia-newfeature-utils-545.29.06-2.mga9.nonfree.x86_64.rpm
x11-driver-video-nvidia-newfeature-545.29.06-2.mga9.nonfree.x86_64.rpm
Marja Van Waes 2024-01-25 21:50:14 CET

Status comment: (none) => RPMs in comment 40

Comment 41 Len Lawrence 2024-01-26 02:57:25 CET
Installed these without issue.
Selected 'newfeature' with drakx11.
Rebooted to the server, desktop and linus kernels: autorebuild kicked in each time to provide the driver module.  Each flavour was tested in Mate but linus also in Plasma.
Put newfeature through its paces on each login, concentrating on graphical applications like glmark2, vlc and Firefox.  Falkon also worked.
In gwenview the thumbnails for a large collection of scenic jpegs of 4 megapixels or more were generated very quickly.  The same applied to ristretto.
In Plasma glmark2 returned a score of 31999 on the full set, close to twice the maximum seen with previous drivers so the impression given is that newfeature is much faster in general.
VirtualBox performed well.  Launched two Mageia guests on the same window, 32-bit and 64-bit.  Both responded normally.  Resizing by dragging a corner stretches the virtual desktop within a second.
Back on the real desktop - watched TV using vlc.
Rebooted to the desktop kernel and left the system running Mate to see if any graphics issues crop up.
So far so good.
Comment 42 Len Lawrence 2024-01-26 03:01:48 CET
Umm.  Addendum to comment 41:
inxi -b output.

Kernel: 6.5.13-2.mga9 arch: x86_64
10-core Intel Core i9-7900X
NVIDIA GP102 [GeForce GTX 1080 Ti] driver: nvidia v: 545.29.06
x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X:
loaded: modesetting,nvidia,v4l gpu: nvidia resolution: 2560x1440~60Hz
Comment 43 Thomas Andrews 2024-01-26 17:36:18 CET
Asus Prime Q270M-C motherboard, 48GB RAM, i5-7500, Quadro K620 graphics, Acer 1920x1080 monitor using the DVI cable, MGA9-64 Plasma. Not as powerful as Len's system, but perfectly functional. (I like it, anyway.)

Tested with the 6.5.13-6 desktop and server kernels.

My test install was all ready for trying the nvidia-current driver update, so I did that one first and ran glmark2 to get a base line for each kernel. Scores were 4494 for server, and 4630 for desktop. I don't have much experience with glmark2 performance on this hardware, but it looks respectable to me.

I used qarepo to download the rpms, then used drakx11 to switch to the new feature driver from within the desktop kernel. There were no installation issues, and reboots into both desktop and server kernel were normal. Glmark2 scores for each were 5175 for server, and 5206 for desktop. An improvement over nvidia-current, but not as dramatic as Len saw. Probably to be expected, given the differences in hardware.
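
A minimal sketch of putting a number on that improvement, using the glmark2 scores quoted above. The function name `pct_gain` is a placeholder:

```shell
# Print the relative gain from score $1 to score $2 as a percentage
# with one decimal place.
pct_gain() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) * 100 / old }'
}

# Scores from this comment:
#   pct_gain 4494 5175   # server kernel:  nvidia-current -> newfeature
#   pct_gain 4630 5206   # desktop kernel: nvidia-current -> newfeature
```

With these scores the gain works out to roughly 15.2% on the server kernel and 12.4% on the desktop kernel. These are single runs, so small differences should not be over-interpreted.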

I also tried Firefox, as well as playing a video in vlc, with no issues. I do not as yet have VirtualBox installed on this system.

Looks OK on this hardware.
Comment 44 Morgan Leijström 2024-01-26 20:41:47 CET
Testing now with kernel-linus-6.5.13-2.mga9.x86_64, and mesa and X11 testing updates.

* OK *
Switched to it from nvidia-current using only drakx11, booted same kernel.
Plasma X11, various desktop apps, video, MSW7 guest in VirtualBox 7.0.14.


* Fail *
BOINC usually detects CUDA and OpenCL after manually installing the needed package, in this case
 nvidia-newfeature-cuda-opencl-545.29.06-2.mga9.nonfree.x86_64
and rebooting, but that does not work here.
It works for nvidia-current 535.154.05 using the latest desktop and linus kernels.


$ rpm -qa | grep nvidia | sort
dkms-nvidia-newfeature-545.29.06-2.mga9.nonfree
lib64nvidia-egl-wayland1-1.1.11-1.mga9
nvidia-newfeature-cuda-opencl-545.29.06-2.mga9.nonfree
nvidia-newfeature-doc-html-545.29.06-2.mga9.nonfree
nvidia-newfeature-utils-545.29.06-2.mga9.nonfree
x11-driver-video-nvidia-newfeature-545.29.06-2.mga9.nonfree
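
A minimal sketch of confirming from such a package list that only one nvidia branch is installed. The function name is a placeholder, and the three branch names are taken from this report (nvidia470, nvidia-current, nvidia-newfeature); matching them as substrings of package names is an assumption:

```shell
# Print the distinct nvidia driver branches found in a package list on stdin.
installed_branches() {
    grep -oE 'nvidia-(newfeature|current)|nvidia470' | sort -u
}

# Usage on a running system:
#   rpm -qa | installed_branches
```

For the list above this prints a single line, nvidia-newfeature. More than one line would point to a mix of branches, which the package conflicts are meant to prevent.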

$ inxi -G
Graphics:
  Device-1: NVIDIA GM107 [GeForce GTX 750] driver: nvidia v: 545.29.06
  Display: x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X:
    loaded: modesetting,nvidia,v4l gpu: nvidia resolution: 3840x2160~60Hz
  API: OpenGL v: 4.6.0 NVIDIA 545.29.06 renderer: NVIDIA GeForce GTX
    750/PCIe/SSE2

Keywords: (none) => feedback

Comment 45 Morgan Leijström 2024-01-29 13:32:48 CET
@Giuseppe, does this need a CUDA update?

newfeature 545 fails, but current 535 (and IIRC 470) works

Status comment: RPMs in comment 40 => RPMs in #40, CUDA problem in #44

Comment 46 Morgan Leijström 2024-01-29 13:36:01 CET
As we have not before shipped any nvidia newfeature with mga9,
this can not be critical.

But very good to provide to our users for new GPUs of course.

Severity: critical => major

Comment 47 Giuseppe Ghibò 2024-01-29 14:00:47 CET
As for CUDA compatibility, in theory it shouldn't, as the "minimum required compat version" for CUDA 12.3, according to this table:

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-toolkit-major-component-versions

is 525.60. That minimum is intended for the basic components. On the other hand, CUDA 12.3 ships with 545.23.08, so in theory it might require some components of the upcoming 12.3.
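
A minimal sketch of the version comparison behind that reasoning, using only the version numbers mentioned in this report. The function name `version_ge` is a placeholder, and the comparison relies on the version sort (-V) of GNU sort. It only compares numbers; it does not prove that a driver and a CUDA release actually work together:

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Both packaged drivers are at or above the stated minimum of 525.60:
#   version_ge 545.29.06 525.60  && echo "newfeature meets the minimum"
#   version_ge 535.154.05 525.60 && echo "current meets the minimum"
```

By the same comparison, 545.29.06 is also at or above the 545.23.08 that CUDA 12.3 is said to ship with.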

You might check with cuda-z as well as with any of the nvidia-cuda-toolkit-samples-bins. Once nvidia-cuda-toolkit-samples-bins is installed, in:

/usr/share/nvidia-cuda-toolkit/samples/bin/x86_64/linux/release

there are hundreds of samples that run with CUDA (note that some of them might fail on their own because they require a writable workdir).
Comment 48 Giuseppe Ghibò 2024-01-29 14:16:55 CET
btw, cuda-12.3 RPM is not yet ready however
Comment 49 Morgan Leijström 2024-01-29 15:22:59 CET
cuda-z says it is working.

Sidenote: it required nvidia-cuda-toolkit, in total 1.6 GB. Monster.
nvidia-cuda-toolkit-samples and nvidia-cuda-toolkit-samples-bins are both also gigabyte+ monsters, no space...

BOINC still cannot find CUDA; I am pointing my finger at BOINC, bug report later.

Status comment: RPMs in #40, CUDA problem in #44 => RPMs in #40

Comment 50 Morgan Leijström 2024-01-29 20:26:38 CET
Opened: Bug 32788 - BOINC do not find GPU using nvidia-newfeature-cuda-opencl-545.29.06-2, works for 535
Comment 51 Morgan Leijström 2024-03-05 14:43:15 CET
I am closing this because the nvidia-current now in testing has a higher version number.

Bug 32933 - Update request: nvidia-current-550.54.14-2.mga9.nonfree

Resolution: (none) => OLD
Status: NEW => RESOLVED

Comment 52 Thomas Andrews 2024-03-05 15:51:12 CET
This should not be closed, it should be pushed. It does work for many, if not most, and not having it available to our users triggers a serious bug.

Under present conditions, there are no nvidia-newfeature driver rpms outside of the testing repos, yet drakx11 offers it as an option. If a user innocently decides to try out the newfeature driver, and there is no driver in the repos, drakx11 is not triggered to notify that user that a proprietary driver is available. I don't know what drakx11 does if the user proceeds, but I do know that the result is a system that is unusable. 

I know this because it happened to me. See bug 32579 comment 13.

That is a bug with drakx11, and it should be fixed, but there is no sense in us deliberately setting it up to be triggered when we can avoid it.

I am re-opening this bug, giving it an OK, and validating it, because *not* pushing it continues a much worse situation than pushing it.

Status: RESOLVED => REOPENED
Resolution: OLD => (none)

Thomas Andrews 2024-03-05 15:53:38 CET

Whiteboard: (none) => MGA9-64-OK
Keywords: feedback => validated_update
Depends on: 32579 => (none)
CC: (none) => sysadmin-bugs

Comment 53 Morgan Leijström 2024-03-05 16:28:52 CET
(In reply to Thomas Andrews from comment #52)
> This should not be closed, it should be pushed. It does work for many, if
> not most, and not having it available to our users triggers a serious bug.

Ah, I forgot about that one.

> Under present conditions, there are no nvidia-newfeature driver rpms outside
> of the testing repos, yet drakx11 offers it as an option. If a user
> innocently decides to try out the newfeature driver, and there is no driver
> in the repos, drakx11 is not triggered to notify that user that a
> proprietary driver is available. I don't know what drakx11 does if the user
> proceeds, but I do know that the result is a system that is unusable. 
> 
> I know this because it happened to me. See bug 32579 comment 13.
> 
> That is a bug with drakx11, and it should be fixed.

Right. 

Added it to Bug 32352 - drakx11 do not for nvidia check whether kernel-devel is installed, nor if nvidia module really got built 
- at Bug 32352 Comment 7


> but there is no sense in
> us deliberately setting it up to be triggered when we can avoid it.
> 
> I am re-opening this bug, giving it an OK, and validating it, because *not*
> pushing it continues a much worse situation than pushing it.

OK.

We should open another bug for getting a newer newfeature version higher than nvidia-current in testing.
Comment 54 katnatek 2024-03-06 00:09:14 CET
Should I add an advisory?
Comment 55 Morgan Leijström 2024-03-06 00:15:36 CET
I think it is needed in order for the update to be pushed.

For Mageia 9, nvidia-newfeature is a new package (not an update).

It might be nice to say that.
Comment 56 katnatek 2024-03-06 00:24:15 CET
Oh, Marja did it some time ago.
Comment 57 Morgan Leijström 2024-03-06 00:26:03 CET
Ah, yes, I too see now :)

And BTW thank you for taking on working with advisories! :)
Comment 58 Mageia Robot 2024-03-06 05:23:43 CET
An update for this issue has been pushed to the Mageia Updates repository.

https://advisories.mageia.org/MGAA-2024-0097.html

Resolution: (none) => FIXED
Status: REOPENED => RESOLVED

