Created attachment 14369 [details]
Failed boot attempt

I have 3 computers with Mga9. They have different hardware, but I try to keep them in a similar state (read: perform frequent updates at similar points in time). One of them stopped booting after the last update, from 6.5.13-desktop-6.mga9 to 6.6.14-desktop-2.mga9.

There are 3 kernels currently installed on it:
6.5.11-desktop-5.mga9
6.5.13-desktop-6.mga9
6.6.14-desktop-2.mga9

6.6.14-desktop-2.mga9 does not boot (a "boiling cauldron" appears, and nothing reacts apart from Ctrl-Alt-Prtscr-B); the other two work flawlessly. After some keystroke combination I also once saw a message roughly like "failed to start systemd-modules-load.service".

The other two computers do not have this problem.

[root@pmr etc]# inxi -v1
System:
  Host: pmr Kernel: 6.5.13-desktop-6.mga9 arch: x86_64 bits: 64 Desktop: KDE Plasma v: 5.27.5 Distro: Mageia 9
CPU:
  Info: quad core Intel Core i7-4771 [MT MCP] speed (MHz): avg: 1163 min/max: 800/3900
Graphics:
  Device-1: NVIDIA GK110B [GeForce GTX 780 Ti] driver: nvidia v: 470.223.02
  Display: x11 server: X.org v: 1.21.1.8 with: Xwayland v: 22.1.9 driver: X: loaded: nvidia,v4l gpu: nvidia
  resolution: 1: 1920x1200~60Hz 2: 1200x1920~60Hz
  API: OpenGL v: 4.6.0 NVIDIA 470.223.02 renderer: NVIDIA GeForce GTX 780 Ti/PCIe/SSE2
Drives:
  Local Storage: total: 2.05 TiB used: 800.91 GiB (38.1%)
Info:
  Processes: 305 Uptime: 33m Memory: 31.29 GiB used: 3.37 GiB (10.8%) Shell: Bash inxi: 3.3.26
Thank you for creating this bug report. The main issue is that dkms-anbox makes a system with kernel 6.6 unable to boot. Errata: entered a bit further down in https://wiki.mageia.org/en/Mageia_9_Errata#Various_software and now linking that to here too. In the forums: https://forums.mageia.org/en/viewtopic.php?t=15248 https://www.mageialinux-online.org/forum/topic-31321-1-kernel-6-6-14.php
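For anyone checking their own machine, here is a minimal sketch. It assumes the affected range is "kernel 6.6 or newer with dkms-anbox installed", as described at this point in the report. The helper name kernel_is_66_or_newer is made up for illustration; the release strings are the three kernels from the description.

```shell
#!/bin/sh
# kernel_is_66_or_newer RELEASE
# Succeeds if RELEASE (in "uname -r" form, e.g. 6.6.14-desktop-2.mga9)
# is version 6.6 or later. Hypothetical helper, not part of any package.
kernel_is_66_or_newer() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 6 ]; }
}

# The three kernels listed in the description.
for release in 6.5.11-desktop-5.mga9 6.5.13-desktop-6.mga9 6.6.14-desktop-2.mga9; do
    if kernel_is_66_or_newer "$release"; then
        echo "$release: at risk if dkms-anbox is installed"
    else
        echo "$release: not in the reported range"
    fi
done
```

On a live system the argument would be "$(uname -r)", and rpm -q dkms-anbox tells whether the package is present at all.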
Whiteboard: (none) => IN_ERRATA9
Status comment: (none) => Follow up... forum post? improve errata?
Hardware: x86_64 => All
Severity: normal => major
Priority: Normal => High
CC: (none) => fri
CC: (none) => saveurlinux
Let's see if we can fix that. dkms-anbox-0.0.3-1.4.mga9.noarch.rpm is landing in updates_testing; please let us know if the kernel panic is still there. Cheers, Chris.
CC: (none) => eatdirt
(In reply to Chris Denice from comment #2) > Let's see if we can fix that. > > dkms-anbox-0.0.3-1.4.mga9.noarch.rpm > > landing in update_testings, please let us know if the kernel panic is still > there. > > Cheers, > Chris. Tested on real hardware, Mageia 9 x86_64. Before the test I reinstalled kernel-server and kernel-server-devel 6.6.14, then installed this version of dkms-anbox. Kernel desktop 6.6.24 still does not boot with this module present. I left kernel-server installed in case you need other tests.
OK, we should make it conflict with kernels >= 6.6 then. I need to have a look at our kernel versioning system, which is completely bugged these days...
(In reply to Chris Denice from comment #4) > Ok, we should conflict it with kernels >=6.6 then. Yeah, should have been done before releasing kernel 6.6... I did suggest it... Bug 32786 comment 105
(In reply to Chris Denice from comment #4) > Ok, we should conflict it with kernels >=6.6 then. > I need to have a look to our kernel versioning system, which is completely > bugged these days... Why not try to make a dkms-ashmem-only package? What benefit does the ashmem module provide? I will try whether that works and let you know
CC: (none) => j.alberto.vc
OK, I had some success. It still needs a check that refuses installation on kernels with native binder, but that is beyond my abilities
Created attachment 14397 [details]
Changes from current spec in mageia 9

Splits into 3 packages:
- dkms-anbox for the common files
- dkms-anbox-binder, plus a README.install.urpmi that warns about use on kernels with native binder (this is the part that needs the native binder check, because installing it on working systems also causes some issues)
- dkms-anbox-ashmem for the ashmem module
Created attachment 14398 [details] README.install.urpmi Warn about use with kernels with native binder
Thanks katnatek for that work, but to be honest, I think we should just conflict it. I've just done that in dkms-anbox-0.0.3-1.5.mga9.noarch.rpm, landing soon in updates_testing. You won't be able to install it with any kernels >=6.6.0-1. And we should drop it in Cauldron. It was only used for waydroid.
(In reply to Chris Denice from comment #10) > Thanks Kanatec for that work, but to be honest, I think we should just > conflict it. > > I've just done that in dkms-anbox-0.0.3-1.5.mga9.noarch.rpm landing soon in > updates_testing. You won't be able to install it with any kernels >=6.6.0-1. Let me know when it is ready to test
Testing urpmi dkms-anbox-0.0.3-1.5.mga9: DANGEROUS. It asks whether it should remove all 6.6 kernels. This system only has 6.6 kernels installed, so that means all of them... I said yes, on the assumption that urpmi had some protection, and because this must be tested. "Interesting": it built dkms-anbox for the running 6.6.17 kernel and uninstalled all kernels (I only had 6.6 ones), including the running one!! I then tried urpmi kernel-desktop-6.6.17-1, but it refused outright without asking, saying it conflicts with dkms-anbox. So urpmi sometimes simply refuses and sometimes asks. Can you adjust the package so that urpmi and drakrpm also stubbornly refuse to install dkms-anbox when a 6.6+ kernel is installed? NOT as user. That way the user has to choose to uninstall the kernels first, if he really wants anbox.
dkms-anbox should not have been installed with a kernel 6.6.17. That's a bug indeed:
Conflicts: kernel-desktop >= 6.6.0-1
Conflicts: kernel-desktop586 >= 6.6.0-1
Conflicts: kernel-server >= 6.6.0-1
Conflicts: kernel-linus >= 6.6.0-1
What is the package name of that kernel of yours? Maybe I missed one.
Running kernel was desktop-6.6.17-1.mga9 from updates_testing. A whole slew of 6.6 kernels was installed: the last four -desktop from the testing repo, and the last desktop and linus from the released updates repo. ALL 6.6 kernels got uninstalled. Yes, it asked first, and I, as QA, was playing stupid user and said yes. Ideally it should not ask to remove a kernel, just refuse, in the same way that it does not ask but simply refuses to install a 6.x kernel if dkms-anbox is installed. Besides, I think a running kernel should *never* be uninstalled, regardless of any rpm dependency fault, so this is probably an additional bug in urpmi? --- Minor correction: I now see the system had, and still has, the last 6.5 kernel too; it was not uninstalled.
CC: (none) => ghibomgx
In my opinion this is being handled backwards: the kernel spec is the one that must conflict with dkms-anbox, not dkms-anbox with the kernel. @Giuseppe, can you implement the conflict with dkms-anbox for kernels with native binder? @Chris Denice, can you remove the conflict?
Have you tried kernel-desktop-6.6.17-1.mga9? It no longer has the internal binder support, so shouldn't conflict with external dkms-anbox.
(In reply to Giuseppe Ghibò from comment #16) > Have you tried kernel-desktop-6.6.17-1.mga9? It no longer has the internal > binder support, so shouldn't conflict with external dkms-anbox. It would be interesting to see whether that works with the current waydroid in testing. I will provide feedback later
What!!!?? Giuseppe, we need to have binderfs built in from now on; all the waydroid bugs have been fixed that way. Please restore it for all kernel versions >= 6.6.0. But indeed, what would be better is not a conflict but: Obsoletes: dkms-anbox in *all kernel versions* having binderfs switched on!
I insist: it is far better to have native support than a bloody unmaintained module doing the same thing; dkms-anbox is really obsolete. In fact, we should have seen that for mga9 and dropped it a while ago already!
That way would require an Obsoletes + Provides: dkms-anbox. The only reason I disabled it is that I thought the out-of-tree dkms-anbox was newer and better supported than the one in the stock kernel, so as to allow the latest version. So, to be definitive: is this one broken/unmaintained? https://github.com/choff/anbox-modules Or is it a better version than the stock one?
Yes, the dkms-anbox we have is that one, from choff; it is a fork of the official one, which has been abandoned for a while. But we're done with that: waydroid works fine with native binderfs, and I've fixed it that way (currently in updates_testing) https://bugs.mageia.org/show_bug.cgi?id=32467 PS: Indeed, Obsoletes & Provides, but after that update to waydroid nothing will be left that Requires dkms-anbox, so an Obsoletes alone will also do. I have already dropped the package from Cauldron as well.
Oh, I did not really answer your question yet. That fork: https://github.com/choff/anbox-modules seems to be people fighting to keep anbox working when the kernels provided by the distro are not built with binderfs. And that explains why we get kernel panics with dkms-anbox when binderfs is built in: they are never confronted with that situation. I am not sure why they do that, actually.
OK, it will be re-enabled in the next build. BTW, is there a minimal quick example for starting waydroid to test it (e.g. with a prebuilt downloadable image)?
(In reply to Giuseppe Ghibò from comment #23) > Ok, will be re-enabled in the next build. BTW, is there a minimal quick > example on waydroid to start it to test (e.g. a with a prebuilt downloadable > image)? To add more info: the current waydroid in testing does not work with dkms-anbox. I rebuilt the current commit without the kernel conflicts, and it does not work even on the current kernel 6.6.14 server; binder dies as it does when you boot without psi=1.

cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.6.14-server-2.mga9 root=UUID=a0cc43c0-b94e-44c7-8ca9-0a69cb6f7053 ro splash quiet noiswmd resume=UUID=ac50cb2a-7731-479b-94f1-e90cc4f90106 audit=0 vga=791 psi=1

waydroid show-full-ui
[17:45:48] Starting waydroid session
[gbinder] Service manager /dev/binder has appeared
[gbinder] WARNING: Service manager /dev/binder has died

kernel-desktop-6.6.17-1.mga9 does not panic, but it is not a good thing for the waydroid in testing.
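Since several tests in this thread hinge on whether the kernel was booted with psi=1, here is a small sketch of how that check could be scripted. The function name has_psi is hypothetical; the sample string is the command line quoted above.

```shell
#!/bin/sh
# has_psi CMDLINE
# Succeeds if the kernel command line CMDLINE contains the exact
# parameter psi=1. Hypothetical helper for illustration only.
has_psi() {
    case " $1 " in
        *" psi=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# The command line reported in the comment above.
cmdline='BOOT_IMAGE=/boot/vmlinuz-6.6.14-server-2.mga9 root=UUID=a0cc43c0-b94e-44c7-8ca9-0a69cb6f7053 ro splash quiet noiswmd resume=UUID=ac50cb2a-7731-479b-94f1-e90cc4f90106 audit=0 vga=791 psi=1'

if has_psi "$cmdline"; then
    echo "psi=1 is set"
else
    echo "psi=1 is missing"
fi
```

On a running system the argument would be "$(cat /proc/cmdline)". Padding the string with spaces before matching keeps a value such as psi=10 from being mistaken for psi=1.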
waydroid seems to require Wayland... no Xorg? 6.6.14-desktop-2.mga9 has native binderfs, so it should work with the latest published waydroid. 6.6.14-server-2.mga9 hasn't, so it would require dkms-anbox, which has been dropped/removed.
OK, apparently running weston and then waydroid on top of it can replace the Xwayland requirement. But then it gives other errors about lxc...
(In reply to Giuseppe Ghibò from comment #25) > waydroid seems requireing wayland...no for Xorg? > Yes. As I already have Plasma, I installed plasma-workspace-wayland some time ago to start testing waydroid. > 6.6.14-desktop-2.mga9 has binderfs native, so it should work with latest > published waydroid. 6.6.14-server-2.mga9 hasn't so it would require > dkms-anbox, which has been dropped/removed. Yes it does, I tested that before in the waydroid bug ;)
(In reply to Giuseppe Ghibò from comment #26) > ok, apparently running weston, and then waydroid on top of it could replace > Xwayland requirement. But then gives other errors about lxc... Waydroid requires editing a generated configuration file, see https://bugs.mageia.org/show_bug.cgi?id=32467#c61
Giuseppe, check the procedure in bug https://bugs.mageia.org/show_bug.cgi?id=32467#c61 more carefully and, if you make it work, please validate it. This update has been waiting for ages because it is a bit hardcore for QA. Let's keep the thread here just about fixing and/or preventing the kernel panic when dkms-anbox is loaded; I think the most efficient way is Obsoletes(/Provides) in kernel packages built with binderfs on (if you agree).
(In reply to Morgan Leijström from comment #14) > > Besides, I think a running kernel should *never* be uninstalled, regardless > of any rpm dependency fault, so probably additionally bug in urpmi? > https://bugs.mageia.org/show_bug.cgi?id=31015
CC: (none) => zen25000
Ah, Thank you Berry, I thought I dreamt as I could not find it. Now added "urpme/urpmi" in summary so I find it next time...
(In reply to Chris Denice from comment #29) > Guiseppe, check more carefully the procedure in bugs > I think could add such infos in some wiki > > and, if you make it work, please validate it. This update has been waiting > for ages because it is a bit hardcore for QA. > > Let's keep the thread here about just fixing and/or preventing the > kernel-panic when dkms-anbox is loaded, and I think the most efficient way > is Obsoletes(/Provides) in kernel packages built with binderfs on (if you > agree).

OK, testing with a new build, 6.6.17-3.mga9 (currently in the build queue), I was able to start waydroid. Instead of Xwayland I ran it within 'weston' under Xorg: just run 'weston &' from the CLI, start a terminal from weston and run 'waydroid show-full-ui' from there. It started properly. Networking is a little difficult to get working: a quick workaround is to disable shorewall, though it probably requires some extra setup. Just adding 'net waydroid0 detect' to /etc/shorewall/interfaces has not helped either. One thing I noticed within the waydroid emulation is that the system .mp4 playback does not work: audio works, but video shows only the first frame. VLC works instead. What I also have not yet found is a way, once the emulated OS is started, to power off the emulation cleanly. Anyway, these are specific problems of the emulator and not related to this bug.

For the Obsoletes: instead of Obsoletes: dkms-anbox, I used Conflicts: dkms-anbox < 0.0.3-5 (also because the kernel is a multipackage and there might be kernels already installed without the built-in anbox). At this point the "Obsoletes: dkms-anbox" should be placed in mga9's task-obsolete, like you did for cauldron.

In kernel 6.6.17-3.mga9 I borrowed an extra patch to build the android binder as a kernel module (this has some advantages: security for -server, avoiding panics, etc). Of course the binder_linux.ko kernel module should be loaded before running waydroid, e.g. by adding an entry in /etc/modprobe.d/.
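As a sketch of the "load binder_linux before running waydroid" step: the comment mentions /etc/modprobe.d/, but on a systemd system the usual place to list modules to load at boot is /etc/modules-load.d/ (read by systemd-modules-load.service), while modprobe.d carries module options and aliases. The helper below assumes that mechanism; the function name and the file name waydroid-binder.conf are made up, and it writes to a scratch directory rather than to /etc.

```shell
#!/bin/sh
# write_binder_autoload DIR
# Writes a modules-load.d style file into DIR that names binder_linux.
# The file name waydroid-binder.conf is a made-up example.
write_binder_autoload() {
    printf '%s\n' \
        '# Load the in-kernel Android binder module at boot' \
        'binder_linux' > "$1/waydroid-binder.conf"
}

# Dry run in a scratch directory instead of /etc/modules-load.d.
scratch=$(mktemp -d)
write_binder_autoload "$scratch"
cat "$scratch/waydroid-binder.conf"
rm -rf "$scratch"
```

For a one-off test without rebooting, running modprobe binder_linux as root loads the module immediately.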
Ok, then I'll remove the conflicts in the dkms-anbox packages!
(In reply to Chris Denice from comment #33) > Ok, then I'll remove the conflicts in the dkms-anbox packages! Giuseppe's Conflicts is on a rel and not a subrel; you need to use %mkrel 6 and remove the subrel if the intention is that the package can be installed
Tested on real hardware, Mageia 9 x86_64. Installed kernel desktop 6.6.17-3 and the needed devel package, then installed a local build of dkms-anbox that does not conflict with the kernel. As I was running another kernel, I rebooted twice into kernel 6.6.17, once without psi=1 and once with psi=1.

lsmod|grep linux
ashmem_linux 20480 0
binder_linux 241664 102

uname -r
6.6.17-desktop-3.mga9

Picture taken with my phone because the screenshot tool does not work in Wayland: https://www.imagebam.com/view/MES4Y34
Works also without dkms-anbox

LC_ALL=C rpm -q dkms-anbox
package dkms-anbox is not installed

https://www.imagebam.com/view/MES4YMS

The drawback I see is that waydroid now needs a little adjustment to work without dkms-anbox, as Giuseppe points out in comment#32
(In reply to katnatek from comment #36) > Works also without dkms-anbox > > LC_ALL=C rpm -q dkms-anbox > package dkms-anbox is not installed > > https://www.imagebam.com/view/MES4YMS > > The drawback I see is now waydroid need a little adjust to work without > dkms-anbox as point Giuseppe in comment#32

AFAIK the same commands (and drawback) for loading the binder_linux module were required before, when the dkms-anbox package was installed. The difference with respect to dkms-anbox is also that here there is no custom ashmem_linux module; AFAIK that was not integrated into the upstream kernel.

Regarding the dkms-anbox Conflicts version, the subrel is not used. I used dkms-anbox < 0.0.3-5 as the Conflicts limit, which covers the latest dkms-anbox version we ever released in cauldron too (i.e. dkms-anbox-0.0.3-4.mga10) before retiring it as obsolete, though the latest mga9 dkms-anbox version was dkms-anbox-0.0.3-1.5.mga9. In this way we'll drop dkms-anbox-0.0.3-1.5.mga9 from the distro, but it does not preclude using future custom local dkms-anbox packages >= 0.0.3-5 (this is mostly what katnatek did in comment #35; I guess you used a custom dkms-anbox-0.0.3-6.mga9), should they include some new feature (e.g. to test) not included in stock/upstream kernels.

What remains is to add Obsoletes: dkms-anbox in mga9's task-obsolete. What is not yet clear here is whether psi=1 is mandatory for running waydroid. Last but not least, does anbox work the same as waydroid? Is it still maintained upstream and working, or was it replaced by waydroid?
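To make the effect of the Conflicts: dkms-anbox < 0.0.3-5 limit concrete, here is a rough sketch that classifies the three versions mentioned above. It relies on GNU sort -V, which only approximates RPM's own version comparison (it happens to agree for these simple strings), and the helper name version_lt is hypothetical. The distribution suffix (.mga9, .mga10) is left out on purpose.

```shell
#!/bin/sh
# version_lt A B
# Succeeds if version-release string A sorts strictly before B.
# Uses GNU "sort -V" as an approximation of RPM's comparison.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

limit=0.0.3-5
# Last mga9 build, last cauldron build, and the local test build.
for evr in 0.0.3-1.5 0.0.3-4 0.0.3-6; do
    if version_lt "$evr" "$limit"; then
        echo "dkms-anbox-$evr: below the limit, conflicts with the kernel"
    else
        echo "dkms-anbox-$evr: at or above the limit, can stay installed"
    fi
done
```

With these inputs the released builds 0.0.3-1.5 and 0.0.3-4 fall under the limit, while a local 0.0.3-6 build does not. For an authoritative answer on a real system, rpm itself should be asked rather than sort.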
Anbox is obsoleted by waydroid, psi=1 is indeed required for waydroid, and the module load should be automatic with my push of waydroid to updates_testing; but, again, guys, this should be on the waydroid bug thread :) Giuseppe, I don't think we can put dkms-anbox in task-obsoletes for mga9, because people may want to keep 6.5 kernels and could still need dkms-anbox for running waydroid. That's why, for mga9, just the Conflicts would do fine, I think.
And waydroid does not need the ashmem module anymore; that's why the native binderfs support within the kernel is enough.
Then it should be already fine as currently it is.
(In reply to Giuseppe Ghibò from comment #37) > (In reply to katnatek from comment #36) > > > Works also without dkms-anbox > > > > LC_ALL=C rpm -q dkms-anbox > > package dkms-anbox is not installed > > > > https://www.imagebam.com/view/MES4YMS > > > > The drawback I see is now waydroid need a little adjust to work without > > dkms-anbox as point Giuseppe in comment#32 > > AFAIK those were the same commands/drawback for loading the binder_linux > module required before, when the dkms-anbox package it's installed. The > difference with respect to the dkms-anbox is also that here there is not the > custom ashmem_linux module. AFAIK that was not integrated in the upstream > kernel. >

That was fixed with the current dkms-anbox, and it was not needed for the waydroid in testing without dkms-anbox on kernel desktop 6.6.14. But I think it is a minor issue to adjust the waydroid in testing to include the file required to load the native binder_linux, if we get rid of the problems of having both dkms-anbox and native binder.

> Regarding the dkms-anbox conflicts version, it's not used the subrel. I used > dkms-anbox < 0.0.3-5 as Conflicts limit, which comprises the latest > dkms-anbox version we ever released in cauldron too (i.e. > dkms-anbox-0.0.3-4.mga10) before retiring to obsolete, though the latest > mga9's dkms-anbox version was dkms-0.0.3-1.5.mga9. > > In this way we'll drop dkms-0.0.3-1.5.mga9 from distro, but it does not > preclude the possibility to use future newer custom local dkms-anbox > packages >= 0.0.3-5 (this is mostly what katnatek did in comment #35, I > guess you used a custo dkms-anbox-0.0.3-6.mga9) should they include some new > feature (e.g. to test) not included in stock/upstream kernels. >

AFAIK it includes nothing new; I used the spec from Chris Denice with the conflicts removed. Good work Giuseppe. I need to make one additional test to see if I can reproduce something that happened to me, and add it to the Errata when 6.6.17-3 arrives in updates.

(In reply to Chris Denice from comment #38) > Anbox is obsoleted by waydroid, the psi=1 is indeed required for waydroid, > and the module load should be automatic with my push of waydroid > updates_testing, but, again, guys, this should be on the waydroid bug thread > :)

I think it is needed here too, because these are tests of the issue of dkms-anbox blocking the boot of the 6.6 kernel series. My test proves that Giuseppe's work helps to fix that nasty thing, because I was able to boot kernel desktop 6.6.17-3 with dkms-anbox installed.
OK, issue not reproduced, but I will keep watching/trying to reproduce it. Giuseppe, is it fine if I open the report about the update to kernel 6.6.17, or are you still tuning it?
Thank you all, there is another emulator working to the emulators galore... For kernel, there is 6.6.17-4 in the build (minor fixes with respect to 6.6.17-3). Probably worthwhile to wait for next 6.6.18-1 (I guess during weekend) for the bug opening for update.
I'll test that, hopefully the file system mounting should trigger the module load.
(In reply to Giuseppe Ghibò from comment #43) > Thank you all, there is another emulator working to the emulators galore... > > For kernel, there is 6.6.17-4 in the build (minor fixes with respect to > 6.6.17-3). Probably worthwhile to wait for next 6.6.18-1 (I guess during > weekend) for the bug opening for update. As we also have the lxc bug, waiting a bit longer does not hurt ;)
Giuseppe, I am testing kernel 6.6.18 with waydroid; it seems to be OK. However, the device names have changed compared to 6.6.14, on which I fixed waydroid.

ls /dev/binderfs/
anbox-binder anbox-hwbinder anbox-vndbinder binder-control features/

Now they are named anbox-*; before, the anbox prefix was not there. Is this an upstream change in the kernel? Have you modified the names of these devices? Are you importing external anbox modules into our kernel? In that last case, please don't. The standard binderfs provided by the kernel works fine!
I have not changed them in the last build with respect to 6.6.17-4, which we tested, and in fact here I still see device names without the anbox* prefix.

$ uname -r
6.6.18-desktop-1.mga9
$ zcat /proc/config.gz | grep CONFIG_ANDROID_BINDER_DEVICES
CONFIG_ANDROID_BINDER_DEVICES="binder,hwbinder,vndbinder"
$ ls /dev/binderfs
binder binder-control features/ hwbinder vndbinder

Are you sure the binder modules you are seeing don't come from some other package which names the devices differently? What does modinfo binder_linux show?
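The config check above can be turned into a small filter that lists the device names the kernel was configured with, which makes it easy to compare against what appears under /dev/binderfs. This is only a sketch; the function name binder_devices is hypothetical, and the sample line is the one from the output above.

```shell
#!/bin/sh
# binder_devices
# Reads a kernel config on stdin and prints the device names from
# CONFIG_ANDROID_BINDER_DEVICES, one per line. Prints nothing if the
# option is absent. Hypothetical helper for illustration only.
binder_devices() {
    sed -n 's/^CONFIG_ANDROID_BINDER_DEVICES="\(.*\)"$/\1/p' | tr ',' '\n'
}

# Sample input taken from the config line shown above.
printf '%s\n' 'CONFIG_ANDROID_BINDER_DEVICES="binder,hwbinder,vndbinder"' | binder_devices
```

On a running kernel that exposes its configuration, the input would come from zcat /proc/config.gz instead of the printf line.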
All right, I'll investigate a bit more; I might have some funny relics, but the binder module loaded really is from the kernel, so maybe that's an upstream kernel change:

modinfo binder_linux
filename: /lib/modules/6.6.18-desktop-1.mga9/kernel/drivers/android/binder_linux.ko.xz
license: GPL v2
depends:
retpoline: Y
intree: Y
name: binder_linux
vermagic: 6.6.18-desktop-1.mga9 SMP preempt mod_unload
parm: alloc_debug_mask:uint
parm: debug_mask:uint
parm: devices:charp
Hmmm, at this point I'm not even sure modinfo shows the same path that modprobe uses (one has to "strace modprobe" to be sure which file is effectively loaded). Unless there is some other command that generates the mknod nodes under some other name.
A lot of info in strace, but it seems to be genuine. I have also checked that I have no other funny modprobe config: I grepped for anbox over /etc and /usr/lib/systemd, all good. I'll fix waydroid then; that is trivial to change.
psi=1 is there? A solution that works out of the box would be to create both sets (or use softlinks).
Yep, psi=1. It is already done with softlinks, so that would do. But since we're still in updates_testing for waydroid, let's say we target >= 6.6.18. I've pushed a new version to updates_testing with symlinks to anbox-*. We should be good.
OK, now I get /dev/binderfs/anbox-{binder,hwbinder,vndbinder} too, without any /dev/binderfs/{binder,hwbinder,vndbinder}. And there are /dev/{binder,hwbinder,vndbinder} softlinks to /dev/binderfs/anbox-{binder,hwbinder,vndbinder}. I can't remember the previous permissions, but now /dev/binderfs/anbox-*binder is world writable.
Test related to this bug
Mageia 9 x86_64

uname -a
Linux phoenix 6.6.18-desktop-1.mga9 #1 SMP PREEMPT_DYNAMIC Sat Feb 24 02:17:35 UTC 2024 x86_64 GNU/Linux

rpm -q dkms-anbox
dkms-anbox-0.0.3-6.mga9

Rebooted twice without panics
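The layout described here (plain names pointing at the anbox-* nodes) can be illustrated with a short sketch. How the waydroid package actually creates its links is not shown in this thread, so the function below is purely hypothetical and only runs against a scratch directory, never against /dev.

```shell
#!/bin/sh
# link_binder_nodes BINDERFS_DIR DEV_DIR
# For each of binder, hwbinder and vndbinder, creates a symlink
# DEV_DIR/NAME pointing at BINDERFS_DIR/anbox-NAME, if that node exists.
# Hypothetical helper; not how any package is known to do it.
link_binder_nodes() {
    for name in binder hwbinder vndbinder; do
        if [ -e "$1/anbox-$name" ]; then
            ln -sfn "$1/anbox-$name" "$2/$name"
        fi
    done
}

# Demonstration with stand-in files in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/binderfs" "$demo/dev"
touch "$demo/binderfs/anbox-binder" "$demo/binderfs/anbox-hwbinder" "$demo/binderfs/anbox-vndbinder"
link_binder_nodes "$demo/binderfs" "$demo/dev"
ls -l "$demo/dev"
rm -rf "$demo"
```

Skipping names whose anbox-* node is missing means the same function would do nothing on a kernel that still exposes the unprefixed device names.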
(In reply to Giuseppe Ghibò from comment #32) > (In reply to Chris Denice from comment #29) > > Guiseppe, check more carefully the procedure in bugs > > > > I think could add such infos in some wiki > > > > > and, if you make it work, please validate it. This update has been waiting > > for ages because it is a bit hardcore for QA. > > > > Let's keep the thread here about just fixing and/or preventing the > > kernel-panic when dkms-anbox is loaded, and I think the most efficient way > > is Obsoletes(/Provides) in kernel packages built with binderfs on (if you > > agree). > > Ok, testing with a new build 6.6.17-3.mga9 (actually in build queue), I was > able to start waydroid. Instead of Xwayland I started within 'weston' under > Xorg: just run 'weston &' from cli, start a terminal from weston and run > 'waydroid show-full-ui' from there. It started properly. A little bit > difficult to get the networking: a quick workaround is to disable shorewall, > though problably requires some extra setup. Also adding just 'net waydroid0 > detect' to /etc/shorewall/interfaces hasn't helped. > Perhaps https://wiki.archlinux.org/title/Waydroid#Network helps you; please let us know, either on a wiki page for waydroid once we create one, or in the waydroid bug
Depends on: (none) => 32923, 32922
Depends on: 32922 => (none)
Now that the 6.6.18 set of kernels is published and waydroid no longer requires dkms-anbox, this bug can be closed
Status: NEW => RESOLVED
Resolution: (none) => FIXED