Bug 25072 - urpmi trap divide / general protection errors in libdb-5.3.so during online upgrade mga6 -> 7
Summary: urpmi trap divide / general protection errors in libdb-5.3.so during online ...
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: High major
Target Milestone: Mageia 9
Assignee: Mageia tools maintainers
QA Contact:
URL:
Whiteboard: MGA8TOO
Keywords: IN_RELEASENOTES8
Duplicates: 25531
Depends on:
Blocks:
 
Reported: 2019-07-08 11:42 CEST by papoteur
Modified: 2022-12-13 18:25 CET
14 users (show)

See Also:
Source RPM: urpmi, dnf
CVE:
Status comment: implement proper locking between urpmi and dnf


Attachments
Journal during the upgrade (226.90 KB, application/gzip)
2019-07-13 20:28 CEST, papoteur

Description papoteur 2019-07-08 11:42:08 CEST
Description of problem:
My upgrade from M6 to M7 has been very difficult.
It failed more than once, with:
- a core dump. I had to rebuild the rpm database to get past this step.
- no more space available. I had to uninstall some packages or run urpmi --clean.
- installation failed: some package is necessary to (already installed) ...
I removed these packages with urpme:
lib64tcb0-1.1-7.mga6.x86_64
python3-qt5-webenginecore
python3-3.5.7
perl-5.22.3
In each case, the mga6 and mga7 versions were installed side by side.

How reproducible:
urpmi.removemedia -a
urpmi.addmedia --distrib http://ftp.free.fr/mirrors/mageia.org/distrib/7/x86_64/
urpmi --replacefiles --auto-update --auto
...
Installation failed:    python3-qt5-core = 5.6 is needed by (installed) python3-qt5-webenginecore-5.6-8.mga6.x86_64
...
Installation failed:    lib64python3.5 = 3.5.7-1.mga6 is needed by (installed) python3-3.5.7-1.mga6.x86_64
...

Installation failed:    libgdbm.so.4()(64bit) is needed by (installed) perl-2:5.22.3-3.2.mga6.x86_64
Comment 1 Jani Välimaa 2019-07-09 10:02:19 CEST
There's no need to urpme old pkgs by hand, just answer yes when you're informed that some pkgs must be removed to be able to update others.

It's also better to remove --replacefiles option so file conflicts are revealed to user.
Comment 2 papoteur 2019-07-09 12:36:56 CEST
(In reply to Jani Välimaa from comment #1)
> There's no need to urpme old pkgs by hand, just answer yes when you're
> informed that some pkgs must be removed to be able to update others.
There were no questions, as the installation FAILED ("l'installation a échoué").
> 
> It's also better to remove --replacefiles option so file conflicts are
> revealed to user.
Do we have to change the instructions in the release notes[1]?

[1] https://wiki.mageia.org/en/Mageia_7_Release_Notes#Upgrading_online.2C_using_urpmi_.28CLI.29
Comment 3 Jani Välimaa 2019-07-09 13:51:11 CEST
Also either remove --auto or use --force to answer yes to all questions.

CC: (none) => jani.valimaa

Comment 4 Marja Van Waes 2019-07-13 10:33:33 CEST
Hi Yves,

Please attach log.txt that is the result of running, as root:

   journalctl -a --since="YYYY-MM-DD hh:mm" --until="YYYY-MM-DD hh:mm" > log.txt

and adjust the --since time to right before you started to upgrade and the --until time to shortly after you finished upgrading and working around the issues.
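
Once log.txt exists, the crash lines of interest can be pulled out of it with a small filter. This is only a sketch: the function name libdb_traps is made up for illustration, and the pattern assumes the kernel logs the fault as "traps: ... in libdb-<version>.so", as seen in the journal excerpts in this report.

```sh
# Print kernel trap lines that point into libdb from a saved journal (stdin).
# The pattern is an assumption based on the trap lines quoted in this report.
libdb_traps() {
    grep -E 'kernel: traps: .* in libdb-[0-9.]+\.so'
}
```

Typical use would be: libdb_traps < log.txt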

Thanks :-)

Keywords: (none) => NEEDINFO
CC: (none) => mageiatools, marja11, pkg-bugs

Comment 5 papoteur 2019-07-13 20:28:19 CEST
Created attachment 11180 [details]
Journal during the upgrade

Hi Marja,
Here is the log.txt
Comment 6 papoteur 2019-07-13 20:51:21 CEST
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase libreoffice-langpack-fr-1:6.1.5.2-1.2.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local kernel: traps: urpmi[3022] trap divide error ip:7feb27e6b876 sp:7fff9be23e50 error:0 in libdb-5.3.so (deleted)[7feb27d97000+1b1000]
juil. 07 19:50:51 YZenbook.local dnf[5596]: Metadata cache refreshed recently.
juil. 07 19:50:51 YZenbook.local systemd[1]: dnf-makecache.service: Succeeded.
juil. 07 19:50:51 YZenbook.local systemd[1]: Started dnf makecache.
juil. 07 19:53:05 YZenbook.local urpmi[5600]: called with: --replacefiles --auto-update --auto
juil. 07 19:53:07 YZenbook.local kernel: traps: urpmi[5600] general protection ip:7f2c0610e241 sp:7ffddfe96510 error:0 in libdb-5.3.so[7f2c05fb3000+1c2000]

The first trap is the first crash of urpmi.
The second happened when I tried to relaunch it.
Trying dnf crashed too.
After that, I deleted the rpm database and rebuilt it.
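
For reference, a cautious way to do such a rebuild is sketched below. It is an assumption, not a record of the exact steps used here: it copies the database directory aside first, removes only the Berkeley DB region files (__db.*) and then runs rpm --rebuilddb. The function name rebuild_rpmdb and the RPM override variable are made up for illustration.

```sh
# RPM can be overridden (for example with "echo") to preview the call.
RPM=${RPM:-rpm}

# Copy DBDIR aside, drop the Berkeley DB region files, rebuild the indexes.
# DBDIR defaults to /var/lib/rpm; run as root on a real system.
rebuild_rpmdb() {
    dbdir=${1:-/var/lib/rpm}
    backup="$dbdir.bak.$(date +%Y%m%d%H%M%S)"
    cp -a "$dbdir" "$backup" || return 1
    rm -f "$dbdir"/__db.*
    $RPM --rebuilddb --dbpath "$dbdir"
}
```

Calling rebuild_rpmdb with no argument targets /var/lib/rpm; the backup copy lands next to it as /var/lib/rpm.bak.<timestamp>.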
Morgan Leijström 2019-07-14 04:38:26 CEST

Source RPM: (none) => urpmi
Assignee: bugsquad => mageiatools
CC: (none) => fri
Summary: Upgrading by changing sources fails because some packages stays in both mga6 and mga7 version => urpmi trap divide / general protection errors in libdb-5.3.so during online upgrade mga6 -> 7

Comment 7 papoteur 2019-07-25 21:32:39 CEST
The journal shows that dnf-makecache was running in the same second that urpmi crashed.
I don't think this is a coincidence.


juil. 07 19:50:51 YZenbook.local systemd[1]: Starting dnf makecache...
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase systemd-units-230-12.3.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase util-linux-2.28.2-2.1.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local dbus[5536]: [system] Reloaded configuration
juil. 07 19:50:51 YZenbook.local dnf[5596]: /etc/host.conf: line 3: bad command `nospoof on'
juil. 07 19:50:51 YZenbook.local dnf[5596]: /etc/host.conf: line 4: bad command `spoofalert on'
juil. 07 19:50:51 YZenbook.local dbus[5536]: [system] Reloaded configuration
juil. 07 19:50:51 YZenbook.local dbus[5536]: [system] Reloaded configuration
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase systemd-230-12.3.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase libreoffice-pyuno-1:6.1.5.2-1.2.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase libreoffice-gtk3-1:6.1.5.2-1.2.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local [RPM][3022]: erase libreoffice-langpack-fr-1:6.1.5.2-1.2.mga6.x86_64: success
juil. 07 19:50:51 YZenbook.local kernel: traps: urpmi[3022] trap divide error ip:7feb27e6b876 sp:7fff9be23e50 error:0 in libdb-5.3.so (deleted)[7feb27d97000+1b1000]
juil. 07 19:50:51 YZenbook.local dnf[5596]: Metadata cache refreshed recently.
juil. 07 19:50:51 YZenbook.local systemd[1]: dnf-makecache.service: Succeeded.
juil. 07 19:50:51 YZenbook.local systemd[1]: Started dnf makecache.
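
The same check can be run mechanically over a saved journal. The sketch below is illustrative only: the function name is made up, it assumes the default short journal format (month, day, time, host, message), and a shared timestamp only shows that both events were logged in the same second, not that one caused the other.

```sh
# Read a saved journal on stdin and print each timestamp at which both a
# libdb trap and a "Starting dnf makecache" message were logged.
# Assumes the first three fields of every line form the timestamp.
traps_with_makecache() {
    awk '
        { ts = $1 " " $2 " " $3 }
        /Starting dnf makecache/  { started[ts] = 1 }
        /kernel: traps: .*libdb-/ { trapped[ts] = 1 }
        END { for (ts in trapped) if (ts in started) print ts }
    '
}
```

Typical use would be: traps_with_makecache < log.txt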

Keywords: NEEDINFO => (none)

papoteur 2019-07-25 21:58:10 CEST

Priority: Normal => High
See Also: (none) => https://bugs.mageia.org/show_bug.cgi?id=22487

Comment 8 Neal Gompa 2019-07-25 22:04:38 CEST
This is definitely a coincidence. DNF's makecache service has zero interaction with BDB in this manner. All it does is download metadata and process it to generate the cache.

CC: (none) => ngompa13

Comment 9 Morgan Leijström 2019-08-11 23:48:21 CEST
This should be in the errata until we solve it, right?
Some advanced users may try the upgrade even though it is not offered automatically.

Keywords: (none) => FOR_ERRATA7

Comment 10 Thierry Vignaud 2019-08-12 10:11:32 CEST
No, I don't think it is worth an errata entry.
1) I cannot reproduce it…
2) It looks like a very rare case: the upgrade fails because of missing space, so the upgrade transaction fails midway, which means that rpm pkgs are a bad mix, as the priority upgrade did not complete

What is the output of:
rpm -q rpm urpmi lib64db5.3

Keywords: (none) => NEEDINFO
CC: (none) => thierry.vignaud

Thierry Vignaud 2019-08-12 11:40:49 CEST

Keywords: FOR_ERRATA7 => (none)

Comment 11 r howard 2019-08-13 01:49:42 CEST
If it is a free space issue then maybe the upgrade process should first detect how much free space is available and if there is not enough, issue a message to that effect and how to rectify the situation and then exit.
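
A pre-flight check of that kind could look roughly like the sketch below. It is not what urpmi or mgaonline currently do; the function name has_free_space and any threshold passed to it are assumptions chosen for illustration.

```sh
# Succeed when the filesystem holding PATH has at least MIN_KB kilobytes
# available to unprivileged users, according to POSIX "df -P" output.
has_free_space() (
    path=$1
    min_kb=$2
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$min_kb" ]
)
```

For example, has_free_space /var/cache/urpmi/rpms 2000000 || echo "not enough space for downloaded packages" would warn below roughly 2 GB; the right figure depends on the size of the upgrade.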

CC: (none) => rihoward1

Comment 12 papoteur 2019-08-15 13:58:18 CEST
(In reply to Thierry Vignaud from comment #10)
> No, I don't think it is worth an errata entry.
> 1) I cannot reproduce it…
> 2) It looks like a very rare case: the upgrade fails because of missing space,
> so the upgrade transaction fails midway, which means that rpm pkgs are a bad
> mix, as the priority upgrade did not complete
> 
> What is the output of:
> rpm -q rpm urpmi lib64db5.3


rpm -q rpm urpmi lib64db5.3
rpm-4.14.2.1-12.mga7
urpmi-8.118-1.mga7
lib64db5.3-5.3.28-17.mga7

Lack of free space may be part of the explanation, combined with dnf-makecache.service: each time the error occurred, this service was running.
Shouldn't libdb catch such an out-of-space condition?
Comment 13 Jérôme Hénin 2019-09-01 13:47:32 CEST
I have the very same issues here. I first had the segfault in libdb when running rpm (through either urpmi or dnf), then "Installation failed" due to packages that need to be removed:


Packages python2-rpm-4.14.2.1-12.mga7.x86_64, perl-5.28.2-1.mga7.x86_64, rpm-build-4.14.2.1-12.mga7.x86_64, vim-enhanced-8.1.1048-1.mga7.x86_64 are already installed
The following packages have to be removed for others to be upgraded:
meta-task-7-1.1.mga7.noarch
 (in order to install meta-task-7-1.1.mga7.noarch)
perl-MDV-Distribconf-4.101.0-2.mga7.noarch
 (in order to install perl-MDV-Distribconf-4.101.0-2.mga7.noarch)
perl-base-5.28.2-1.mga7.x86_64
 (in order to install perl-base-5.28.2-1.mga7.x86_64)
python3-rpm-4.14.2.1-12.mga7.x86_64
 (in order to install python3-rpm-4.14.2.1-12.mga7.x86_64)
rpm-4.14.2.1-12.mga7.x86_64
 (in order to install rpm-4.14.2.1-12.mga7.x86_64)


installing meta-task-7-1.1.mga7.noarch.rpm perl-MDV-Distribconf-4.101.0-2.mga7.noarch.rpm python3-rpm-4.14.2.1-12.mga7.x86_64.rpm perl-base-5.28.2-1.mga7.x86_64.rpm rpm-4.14.2.1-12.mga7.x86_64.rpm from /var/cache/urpmi/rpms
Installation failed:    rpm = 1:4.13.1-3.2.mga6 is needed by (installed) python2-rpm-1:4.13.1-3.2.mga6.x86_64
        rpm = 1:4.13.1-3.2.mga6 is needed by (installed) rpm-build-1:4.13.1-3.2.mga6.x86_64
        perl-base = 2:5.22.3-3.2.mga6 is needed by (installed) perl-2:5.22.3-3.2.mga6.x86_64
        perl-base = 2:5.22.3 is needed by (installed) vim-enhanced-8.0.388-1.mga6.x86_64
        perl-base = 2:5.22.3-3.2.mga6 is needed by (installed) perl-2:5.22.3-3.2.mga6.x86_64
        perl-base = 2:5.22.3 is needed by (installed) vim-enhanced-8.0.388-1.mga6.x86_64
        rpm = 1:4.13.1-3.2.mga6 is needed by (installed) python2-rpm-1:4.13.1-3.2.mga6.x86_64
        rpm = 1:4.13.1-3.2.mga6 is needed by (installed) rpm-build-1:4.13.1-3.2.mga6.x86_64

CC: (none) => heninj

Comment 14 Jérôme Hénin 2019-09-01 13:48:29 CEST
I should have added that I have plenty of free disk space (17G on /).
Comment 15 papoteur 2019-09-01 14:28:59 CEST
(In reply to Marja Van Waes from comment #4)
> Please attach log.txt that is the result of running, as root:
> 
>    journalctl -a --since="YYYY-MM-DD hh:mm" --until="YYYY-MM-DD hh:mm" >
> log.txt
> 
> and adjust the --since time to right before you started to upgrade and the
> --until time to shortly after you finished upgrading and working around the
> issues.
> 

Hello Jérôme,
Can you attach the journal extract that Marja asked for above?
Comment 16 Jérôme Hénin 2019-09-01 22:54:07 CEST
(In reply to papoteur from comment #15) 
> Hello Jérôme,
> Can you attach the journal extract that Marja asked for above?

Sorry, unfortunately I cannot :-(
I had to reinstall the machine to get it in working order. This was not supposed to be an experimental box, this is my everyday production machine. I expected a smooth upgrade.
Comment 17 AL13N 2019-09-02 08:25:35 CEST
I did the upgrade, ... while it was not exactly smooth; it did run...

There were some conflicts though:
 - firefox (seemed to want to install v67 and complained about it conflicting with v68 (my installed version was v60))
 - after this, it wanted to remove some to be able to install others, which was fine for me.
 - i did notice that libreoffice and javatools was pulled in in the priority upgrade, which was odd...
 - due to using urpmi-proxy, i did have troubles with big data packages during the upgrade, i should really fix the timeout issues...

but, no segfault, or anything... i did have plenty of space.

I should note that i did this after logging out and in a tty.

CC: (none) => alien

Comment 18 Morgan Leijström 2019-09-26 11:14:31 CEST
Is this the same issue? :


Three days ago i attempted an upgrade in runlevel 3 having downloaded everything first using urpmi --test.

At 402/822 it errored: (translated from a mix of English and Swedish)

error: rpmdb: DB_LOCK->lock_put: Lock is no longer valid

error: db5-error(22) from dbcursor->c_close Illegal argument


Logged it by photo of screen.

64 bit system, i7-3770, 16GB RAM, partitions except boot are in LVM in LUKS on SSD


I had lots of space left everywhere; /,  /tmp, /var, /boot

And the system was running rock solid in every way before (it recently got a new CPU and RAM as the old CPU went rogue)


Issued rpm --rebuilddb, rebooted, retried (at runlevel 3), but it is busted:


Int64.c: loadable library and perl binaries are mismatched

As this is my workstation i did not have time to sort it out and instead did a fresh install.  It was about time anyway - last fresh was mga4 and i had a lot of junk.
Rémi Verschelde 2019-10-05 14:47:51 CEST

Blocks: (none) => 25476

Rémi Verschelde 2019-10-05 14:49:00 CEST

Blocks: 25476 => 25528

Comment 19 Morgan Leijström 2019-10-06 17:38:41 CEST
*** Bug 25531 has been marked as a duplicate of this bug. ***

CC: (none) => domisse

Comment 20 Martin Whitaker 2019-10-19 23:25:52 CEST
(In reply to Neal Gompa from comment #8)
> This is definitely a coincidence. DNF's makecache service has zero
> interaction with BDB in this manner. All it does is download metadata and
> process it to generate the cache.

strace -e trace=open dnf makecache tells me otherwise:

...
open("/var/lib/rpm/.dbenv.lock", O_RDWR|O_CREAT, 0644) = 3
open("/var/lib/rpm/DB_CONFIG", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/var/lib/rpm/__db.001", O_RDWR|O_CREAT|O_EXCL, 0644) = -1 EEXIST (File exists)
open("/var/lib/rpm/__db.001", O_RDWR)   = 4
open("/var/lib/rpm/__db.001", O_RDWR)   = 4
open("/var/lib/rpm/__db.002", O_RDWR|O_CREAT, 0644) = 5
open("/var/lib/rpm/__db.002", O_RDWR|O_CREAT, 0644) = 5
open("/var/lib/rpm/__db.003", O_RDWR|O_CREAT, 0644) = 6
...

And 'ls -l /var/lib/rpm' shows that it has modified the __db.* files.
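
The before/after comparison can also be scripted instead of reading 'ls -l' by eye. The helper below is a hypothetical sketch (the name db_files_changed_since is made up): touch a marker file, run the command under test, then list the region files that are newer than the marker.

```sh
# List rpmdb region files (__db.*) under DIR whose modification time is
# newer than that of MARKER, one path per line, sorted.
db_files_changed_since() {
    marker=$1
    dir=$2
    find "$dir" -name '__db.*' -newer "$marker" | sort
}
```

A possible sequence, run as root: touch /tmp/marker; dnf makecache; db_files_changed_since /tmp/marker /var/lib/rpm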

CC: (none) => mageia

Comment 21 Martin Whitaker 2019-10-19 23:37:19 CEST
And manually running 'dnf makecache' during the erase phase of the priority updates reproduces the error.
Comment 22 papoteur 2019-10-20 08:17:54 CEST
(In reply to Martin Whitaker from comment #21)
> And manually running 'dnf makecache' during the erase phase of the priority
> updates reproduces the error.

I just checked bug 25531, and yes, there is also a run of dnf-makecache just before the rpm db error.
Thank you, Martin, for confirming what I suspected in comment 7.
I will update the wiki/release notes.
Do you think it is possible to force-disable it before launching the upgrade, so that the error never occurs even if the user forgot about dnf?
Comment 23 Thomas Backlund 2019-10-20 11:02:23 CEST
For anyone not using dnf, simply urpme dnf and be done with it...

For those wanting to keep it around, the alternative would be:

systemctl stop dnf-makecache && systemctl disable dnf-makecache && systemctl mask dnf-makecache  

and then the timer:

systemctl stop dnf-makecache.timer && systemctl mask dnf-makecache.timer && systemctl daemon-reload
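
A less permanent variant is to stop the timer only for the duration of the upgrade and start it again afterwards. The wrapper below is a sketch under that assumption; the function name and the SYSTEMCTL override are made up for illustration, and it does not disable or mask anything.

```sh
# SYSTEMCTL can be overridden (for example with "echo") to preview the calls.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Stop dnf-makecache.timer and the service, run the given command, then
# start the timer again if it was active beforehand.
# Returns the exit status of the wrapped command.
without_dnf_makecache() {
    was_active=no
    if $SYSTEMCTL is-active --quiet dnf-makecache.timer; then
        was_active=yes
    fi
    $SYSTEMCTL stop dnf-makecache.timer dnf-makecache.service
    "$@"
    status=$?
    if [ "$was_active" = yes ]; then
        $SYSTEMCTL start dnf-makecache.timer
    fi
    return $status
}
```

It would be used as root, for example: without_dnf_makecache urpmi --auto-update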

CC: (none) => tmb

Comment 24 papoteur 2019-10-20 11:11:00 CEST
(In reply to papoteur from comment #22)
> Do you think it is possible to force-disable it before launching the
> upgrade, so that the error never occurs even if the user forgot about dnf?

I would suggest inserting the commands in http://gitweb.mageia.org/software/mgaonline/tree/mgaapplet-upgrade-helper#n210
Comment 25 Martin Whitaker 2019-10-20 13:14:57 CEST
Changing the upgrade applet won't protect people who upgrade using urpmi.

IIUC, if the dnf package deleted /var/cache/dnf when it was upgraded, that would stop the dnf-makecache service doing anything until dnf was (next) used.

Neal, WDYT?
Thierry Vignaud 2019-10-28 16:31:47 CET

Source RPM: urpmi => urpmi, dnf

Comment 26 Thierry Vignaud 2019-10-28 16:33:36 CET
On the other hand, people manually running urpmi probably know what to do.
I now understand why I never saw this bug: I don't install dnf.
Comment 27 Neal Gompa 2019-11-03 19:08:39 CET
(In reply to Martin Whitaker from comment #20)
> (In reply to Neal Gompa from comment #8)
> > This is definitely a coincidence. DNF's makecache service has zero
> > interaction with BDB in this manner. All it does is download metadata and
> > process it to generate the cache.
> 
> strace -e trace=open dnf makecache tells me otherwise:
> 
> ...
> open("/var/lib/rpm/.dbenv.lock", O_RDWR|O_CREAT, 0644) = 3
> open("/var/lib/rpm/DB_CONFIG", O_RDONLY) = -1 ENOENT (No such file or
> directory)
> open("/var/lib/rpm/__db.001", O_RDWR|O_CREAT|O_EXCL, 0644) = -1 EEXIST (File
> exists)
> open("/var/lib/rpm/__db.001", O_RDWR)   = 4
> open("/var/lib/rpm/__db.001", O_RDWR)   = 4
> open("/var/lib/rpm/__db.002", O_RDWR|O_CREAT, 0644) = 5
> open("/var/lib/rpm/__db.002", O_RDWR|O_CREAT, 0644) = 5
> open("/var/lib/rpm/__db.003", O_RDWR|O_CREAT, 0644) = 6
> ...
> 
> And 'ls -l /var/lib/rpm' shows that it has modified the __db.* files.

I think this is happening because the dnf makecache action is regenerating the solver cache of the rpmdb (@System.solv* files).

I'm currently investigating if the patches in libdnf and dnf upstream that disable generating the @System.solv cache would eliminate this problem.

If it does, then I could attempt backporting this to Mageia 6's DNF to resolve the problem.
Comment 28 Martin Whitaker 2019-11-03 19:22:27 CET
(In reply to Neal Gompa from comment #27)
> If it does, then I could attempt backporting this to Mageia 6's DNF to
> resolve the problem.

I believe the error occurs after dnf has been upgraded, so wouldn't the fix be needed in Mageia 7's DNF?
Comment 29 Thomas Backlund 2019-11-03 22:10:38 CET
Just face it, 

it's way safer and simpler to tell people who want to upgrade with urpmi or mgaonline to "urpme dnf" first.

If they want dnf in mageia 7 they can re-install it after.
Comment 30 Morgan Leijström 2019-11-03 22:20:22 CET
Could the update app be made to uninstall dnf?
Comment 31 Neal Gompa 2019-11-05 00:36:16 CET
(In reply to Martin Whitaker from comment #28)
> (In reply to Neal Gompa from comment #27)
> > If it does, then I could attempt backporting this to Mageia 6's DNF to
> > resolve the problem.
> 
> I believe the error occurs after dnf has been upgraded, so wouldn't the fix
> be needed in Mageia 7's DNF?

It'd be needed on both sides, because urpmi does not lock the rpmdb persistently due to its split transaction behavior.

The reason this even happens is because you can get unlucky with urpmi releasing and opening the rpmdb multiple times throughout a urpmi run. Since it doesn't persistently lock it throughout the entire run (it can't, since it does multiple transactions...), this can happen since dnf can be in a situation where it's not told the rpmdb is locked.

(In reply to Thomas Backlund from comment #29)
> Just face it, 
> 
> it's way safer and simpler to tell people who want to upgrade with urpmi or
> mgaonline to "urpme dnf" first.
> 
> If they want dnf in mageia 7 they can re-install it after.

The "workaround" solution would be to just do "systemctl stop dnf-makecache.timer" before starting the urpmi run. That will prevent it from running for the duration of the system session.
Comment 32 Thierry Vignaud 2019-11-05 09:23:55 CET
That's a joke as far as urpmi is concerned: see comment #21:

> And manually running 'dnf makecache' during the erase phase of the priority
> updates reproduces the error.

AKA there's only one transaction there (the priority upgrade one) and dnf makecache wreaks havoc while it's still running

Also urpmi locks /var/lib/rpm/.RPMLOCK in exclusive mode for its entire duration.
It locks with:
my $rpm_lock = !$env && !$options{nolock} && urpm::lock::rpm_db($urpm, 'exclusive', wait => $options{wait_lock});

 => flock(5, LOCK_EX|LOCK_NB)               = 0

    $rpm_lock->unlock if $rpm_lock;
being called just before exiting.

Dnf should be patched to lock that file too...
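
To illustrate what taking the same lock from outside urpmi might look like, here is a sketch using the flock(1) utility from util-linux. It is not something dnf does today; the function name is made up, and it assumes that an flock(2) lock on the file, as shown in the strace line above, is what other tools would need to respect.

```sh
# Run a command only while holding an exclusive, non-blocking flock(2) lock
# on LOCKFILE. If another process already holds the lock, the command is
# not run and flock exits with a non-zero status.
with_exclusive_lock() {
    lockfile=$1
    shift
    flock -x -n "$lockfile" "$@"
}
```

For example, with_exclusive_lock /var/lib/rpm/.RPMLOCK dnf makecache would refuse to start while a urpmi run holds that lock, under the assumption stated above.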
Comment 33 papoteur 2019-11-14 17:47:51 CET
(In reply to Neal Gompa from comment #31)

> 
> The "workaround" solution would be to just do "systemctl stop
> dnf-makecache.timer" before starting the urpmi run. That will prevent it
> from running for the duration of the system session.

So this has to be implemented in mgaonline, and we could then open the upgrade path. It could perhaps be a specific branch for Mageia 6.
Who can do that?

I previously pointed to this place:
http://gitweb.mageia.org/software/mgaonline/tree/mgaapplet-upgrade-helper#n210
Comment 34 Thierry Vignaud 2019-11-15 18:17:22 CET
Nope, dnf must be fixed
Comment 35 Neal Gompa 2019-11-19 01:36:53 CET
(In reply to Thierry Vignaud from comment #32)
> That's a joke as far as urpmi is concerned: see comment #21:
> 
> > And manually running 'dnf makecache' during the erase phase of the priority
> > updates reproduces the error.
> 
> AKA there's only one transaction there (the priority upgrade one) and dnf
> makecache wreaks havoc while it's still running
> 
> Also urpmi locks /var/lib/rpm/.RPMLOCK in exclusive mode for its entire
> duration.
> It locks with:
> my $rpm_lock = !$env && !$options{nolock} && urpm::lock::rpm_db($urpm,
> 'exclusive', wait => $options{wait_lock});
> 
>  => flock(5, LOCK_EX|LOCK_NB)               = 0
> 
>     $rpm_lock->unlock if $rpm_lock;
> being called just before exiting.
> 
> Dnf should be patched to lock that file too...

I've read the code in urpmi for the .RPMLOCK file. It's a bogus lock (in the rpmdb sense). It doesn't lock the rpmdb, it only prevents urpmi from running concurrently. So you're saying I should have DNF recognize and create the urpmi lock? It's not like DNF isn't creating the standard rpmdb lock, so why isn't urpmi doing the same?

When DNF creates its lock, it also initializes an rpmdb lock in /var/lib/rpm/.dbenv.lock through the librpm API. It also creates an rpmdb lock in /var/lib/dnf/rpmdb_lock.pid.

It additionally creates a DNF lock in /var/cache/dnf/metadata_lock.pid during cache regeneration.

Finally, there's a transaction lock created in /var/lib/dnf/rpmdb_lock.pid during a transaction.

(In reply to Thierry Vignaud from comment #34)
> Nope, dnf must be fixed

But if *you* think DNF should make that lock, then we could add a DNF plugin to do so. What behavior do you want? Check if it exists and quit, and if it doesn't exist, create the file to prevent urpmi racing against it?
Martin Whitaker 2019-11-20 11:25:53 CET

See Also: (none) => https://bugs.mageia.org/show_bug.cgi?id=25711

Comment 36 Martin Whitaker 2019-11-20 11:31:05 CET
As discussed and agreed on dev@ml, the workaround has been added to mgaonline in Mageia 6. It would be better to implement proper locking between urpmi and dnf for Mageia 7, as the bug also affects urpmi upgrades performed from the command line.

Blocks: 25528 => (none)

Comment 37 Thierry Vignaud 2019-11-20 13:04:28 CET
(In reply to Neal Gompa from comment #35)
> I've read the code in urpmi for the .RPMLOCK file. It's a bogus lock (in the
> rpmdb sense). It doesn't lock the rpmdb, it only prevents urpmi from running
> concurrently. So you're saying I should have DNF recognize and create the
> urpmi lock? It's not like DNF isn't creating the standard rpmdb lock, so why
> isn't urpmi doing the same?
>
> When DNF creates its lock, it also initializes an rpmdb lock in
> /var/lib/rpm/.dbenv.lock through the librpm API. It also creates an rpmdb
> lock in /var/lib/dnf/rpmdb_lock.pid.

a "bogus lock (in the rpmdb sense)"???
Sigh.
Please read again before making such offensive comments.
Just look at the urpmi & rpmlib code or try strace.
The rpmdb is locked when running the transaction.

Or just look at "perldoc urpm::lock"
and see that we have both rpm_db() & urpmi_db().

urpmi *BOTH* locks the rpmdb & takes a larger/surrounding lock on .RPMLOCK
Comment 38 Aurelien Oudelet 2020-09-19 18:09:09 CEST
Hi,
This is a High priority bug for a good reason.

Making Mageia better than ever is the best direction.
To do the right thing, this bug should be examined and fixed as soon as possible.

Packagers, please set the status to Assigned when you are working on this.
Feel free to reassign the bug if it was badly triaged. Also, if the bug is old, please close it.

On October 1st 2020, we will drop the priority to normal.
Comment 39 papoteur 2021-01-27 07:19:43 CET
Hello,
Is there anything to check about this bug for the upcoming Mageia 8?
Comment 40 Morgan Leijström 2021-01-27 10:54:51 CET
I think we should update the upgrade instructions in the release notes to tell users to uninstall dnf (easiest for every user) if an upgrade method other than dnf is used.

OR

We set this bug to release blocker and solve it.

Keywords: (none) => FOR_RELEASENOTES8

Comment 41 Aurelien Oudelet 2021-01-27 12:00:16 CET
(In reply to papoteur from comment #39)
> Hello,
> Is there anything to check about this bug for the upcoming Mageia 8?

I tested upgrading an M7 Plasma system in a VM to M8 via the mgaapplet tool (as discussed on the QA ML).
I did not shut down the dnf makecache timer myself.
The system upgraded quickly because I have a very good Internet connection, and all RPMs were upgraded before the timer triggered...

I should test letting the VM run under M7 for 50 minutes, then run the upgrade and see whether the timer is triggered.


Why not just deactivate the .timer while mgaapplet/urpmi is running, instead of uninstalling dnf?

Keywords: FOR_RELEASENOTES8, NEEDINFO => (none)
CC: (none) => ouaurelien

Comment 42 Morgan Leijström 2021-03-10 19:08:38 CET
Is there something to improve in
https://wiki.mageia.org/en/Mageia_8_Release_Notes#Upgrading_from_Mageia_7
regarding this bug?
Comment 43 papoteur 2021-03-11 09:33:48 CET
(In reply to Morgan Leijström from comment #42)
> Is there something to improve in
> https://wiki.mageia.org/en/Mageia_8_Release_Notes#Upgrading_from_Mageia_7
> regarding this bug?

Hi Morgan,
I think the trap can also catch those upgrading through mgaonline.
Comment 44 Morgan Leijström 2021-03-11 10:50:00 CET
So what should we recommend?
The easiest for normal users seems to be to check whether dnf is installed and, if so, uninstall it?

But of course it would be even better if a fix could go out in mga7 updates, maybe one that deactivates that timer?
Comment 45 Morgan Leijström 2021-04-07 23:20:19 CEST
(In reply to papoteur from comment #43)
> I think the trap can also catch those upgrading through mgaonline.

As I read it, mgaonline got fixed:

(In reply to Martin Whitaker from comment #36)
> As discussed and agreed on dev@ml, the workaround has been added to
> mgaonline in Mageia 6.

> It would be better to implement proper locking
> between urpmi and dnf for Mageia 7, as the bug also affects urpmi upgrades
> performed from the command line.

   -> yes, i set it as status.

For now it is in release notes, as said.

Status comment: (none) => implement proper locking between urpmi and dnf
Version: 7 => 8
Keywords: (none) => IN_RELEASENOTES8
Whiteboard: (none) => MGA7TOO

Morgan Leijström 2022-06-01 00:25:32 CEST

Target Milestone: --- => Mageia 9

Comment 46 Morgan Leijström 2022-12-13 18:25:55 CET
Any news?

Whiteboard: MGA7TOO => MGA8TOO
Version: 8 => Cauldron

