Bug 26827 - urpmi fails due to erroneous downloads in cache, normal users get stuck
Summary: urpmi fails due to erroneous downloads in cache, normal users get stuck
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: High enhancement
Target Milestone: ---
Assignee: Mageia tools maintainers
QA Contact:
URL:
Whiteboard: MGA7TOO
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-19 17:02 CEST by Morgan Leijström
Modified: 2020-07-02 09:39 CEST

See Also:
Source RPM: urpmi
CVE:
Status comment:


Attachments

Description Morgan Leijström 2020-06-19 17:02:09 CEST
Created from https://ml.mageia.org/l/arc/dev/2020-06/msg00309.html


Description of problem:
When an update fails, users restart a new update and find the same error.
They think that a new install is necessary.
There is no hint given to the user about how to solve it.
"package xxx does not verify: Payload SHA256 digest: BAD" is not much help...


How reproducible, Steps to Reproduce:
This may happen on a network break.


Manual workaround:
It is enough to do "urpmi --clean" to delete bad rpms in cache.

   * BUT NORMAL USERS DO NOT KNOW *  and think the system is corrupt.


Suggestion:
If urpmi detects that rpms from cache are bad, it should delete them and retry.

If an automatic clean can be done without drawback, please implement it.

Another way is to ask the user on error, if they would like to retry with clean cache.

A third way is to use a checkbox "clean the cache".
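
A minimal sketch of the first suggestion, assuming that `rpm -K --nosignature` (digest check only) is an acceptable test for a corrupt cached file. The function names `verify_rpm` and `prune_bad_rpms` are made up for illustration and are not part of urpmi; the cache path is the one used elsewhere in this report.

```shell
# Hypothetical helper: succeeds when the package digests verify.
# rpm -K is rpm --checksig; --nosignature limits the check to digests.
verify_rpm() {
    rpm -K --nosignature "$1" >/dev/null 2>&1
}

# Delete only the cached rpms that fail verification, keeping good ones.
# $1 = cache directory (defaults to the urpmi cache named in this report).
prune_bad_rpms() {
    dir="${1:-/var/cache/urpmi/rpms}"
    for f in "$dir"/*.rpm; do
        [ -e "$f" ] || continue          # no rpms in the directory
        if ! verify_rpm "$f"; then
            echo "removing bad cached package: $f"
            rm -f -- "$f"
        fi
    done
}
```

Run as root, `prune_bad_rpms` followed by the original urpmi command would re-download only the files that were removed, instead of emptying the whole cache.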


Examples from bugzilla, mistaken for other faults
 - even us more advanced users waste time on this trap:
Bug 26159
Bug 26323
Bug 25949
Bug 25650
Bug 25111
Bug 25647


Side note from the mail thread:
It seems that bad rpms in the cache are much more frequent than in the past. It would certainly be worth finding out the main
reason.
Morgan Leijström 2020-06-19 17:06:02 CEST

Assignee: bugsquad => mageiatools
Priority: Normal => High
Whiteboard: (none) => MGA7TOO

Comment 1 Morgan Leijström 2020-06-19 18:28:29 CEST
Also from that mail thread:

"
One way to avoid network errors is to put «download-all» on a single line by itself inside the top curly brackets in /etc/urpmi/urpmi.cfg
Then the installation won't start until all rpms are downloaded.
This presumes that the cache directory has enough free space to hold all the needed rpms in one go.

I.e.
{
download-all
}

  "


Maybe that can be made a check box in some GUI?
Set the checkbox according to file, then let user toggle it.
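
A rough sketch of how a GUI could read the initial checkbox state, assuming the option always appears on a line by itself as described above. The function name `has_download_all` is hypothetical, and the check does not confirm that the line sits inside the top curly brackets.

```shell
# Succeeds if the given urpmi config file contains a line consisting
# only of "download-all" (leading/trailing whitespace allowed).
# $1 = config file, defaulting to /etc/urpmi/urpmi.cfg.
has_download_all() {
    grep -Eq '^[[:space:]]*download-all[[:space:]]*$' "${1:-/etc/urpmi/urpmi.cfg}"
}
```

For example, `has_download_all && echo "checked" || echo "unchecked"` would give the state to show when the dialog opens.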
Comment 2 Lewis Smith 2020-06-19 21:47:56 CEST
We have had similar bugs before, where pkg installations or updates repeatedly fail because of a download corruption which stays in the rpm cache; resolved as Morgan notes by clearing the cache:
 # urpmi --clean
and re-starting the operation.
Comment 3 Pierre Jarillon 2020-06-20 00:56:05 CEST
# urpmi --clean is the solution, but how can a new user guess this?
If a corrupt package is found it should be deleted. If not, each new update is locked.

CC: (none) => jarillon

Comment 4 Dave Hodgins 2020-06-20 01:44:31 CEST
The problem is that urpmi doesn't know if the package is corrupt because the
download was interrupted (so not all there), corrupted during download, or if
the package on the mirror is corrupt.

We don't want to repeatedly download the package if it's corrupt on the mirror,
as that could cost people money for exceeding data caps. That's one of the
major problems I've seen with windows.

If the file download was interrupted, the choice should be to resume. For
corrupted during download, the only option is to clean. For corrupt on the
mirror, it should be reported and then left until it's fixed.

We need a better way to ensure users are informed of the situation and the
options.

Perhaps add a file with the explanation, and alter the message to display that
file, either on the terminal output or in a gui dialog with the option to
either try resuming the download or to clean the cache, and explaining that
if the error occurs again, it should be reported.

CC: (none) => davidwhodgins

aguador 2020-06-20 08:25:05 CEST

CC: (none) => waterbearer54

Comment 5 Bernard SIAUD 2020-06-20 11:10:36 CEST
With my cauldron, I sometimes have a similar problem.
When an rpm is not good, or when some dependencies are missing, nothing comes up. It would be good to install rpm anyway.

CC: (none) => liste

Comment 6 papoteur 2020-06-20 11:47:42 CEST
From Liam Quin:
>
>> This happens on a network break.
>
> One way to avoid network errors is to put «download-all» on a single
> line by itself inside the top curly brackets in /etc/urpmi/urpmi.cfg
> Then the installation won't start until all rpms are downloaded.
> This presumes that the cache directory has enough free space to hold
> all the needed rpms in one go.

I don't think that's a good presumption - and if it is wrong, a full /
partition is also a difficult problem for a new user.

A laptop, often with a small root partition, might travel a lot,
perhaps being restarted several times a day for a student.

The answer is that if an rpm file is detected as corrupted, then
(1) it's incomplete, should be in the "partial" folder, and could be
resumed automatically, or
(2) it's corrupted and should be deleted, or
(3) you need to upgrade rpm itself as there was a backwards
incompatibility introduced, or
(4) it's corrupted on the mirror(s).

In cases (3) and (4) it is necessary to skip the file and move on.
In case (1) urpmi should resume automatically.
Detecting the difference between case (2), local corruption, and case
(4), server-side corruption, maybe means checking the checksum from the
server first and trying again only if they differ, right?
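
The decision between cases (1), (2) and (4) could be sketched as below. This assumes the mirror publishes a size and a checksum for each package, which this thread does not confirm; `classify_bad_rpm` and its action names are hypothetical. Case (3), an rpm incompatibility, cannot be detected this way and is left out.

```shell
# Decide what to do with a cached rpm that failed verification.
# $1 = local file size, $2 = size announced by the mirror,
# $3 = local checksum,  $4 = checksum announced by the mirror.
# Prints one of: resume, redownload, skip-and-report.
classify_bad_rpm() {
    local_size="$1"; expected_size="$2"
    local_sum="$3";  mirror_sum="$4"
    if [ "$local_size" -lt "$expected_size" ]; then
        echo resume              # case (1): incomplete download
    elif [ "$local_sum" != "$mirror_sum" ]; then
        echo redownload          # case (2): corrupted locally
    else
        echo skip-and-report     # case (4): same bytes as the mirror
    fi
}
```

With matching sizes but different checksums, for instance `classify_bad_rpm 200 200 aaa bbb`, the sketch chooses `redownload`.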

CC: (none) => yves.brungard_mageia

Comment 7 Nicolas Lécureuil 2020-06-20 23:45:55 CEST
Way to reproduce:

Pick a package you don't have installed. I chose 0ad-data because it was at the top of the list (and I have a local repo, so don't care how big it is). Then:

# urpmi --noinstall 0ad-data
# echo bad >> /var/cache/urpmi/rpms/0ad-data-0.0.23b-1.mga7.noarch.rpm
# urpmi --test 0ad-data
installing 0ad-data-0.0.23b-1.mga7.noarch.rpm from /var/cache/urpmi/rpms
Preparing... #########################################################################################################################################
Installation failed:    package 0ad-data-1:0.0.23b-1.mga7.noarch does not verify: Payload SHA256 digest: BAD (Expected 9d1d05c953161efc409127674c852bf2fd2d30c9285eb6d7fe9129e7198e5035 != dfaceab15dcd59f2f91c9b2730d57c8487b2e6bd0585966783ced62eb5e372cf)

CC: (none) => mageia

Marc Mascré 2020-06-21 10:26:10 CEST

CC: (none) => marc

Comment 8 Lewis Smith 2020-06-22 21:41:06 CEST
Well, Morgan has certainly started a worms' nest!
I think Papoteur's comment 6 gets nearest. The root of the problem is that when urpmi detects a bad checksum, it just gives up. We know that this is most common with bad downloads, and the neat solution is to delete the offending package from the cache, & re-download it.
[--clean is a sledgehammer to crack a nut, re-downloading a lot of OK stuff].

If the re-download yields the same failure, what then?
Remove the package from the list (which should take with it any dependent pkgs), inform the user, and do the others.
User to inform Mageia about the bad package on the mirror.
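
The flow described in this comment could look roughly like the sketch below. It is only an illustration: `verify_pkg`, `fetch_pkg` and `ensure_pkg` are hypothetical helpers, not urpmi functions, and a plain curl download stands in for whatever urpmi really uses. The single retry keeps a package that is bad on the mirror from being fetched over and over.

```shell
# Hypothetical helpers standing in for urpmi internals.
verify_pkg() { rpm -K --nosignature "$1" >/dev/null 2>&1; }
fetch_pkg()  { curl -fsS -o "$2" "$1"; }    # $1 = URL, $2 = destination

# Returns 0 if the package ends up valid, 1 if it should be dropped
# from the transaction and reported.
# $1 = package URL, $2 = path of the cached file.
ensure_pkg() {
    url="$1"; file="$2"
    verify_pkg "$file" && return 0
    rm -f -- "$file"                        # delete only the offending file
    fetch_pkg "$url" "$file" || return 1    # one re-download, no loop
    verify_pkg "$file" && return 0
    rm -f -- "$file"
    echo "skipping $file: still bad after one re-download, please report it" >&2
    return 1
}
```

A caller would remove the package, and anything depending on it, from the install list when `ensure_pkg` returns 1, then carry on with the rest.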
Comment 9 papoteur 2020-07-02 09:39:51 CEST
(In reply to Lewis Smith from comment #8)
> I think Papoteur's comment 6 gets nearest.
I just reported what Liam Quin commented.
