Bug 25992 - perl-URPM bug fix: add support for reading zstd compressed files
Summary: perl-URPM bug fix: add support for reading zstd compressed files
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: 7
Hardware: All Linux
Priority: Normal normal
Target Milestone: ---
Assignee: QA Team
QA Contact:
URL:
Whiteboard: MGA7-64-OK
Keywords: advisory, validated_update
Depends on: 25979
Blocks:
 
Reported: 2019-12-30 03:34 CET by Thierry Vignaud
Modified: 2020-01-07 22:21 CET

See Also:
Source RPM: rpm-4.14.2.1-13.mga7
CVE:
Status comment:


Attachments
enable to alter hdlist compressor in genhdlist2 (not really wanted) (2.08 KB, patch)
2020-01-06 23:52 CET, Thierry Vignaud

Description Thierry Vignaud 2019-12-30 03:34:33 CET
Advisory:
========================
This update of perl-URPM enables urpmi to parse zstd-compressed media metadata, which makes it possible to switch mga8 to zstd-compressed metadata.

========================

Updated packages in core/updates_testing:
========================
perl-URPM-5.23-1.mga7
Thierry Vignaud 2019-12-30 03:35:04 CET

Assignee: bugsquad => qa-bugs

Comment 1 PC LX 2019-12-31 14:17:45 CET
Installed and tested without issue.

The package perl-URPM was updated along with the rpm package updates. 
Tested with urpmi and rpmdrake tools. No issues found.


----------------
To satisfy dependencies, the following packages are going to be installed:
  Package                        Version      Release       Arch    
(medium "Core Updates Testing")
  lib64rpm8                      4.14.2.1     13.mga7       x86_64  
  perl-URPM                      5.23         1.mga7        x86_64  
  python2-rpm                    4.14.2.1     13.mga7       x86_64  
  python3-rpm                    4.14.2.1     13.mga7       x86_64  
  rpm                            4.14.2.1     13.mga7       x86_64  
  rpm-plugin-ima                 4.14.2.1     13.mga7       x86_64  
  rpm-plugin-syslog              4.14.2.1     13.mga7       x86_64  
  rpm-plugin-systemd-inhibit     4.14.2.1     13.mga7       x86_64  
389KB of disk space will be freed.
996KB of packages will be retrieved.
Proceed with the installation of the 8 packages? (Y/n) 
----------------

$ uname -a
Linux marte 5.4.6-desktop-2.mga7 #1 SMP Mon Dec 23 12:05:27 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ rpm -q perl-URPM
perl-URPM-5.23-1.mga7

CC: (none) => mageia

Comment 2 Thomas Andrews 2020-01-05 16:47:04 CET
Sending this along based on the test in Comment 1.

Validating. Advisory in Comment 0.

CC: (none) => andrewsfarm, sysadmin-bugs
Keywords: (none) => validated_update
Whiteboard: (none) => MGA7-64-OK

Comment 3 Thomas Backlund 2020-01-05 23:55:30 CET
Unvalidating, as I want to test the metadata first

But I realized our genhdlist2 only creates gzip hdlists regardless of flags...
Only synthesis format changes.

 genhdlist2 --xml-info-filter ".zstd:zstd -19 -T8" --synthesis-filter ".zstd:zstd -19 -T8" .
adding 1 new rpms not available in existing hdlist
replacing ./media_info/hdlist.cz with hdlist.cz.tmp
replacing ./media_info/synthesis.hdlist.zstd with synthesis.hdlist.zstd.tmp
updating ./media_info/MD5SUM

$ file media_info/*
media_info/hdlist.cz:             gzip compressed data, max compression, from Unix, original size modulo 2^32 2053332272 gzip compressed data, unknown method, has CRC, was "", has comment, encrypted, from FAT filesystem (MS-DOS, OS/2, NT), original size modulo 2^32 2053332272
media_info/MD5SUM:                ASCII text
media_info/synthesis.hdlist.zstd: Zstandard compressed data (v0.8+), Dictionary ID: None



So I can't verify the hdlist part; only the synthesis is tested for now...



We need to fix genhdlist2.

And we need to decide on the hdlist/synthesis name, as it currently does not autodetect .zstd

LANG=C urpmi.addmedia test /home/tmb/x86_64  
adding medium "test"
adding 1 new rpms not available in existing hdlist
replacing /var/cache/urpmi/partial/synthesis.hdlist.cz with synthesis.hdlist.cz.tmp
updating /var/cache/urpmi/partial/MD5SUM



Only if I add full media_info stuff:
# LANG=C urpmi.addmedia test /home/tmb/x86_64  with media_info/synthesis.hdlist.zstd
adding medium "test"


or is it simpler to use ".cz" regardless of compression format ?

CC: (none) => tmb
Keywords: validated_update => feedback

Comment 4 Thomas Andrews 2020-01-06 00:20:06 CET
(In reply to Thomas Backlund from comment #3)
> Unvalidating, as I want to test the metadata first
> 
Good to know you have our backs when we are out of our depth, Thomas.

Thanks.
Comment 5 Thomas Backlund 2020-01-06 00:36:11 CET
(In reply to Thomas Backlund from comment #3)

> 
> But I realized our genhdlist2 only creates gzip hdlists regardless of
> flags...
> Only synthesis format changes.
> 

yeah, gzip is hardcoded in "sub build {}"

I guess we need to add some "--hdlist-filter" flag to override it
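
As a rough illustration of what such a flag would have to handle: the filter arguments in comment 3 use a `SUFFIX:COMMAND` form (for example `.zstd:zstd -19 -T8`). The sketch below splits such a spec into a suffix and an argument list. It is not genhdlist2 code; `parse_filter` is a made-up name, the parsing rules are an assumption, and `--hdlist-filter` remains a hypothetical flag.

```python
import shlex
from typing import List, Tuple


def parse_filter(spec: str) -> Tuple[str, List[str]]:
    """Split 'SUFFIX:COMMAND ARGS' into the suffix and an argv list.

    Assumption: the first ':' separates the suffix from the command.
    """
    suffix, sep, command = spec.partition(":")
    if not sep or not suffix or not command.strip():
        raise ValueError(f"expected SUFFIX:COMMAND, got {spec!r}")
    return suffix, shlex.split(command)


# Spec copied from the genhdlist2 invocation in comment 3.
suffix, argv = parse_filter(".zstd:zstd -19 -T8")
print(suffix, argv)
```

The argument list could then be handed to a subprocess whose stdin receives the uncompressed hdlist, which is how an external compressor is usually driven.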
Comment 6 Thomas Backlund 2020-01-06 00:40:52 CET
(In reply to Thomas Andrews from comment #4)
> (In reply to Thomas Backlund from comment #3)
> > Unvalidating, as I want to test the metadata first
> > 
> Good to know you have our backs when we are out of our depth, Thomas.
> 
> Thanks.

No worries :)
Comment 7 Thomas Backlund 2020-01-06 13:12:28 CET
(In reply to Thomas Backlund from comment #3)
> We need to fix genhdlist2.
> 
> and we need to decide on the hdlist/synthesis name, as it currently does not
> autodetect .zstd

> 
> or is it simpler to use ".cz" regardless of compression format ?

Looking at current cauldron repo, we have in media_info for core:

20200106-103818-changelog.xml.lzma@
20200106-103818-files.xml.lzma@
20200106-103818-hdlist.cz@
20200106-103818-info.xml.lzma@
20200106-103818-synthesis.hdlist.cz@
changelog.xml.lzma
files.xml.lzma
hdlist.cz
info.xml.lzma
MD5SUM
pubkey
synthesis.hdlist.cz


Should we unify everything on zstd compression? And do we name the files .zstd or .cz?
Comment 8 Thierry Vignaud 2020-01-06 23:16:51 CET
We don't really care about hdlists.
Only synthesis files are used by urpmi for dependency computation.
hdlists are more for heavy querying.

.cz is just an old naming convention; the format is autodetected.
We already switched from gzip to lzma and then to xz; we don't want to rename the files, because we have assumptions about them everywhere.
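
To illustrate why the suffix does not matter when the format is autodetected: a reader can look at the first bytes of the file instead of its name. The sketch below is not URPM's implementation. `detect_compression` and `detect_file` are hypothetical names, and only the standard magic numbers of gzip, xz, zstd and bzip2 are checked (legacy .lzma streams have no reliable magic and are reported as unknown here).

```python
# Hypothetical sketch: guess the compressor from magic bytes, ignoring the suffix.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"BZh", "bzip2"),
]


def detect_compression(head: bytes) -> str:
    """Return the compressor name matching the leading bytes, or 'unknown'."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "unknown"


def detect_file(path: str) -> str:
    """Read just enough of the file to compare against the longest magic."""
    with open(path, "rb") as fh:
        return detect_compression(fh.read(6))
```

With this approach a `synthesis.hdlist.cz` file is handled the same way whether it was written by gzip, xz or zstd.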
Comment 9 Thierry Vignaud 2020-01-06 23:52:28 CET
Created attachment 11447 [details]
enable to alter hdlist compressor in genhdlist2 (not really wanted)

which shows that URPM would need some patching to read hdlists compressed with anything other than gzip.
I could do it, but I doubt it is worth it…
Comment 10 Thomas Backlund 2020-01-07 01:15:24 CET
ok, so to sum up

hdlist is still .cz, still compressed with gzip
xml files are still .lzma, still compressed with xz
synthesis is also still .cz, now compressed with zstd (for mga8 that is)

that way we keep potential breakage to a minimum / non-existent ...
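
The layout summarised above can be written down as a small table, which makes it easy to check a media_info directory against it. This is only a sketch: the file names are the ones listed for cauldron in comment 7, `EXPECTED` and `mismatches` are hypothetical names, and the detected compressor for each file is assumed to come from some other tool (for example `file`).

```python
# Intended mga8 layout: file name -> compressor actually used (suffix is unchanged).
EXPECTED = {
    "hdlist.cz": "gzip",
    "synthesis.hdlist.cz": "zstd",
    "info.xml.lzma": "xz",
    "files.xml.lzma": "xz",
    "changelog.xml.lzma": "xz",
}


def mismatches(detected: dict) -> dict:
    """Return {name: (expected, found)} for files whose compressor differs.

    Files that are not part of the table (MD5SUM, pubkey, ...) are ignored.
    """
    return {
        name: (EXPECTED[name], found)
        for name, found in detected.items()
        if name in EXPECTED and EXPECTED[name] != found
    }
```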
Comment 11 Thierry Vignaud 2020-01-07 01:56:13 CET
The URPM issue is fixed in http://gitweb.mageia.org/software/rpm/perl-URPM/commit/?id=2afeac43e2a5c0d484100eedd192842d27944558, but as indicated, we don't need to alter hdlist compression, especially as it could seriously slow down metadata generation.
For reference, a size comparison:
-rw-r--r-- 1 root 267M Gen   7 01:34 Mgz/hdlist.cz
-rw-r--r-- 1 root 197M Gen   7 01:32 Mxz/hdlist.cz
-rw-r--r-- 1 root 238M Gen   7 01:39 Mzstd/hdlist.cz
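
To put the three sizes in proportion, the short calculation below expresses each one as a saving relative to the gzip hdlist. The sizes are taken from the listing above as printed (in units of M); the compression levels used for each run are not stated in this report, so the percentages only describe these particular files. `savings_vs` is a name introduced here for illustration.

```python
# Sizes of hdlist.cz as listed above, in M, keyed by compressor.
SIZES_M = {"gzip": 267, "xz": 197, "zstd": 238}


def savings_vs(baseline: str, sizes: dict) -> dict:
    """Return the percentage saved by each compressor relative to the baseline."""
    base = sizes[baseline]
    return {name: 100 * (base - size) / base for name, size in sizes.items()}


for name, saved in savings_vs("gzip", SIZES_M).items():
    print(f"{name:5s} {SIZES_M[name]:4d}M  {saved:5.1f}% smaller than gzip")
```

For these files, xz comes out at roughly 26% smaller than gzip and zstd at roughly 11% smaller.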
Comment 12 Thierry Vignaud 2020-01-07 01:57:57 CET
(In reply to Thomas Backlund from comment #10)
> ok, so to sum up
> 
> hdlist is still .cz, still compressed with gzip
> xml files are still .lzma, still compressed with xz
> synthesis is also still .cz, now compressed with zstd (for mga8 that is)
> 
> that way we keep potential breakage to a minimum / non-existent ...

Yes, the XML files being lzma compressed (really xz) is hardcoded in urpm/xml_info.pm
Thomas Backlund 2020-01-07 21:49:21 CET

Keywords: feedback => advisory, validated_update

Comment 13 Mageia Robot 2020-01-07 22:21:12 CET
An update for this issue has been pushed to the Mageia Updates repository.

https://advisories.mageia.org/MGAA-2020-0010.html

Status: NEW => RESOLVED
Resolution: (none) => FIXED

