Bug 2157 - Switch to "standard" rpm metadata for package repositories
Summary: Switch to "standard" rpm metadata for package repositories
Status: RESOLVED WONTFIX
Alias: None
Product: Mageia
Classification: Unclassified
Component: Release (media or process)
Version: Cauldron
Hardware: All Linux
Priority: Normal enhancement
Target Milestone: Mageia 3
Assignee: Mageia Bug Squad
QA Contact:
URL: https://mageia.org/wiki/doku.php?id=i...
Whiteboard:
Keywords:
Depends on: 4799
Blocks: 5140
Reported: 2011-07-15 21:00 CEST by Christiaan Welvaart
Modified: 2015-08-10 16:25 CEST

See Also:
Source RPM: rpmtools
CVE:
Status comment:



Description Christiaan Welvaart 2011-07-15 21:00:00 CEST
Mageia 2 spec #034
also see https://mageia.org/wiki/doku.php?id=iso2:technical_specification

Mageia repositories currently have a media_info dir which contains a   
line-oriented 'synthesis' file, plus some XML files with additional
information like package descriptions, and a hdlist which AFAIK is not 
used by urpmi.

Fedora and openSUSE repositories do not use this format for metadata but
instead have a repodata dir containing a 'primary' and other XML files,
which are all listed in a 'repomd.xml' file.
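
As a rough illustration of the repomd layout described above, the following Python sketch reads a 'repomd.xml' index and lists the metadata files it points to. The sample document is a trimmed, made-up example (real files also carry checksums, timestamps and sizes), and the function name list_metadata_files is hypothetical, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml files generated by createrepo.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Trimmed, illustrative sample; not taken from a real repository.
SAMPLE_REPOMD = """\
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="filelists">
    <location href="repodata/filelists.xml.gz"/>
  </data>
  <data type="other">
    <location href="repodata/other.xml.gz"/>
  </data>
</repomd>
"""


def list_metadata_files(repomd_xml):
    """Map each <data type="..."> entry to the href of its <location>."""
    root = ET.fromstring(repomd_xml)
    files = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        if location is not None:
            files[data.get("type")] = location.get("href")
    return files


if __name__ == "__main__":
    for kind, href in list_metadata_files(SAMPLE_REPOMD).items():
        print(kind, href)
```

A client would fetch 'repomd.xml' first and then download only the files it needs, typically 'primary' for dependency resolution.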

The two schemes are similar, which is not very surprising because the
requirements are the same. Most package managers with rpm support can use
the repomd style but urpmi can't. Only a few package managers support
synthesis style metadata. Switching to this 'standard' repository metadata
would give people more choice: use other package managers (e.g. yum or zypper)
in mageia and use urpmi on other distros. In the long term this should help
make urpmi easier to maintain: the "standard" metadata is a bit easier to extend,
and urpmi would become more of a standard tool, so its behavior can easily be
compared to other package managers and test suites may be shared.

Some not very useful index size numbers for mga cauldron x86_64 core/release:
1.7M    synthesis.hdlist.cz
1M      info.xml.lzma
5.6M    changelog.xml.lzma
8.7M    files.xml.lzma

4.4M    primary.xml.gz
3.4M    other.xml.gz
11M     filelists.xml.gz (8.8M when compressed with xz)
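
To put the size figures above side by side, here is a small Python sketch that totals each set and compares the two files needed for a plain package update. The dictionaries simply restate the numbers listed above (in "M" as reported); the variable and function names are made up for this example.

```python
# Sizes in "M" as reported above for mga cauldron x86_64 core/release.
SYNTHESIS_STYLE_M = {
    "synthesis.hdlist.cz": 1.7,
    "info.xml.lzma": 1.0,
    "changelog.xml.lzma": 5.6,
    "files.xml.lzma": 8.7,
}
REPOMD_STYLE_M = {
    "primary.xml.gz": 4.4,
    "other.xml.gz": 3.4,
    "filelists.xml.gz": 11.0,
}


def total_m(sizes):
    """Sum all index sizes of one metadata style, rounded to one decimal."""
    return round(sum(sizes.values()), 1)


def ratio(numerator, denominator):
    """Size ratio rounded to two decimals."""
    return round(numerator / denominator, 2)


if __name__ == "__main__":
    print("synthesis-style total:", total_m(SYNTHESIS_STYLE_M))  # 17.0
    print("repomd-style total:", total_m(REPOMD_STYLE_M))        # 18.8
    print(
        "primary vs synthesis:",
        ratio(REPOMD_STYLE_M["primary.xml.gz"],
              SYNTHESIS_STYLE_M["synthesis.hdlist.cz"]),
    )  # 2.59
```

The totals are close, but the file needed for everyday dependency resolution is about 2.6 times larger in the repomd set. Note that the two sets use different compressors, so the comparison is only indicative.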

Goals:
- no negative impact for people who use the default package tools
- basic support for other package managers (yum, zypper, apt) in mageia 2
  (better support where packagekit uses the package manager the user has 
  chosen is not part of this spec but could be a follow-up feature)

the plan is:
- modify urpmi to only support repomd metadata
- add repomd metadata to the cauldron repository while keeping synthesis
- upload the new urpmi
- after either the mga2 or mga3 release:
  drop synthesis/hdlist metadata from cauldron

so there will be 1 or 2 stable releases that carry both types of
repository metadata. The build infrastructure needs to support this of
course. This adds some complexity and uses extra space on the mirrors. 

Things that need to be changed:
- urpmi perl code
- maybe a fast xml reader in C, like yum has ?
- rpmdrake/installer ?
- build system

Open questions:
- Is there anything I missed, unique urpmi features that will be broken by
  such a change, other expected problems?
- A volunteer is needed for writing the needed perl code for perl-URPM etc., 
  otherwise the change won't happen.
Manuel Hiebel 2011-07-15 21:06:33 CEST

Blocks: (none) => 1994

Comment 1 Marja Van Waes 2011-10-28 07:39:18 CEST
a comment here https://mageia.org/wiki/doku.php?do=show&id=iso2%3Atechnical_specification :

sounds interesting but needs to have 2 kinds of metadata in parallel while it's not all integrated - discussions needed

URL: (none) => https://mageia.org/wiki/doku.php?id=iso2:technical_specification
CC: (none) => marja11

andré blais 2011-10-28 11:59:29 CEST

CC: (none) => andre999mga

Comment 2 Michael Scherer 2011-11-01 01:05:42 CET
I think this would be helpful, yes. 

I only have 2 things to add:
1) is the format of repomd specified? I remember seeing developers of packagekit and zypper being unhappy with a change of format.

2) can we use createrepo? Is this as fast as our current tools? The slow pace of updates and index generation on Fedora is IMHO problematic (slow pace being that we have to wait 2 days to get updates-testing on mirrors), but I do not know if the problem is their tools or something else.

CC: (none) => misc

Comment 3 Marja Van Waes 2011-12-27 14:51:15 CET
@ cjw

can you answer misc's questions?
Comment 4 Christiaan Welvaart 2011-12-30 00:34:21 CET
AFAICT the format is standard but it can be extended.

I don't know enough about all the features that urpmi and GUI package tools provide in mageia. Createrepo provides at least the basic info required for package installation and upgrades. WRT speed we should probably be more concerned about the time it takes to read repomd data (in urpmi/perl-URPM).
Comment 5 Jeff Johnson 2012-01-07 18:38:10 CET
tracked at https://bugs.launchpad.net/rpm/+bug/913192

CC: (none) => n3npq

Comment 6 Manuel Hiebel 2012-02-03 18:53:34 CET
No news

Target Milestone: --- => Mageia 3
Summary: Switch to "standard" rpm metadata for package repositories in Mageia 2 => Switch to "standard" rpm metadata for package repositories
Source RPM: (none) => rpm
Severity: normal => enhancement

Comment 7 Thierry Vignaud 2012-02-22 18:38:26 CET
I'm against such a move.
yum metadata are bigger and slower to parse.

For updating packages, compare:
1.7M    synthesis.hdlist.cz
4.4M    primary.xml.gz

For retrieving metadata from rpmdrake, compare:
8.7M    files.xml.lzma
11M     filelists.xml.gz

And yum really is quite a lot slower than urpmi and I suspect the metadata format it uses doesn't help...

As for the comment about recompressing with XZ, that wouldn't change the ratio since synthesis would also gain from changing compressor (we already support gz, bzip2, xz, ...)

Last but not least, it would have big impacts on perl-URPM, urpmi, rpmdrake and drakx installer.

CC: (none) => thierry.vignaud

Comment 8 Pascal Terjan 2012-03-05 11:19:57 CET
Also did someone check generation time?

openSUSE does not use it on Factory but only on the different smaller repositories and stable releases because "createrepo on factory takes hours and we prefer to push out ftp trees quicker".

CC: (none) => pterjan

Pascal Terjan 2012-03-05 11:29:56 CET

Depends on: (none) => 4799

Comment 9 Thierry Vignaud 2012-03-06 11:19:26 CET
I think we should close this one as WONTFIX

Source RPM: rpm => rpmtools

Comment 10 Marja Van Waes 2012-03-07 08:17:57 CET
(In reply to comment #9)
> I think we should close this one as WONTFIX

@ misc
@ cjw

If there are no objections, I intend to close this report as WONTFIX two weeks from now.

Whoever wants to object: please remove the NEEDINFO keyword and give your motivation

Keywords: (none) => NEEDINFO

Comment 11 Manuel Hiebel 2012-03-20 14:19:43 CET
keep for future spec

Keywords: NEEDINFO => (none)
Blocks: 1994 => (none)

Manuel Hiebel 2012-03-27 23:02:31 CEST

Blocks: (none) => 5140

Comment 12 Marja Van Waes 2012-05-26 13:06:37 CEST
Hi,

This bug was filed against cauldron, but we do not have cauldron at the moment.

Please report whether this bug is still valid for Mageia 2.

Thanks :)

Cheers,
marja

Keywords: (none) => NEEDINFO

Comment 13 Marja Van Waes 2012-06-11 20:11:15 CEST
(In reply to comment #11)
> keep for future spec

OK :)

Keywords: NEEDINFO => (none)

Comment 14 Nicolas Vigier 2012-06-11 20:17:27 CEST
As there is no plan to change this soon, I think it should be closed. It can still be reopened later if things change.

CC: (none) => boklm

Comment 15 Marja Van Waes 2012-06-11 20:40:50 CEST
(In reply to comment #14)
> As there is no plan to change this soon, I think it should be closed. It can
> still be reopened later if things change.

@ Leuhmanu

Sorry, tv and boklm are for closing: you're in the minority

Closing as WONTFIX for now

Status: NEW => RESOLVED
Resolution: (none) => WONTFIX

Comment 16 Anderson Carvalho 2012-08-16 18:44:20 CEST
I would like to request the reopening of this bug! 

Switch to "standard" rpm metadata for package repositories.

This would allow us to use YUM and APT-RPM, which are in the Mageia repositories, in addition to improving support for other package managers.

Standardization is important and is a current trend.

CC: (none) => frateraec

Comment 17 Pascal Terjan 2012-08-16 18:49:00 CEST
Unless someone works on generating those metadata at a decent speed, this is not going to happen.

Currently uploading a package takes a few minutes, it would take a few hours.
Comment 18 Anderson Carvalho 2012-08-16 18:56:59 CEST
Please explain how the other rpm distros do it, and how Mageia justifies that it is impossible.
Comment 19 Pascal Terjan 2012-08-16 19:01:55 CEST
As I quoted, openSUSE does not do it on their main development repository because it takes several hours.

We may be able to do like them and only do it for stable releases, but then it would not be consistent with the development version and would not be properly tested.

I don't know how Fedora does it.
Comment 20 Anderson Carvalho 2012-08-16 19:08:15 CEST
Could it be done for the Mageia 3 or maybe the Mageia 4 stable release?
Comment 21 Pascal Terjan 2012-08-16 19:53:15 CEST
Reading from an old RH advisory from 2008 there are some options that may help:

=====
As well, this updated package adds a new option flag, "--skip-stat". When
used in conjunction with "--update", createrepo skips calling stat(1) on
files to see if they have changed, and assumes that if the file name
matches, then the repodata can be re-used.

This option has shown a significant increase in performance in the Fedora
build system, cutting down the time it takes createrepo to run, from 40
minutes, to approximately 4.
=====

4 minutes is still quite high but more reasonable.

When uploading a package we have to run it 4 times (i586, x86_64, debug i586, debug x86_64) and currently the total of 4 runs of genhdlist2 takes about 3 minutes (and I had started working on making it faster).

Let's test the performance with those options.
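
For anyone wanting to repeat this kind of measurement, a minimal Python sketch of a timing harness follows. It only uses the createrepo flags quoted in the advisory above ("--update" and "--skip-stat"); the helper names and the repository path in the comment are hypothetical, and the demo at the bottom times a no-op Python process so that it runs without createrepo installed.

```python
import subprocess
import sys
import time


def createrepo_update_argv(repo_dir, skip_stat=True):
    """Build the argument list for an incremental createrepo run."""
    argv = ["createrepo", "--update"]
    if skip_stat:
        argv.append("--skip-stat")
    argv.append(repo_dir)
    return argv


def time_command(argv):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical path; a real run would be something like:
    #   time_command(createrepo_update_argv("/path/to/core/release"))
    print(createrepo_update_argv("/path/to/core/release"))
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"no-op command took {elapsed:.3f}s")
```

This measures wall-clock time only, which corresponds to the "real" line of the shell's time output.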

On my home machine on core/release i586 (20115 packages):

First generation with genhdlist2 --clean

real	1m52.755s
user	1m36.790s
sys	0m1.950s

Update with genhdlist2 (and no new package)

real	1m0.889s
user	1m36.080s
sys	0m1.540s

First generation with createrepo

real	10m49.949s
user	6m15.240s
sys	0m12.380s

Update with createrepo --update --skip-stat (and no new package)

real	0m56.652s
user	0m49.190s
sys	0m3.470s


So creating from scratch (which does not happen so often, mostly when we have bugs causing signature problems) takes about 5 times longer: for the 4 repositories it would be 44 minutes instead of 7 minutes 30. Updating incrementally (which happens at each package upload), on the other hand, is slightly faster.
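
The arithmetic behind this summary can be checked with a short Python sketch that uses only the "real" times measured above. The helper name to_seconds and the dictionary layout are made up for this example.

```python
# "real" wall-clock times from the runs above (core/release i586).
REAL_TIMES = {
    "genhdlist2 --clean": "1m52.755s",
    "genhdlist2 (update)": "1m0.889s",
    "createrepo (first run)": "10m49.949s",
    "createrepo --update --skip-stat": "0m56.652s",
}
REPOSITORIES = 4  # i586, x86_64, debug i586, debug x86_64


def to_seconds(stamp):
    """Convert a shell 'time' value such as '10m49.949s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


if __name__ == "__main__":
    full_old = to_seconds(REAL_TIMES["genhdlist2 --clean"])
    full_new = to_seconds(REAL_TIMES["createrepo (first run)"])
    upd_old = to_seconds(REAL_TIMES["genhdlist2 (update)"])
    upd_new = to_seconds(REAL_TIMES["createrepo --update --skip-stat"])

    print(f"full generation slowdown: {full_new / full_old:.1f}x")
    print(f"4 repos, genhdlist2: {REPOSITORIES * full_old / 60:.1f} min")
    print(f"4 repos, createrepo: {REPOSITORIES * full_new / 60:.1f} min")
    print(f"incremental update saves: {upd_old - upd_new:.1f}s per repository")
```

With these inputs the full generation comes out at roughly 5.8 times slower (about 43 minutes against 7.5 minutes for the four repositories), in line with the rounded figures quoted above, while the incremental update is about 4 seconds faster per repository.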

Status: RESOLVED => REOPENED
Resolution: WONTFIX => (none)

Comment 22 Pascal Terjan 2012-08-16 20:01:58 CEST
Actually createrepo --update without --skip-stat is fast too :)

real	0m55.352s
user	0m49.310s
sys	0m3.330s
Comment 23 AL13N 2013-04-10 12:39:20 CEST
isn't this fixed already?

CC: (none) => alien

Comment 24 Pascal Terjan 2013-04-10 12:43:12 CEST
As far as I know, nothing has been done.
Comment 25 AL13N 2013-04-10 12:53:20 CEST
i thought we only used the xml and synthesis and not the hdlists themselves?
Comment 26 AL13N 2013-04-10 12:53:38 CEST
should we move this feature to mga4 then?
Comment 27 Anderson Carvalho 2013-04-10 12:54:56 CEST
The biggest problem is not technical. The greatest difficulty is to awaken interest in those who can do it...
Comment 28 Nicolas Vigier 2013-04-10 12:57:58 CEST
(In reply to AL13N from comment #25)
> i thought we only used the xml and synthesis and not the hdlists themselves?

Which is not the same as repomd.

(In reply to AL13N from comment #26)
> should we move this feature to mga4 then?

It is not clear whether we want this, and whether there is someone who wants to implement this.
Comment 29 Thierry Vignaud 2013-04-10 18:06:13 CEST
There's no such thing as standard metadata for rpm.
IMHO "standard rpm metadata" is a lie as it's not for rpm at all but for the higher level package managers (yum, urpmi, ...), each one having its own format.
If I were to change format, I would consider switching to libsolv format instead (and to use libsolv as resolver), not yum format.
Comment 30 AL13N 2013-04-10 19:41:19 CEST
ok, so better set to WONTFIX then until someone comes along and actually wants to do the work of changing format IF it's actually needed...
Comment 31 Thierry Vignaud 2013-04-10 21:33:54 CEST
Closing. Again.

Status: REOPENED => RESOLVED
Resolution: (none) => WONTFIX

Comment 32 AL13N 2013-04-10 22:11:55 CEST
rereading this, i think the stats speak for themselves. time to let the beast rest... for good.
Comment 33 Anderson Carvalho 2013-04-10 22:39:28 CEST
Could you explain exactly what you mean?
Comment 34 AL13N 2013-04-11 13:26:17 CEST
it means that repomd is slower (x5) and bigger than what we have now...

it also means that changing the format has effects on a lot of parts of the distro...

it also means that the primary developer would rather want libsolv than repomd, because of his preferences.

iow: not gonna happen.
Comment 35 Anderson Carvalho 2013-04-11 13:41:57 CEST
The main problem is really urpmi, and this rpm standardization would be useful for using other package managers. URPMI, in trying to be friendly, often only creates problems. E.g. it only shows the latest version of a package in the repository, it does not downgrade transparently, and if you have community repositories installed, URPMI hides official packages, etc. Fortunately we can now use SMART on Mageia 3. I have followed the distro since Mandrake 7 and urpmi has always been limited in some respects.
Comment 36 AL13N 2013-04-11 13:53:57 CEST
perhaps it can be improved instead? (in another bug report)

iinm --oldpackage was ported to urpmi, i think i saw a change about this...

also the latest version, that's just rpmdrake filtering, one could make a patch to show all if one wanted it bad enough...

listen, we can't use repomd, because the whole buildsystem would slow down 5x more..., we could maybe only do repomd for releases, but then it would have bad testing and such...
Comment 37 Anderson Carvalho 2013-04-11 15:41:40 CEST
I understand. That is why I raised the question of urpmi. It is easier to improve urpmi than to modify the standard. I had already requested downgrade support for urpmi and it has already been partially implemented. The next step is to report more enhancement bugs for urpmi.

https://bugs.mageia.org/show_bug.cgi?id=6655
Nicolas Vigier 2014-05-08 18:04:32 CEST

CC: boklm => (none)

Neal Gompa 2015-08-10 16:25:58 CEST

CC: (none) => ngompa13

