Bug 14587 - Urpmi-proxy should serve .noarch files from both arch cache trees to any architecture
Status: NEW
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal enhancement
Target Milestone: ---
Assignee: AL13N
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-17 19:25 CET by Morgan Leijström
Modified: 2017-12-02 11:50 CET
CC List: 2 users

See Also:
Source RPM: urpmi-proxy-0.4.0-4.mga5.src.rpm
CVE:
Status comment:


Attachments

Description Morgan Leijström 2014-11-17 19:25:48 CET
Problem: Several gigabytes of unnecessary duplicates are fetched and cached.

Scenario: the clients are a mix of i586 and x86_64.
urpmi-proxy then fetches and stores .noarch files separately for each arch.

So far it is OK

Now, if it could, for a client of any arch, *look for* already downloaded .noarch files in both the i586 and x86_64 urpmi-proxy cache trees and serve them from there, then bandwidth, time, and storage could be saved :)



__Sidenotes:
.noarch files for i586 and x86_64 are identical.

I learned from the [discuss] mailing list that the mirrors actually hard link the .noarch files to save bandwidth and storage there, and that it is recommended for an rsync-based cache to do the same.

Using hard links is, I guess, probably not a good solution for urpmi-proxy, but instead of looking in both places it could have the file/link in both places: it could make a hard-linked copy of any fetched .noarch file in the repo tree for the other arch. I guess it should then just create that path (the folder tree for the hard-linked copy) if it does not exist. This would use very little storage for the dirs and links, and whenever a first client of the other arch uses urpmi-proxy, the content is already there for it.

__Manual workaround:
For now I use a very simple script to hard link all *.noarch files: first I update all computers of one arch, then run the script, then update the clients of the other arch.
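
The reporter's script is not attached, so the following is only a minimal sketch of what such a workaround might look like. It assumes a cache layout in which the arch appears as its own path component (for example `.../x86_64/media/core/release/`), that noarch packages end in `.noarch.rpm`, and that both arch trees live on the same filesystem (hard links cannot cross filesystems). The names `ARCHES`, `other_arch_path` and `link_noarch` are hypothetical and not part of urpmi-proxy.

```python
import os
from pathlib import Path
from typing import Optional

# Assumption: only these two arch trees exist in the cache.
ARCHES = ("i586", "x86_64")


def other_arch_path(path: Path, root: Path) -> Optional[Path]:
    """Return the same relative path under the other arch tree, or None
    if no path component below root names a known arch."""
    parts = list(path.relative_to(root).parts)
    for i, part in enumerate(parts):
        if part in ARCHES:
            parts[i] = ARCHES[1 - ARCHES.index(part)]
            return root.joinpath(*parts)
    return None


def link_noarch(root: Path) -> int:
    """Hard link every *.noarch.rpm under root into the other arch tree.

    Missing directories are created; existing targets are left alone.
    Returns the number of links created.
    """
    linked = 0
    # Collect the file list first so new links are not visited mid-walk.
    for src in list(root.rglob("*.noarch.rpm")):
        dst = other_arch_path(src, root)
        if dst is None or dst.exists():
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        os.link(src, dst)  # same inode, so almost no extra storage is used
        linked += 1
    return linked
```

Calling `link_noarch(Path(cache_dir))` between the two update rounds would correspond to the manual step described above, where `cache_dir` stands for wherever the proxy cache is stored.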

Comment 1 Morgan Leijström 2014-11-17 20:08:20 CET
Related:
Bug 14588 - Urpmi-proxy could clean out old file versions to save cache size

Assignee: bugsquad => alien

Comment 2 AL13N 2014-11-17 20:28:25 CET
hmm, not easy to do something that's flexible and doesn't bite you in the ass later on... and isn't urpmi-specific...

i could maybe try to replace i586 and x86_64 in full path names with each other to see if there's such a file... but, won't it just delay all downloads?

perhaps it's better to have a filesystem where you can use dedup to spare your size? http transport doesn't actually have the hardlink info...
Comment 3 Morgan Leijström 2014-11-18 06:00:40 CET
OK, have the default setting not do this trick, so that by default it is not urpmi-specific :)

Lacking the hardlink info, we have to go by naming. That is: whenever a *.noarch* file is requested, check whether it exists under the given path, and also under the alternate location where i586 in the given path is replaced by x86_64, or vice versa.

The delay from checking two dirs instead of one in those cases is marginal compared to downloading the files.
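
A rough sketch of the name-based lookup proposed here, not urpmi-proxy's actual code: it assumes the requested path is relative to the cache root, that the arch is a separate path component, and that noarch packages carry `.noarch.` in the file name. `ARCH_SWAP`, `swap_arch` and `find_cached` are hypothetical names. The extra cost for a noarch request is one additional file check, and only when the direct path misses.

```python
from pathlib import Path
from typing import Optional

# Assumption: the only two arch trees are i586 and x86_64.
ARCH_SWAP = {"i586": "x86_64", "x86_64": "i586"}


def swap_arch(rel_path: str) -> Optional[str]:
    """Replace the first arch path component with the other arch."""
    parts = rel_path.split("/")
    for i, part in enumerate(parts):
        if part in ARCH_SWAP:
            parts[i] = ARCH_SWAP[part]
            return "/".join(parts)
    return None


def find_cached(cache_root: Path, rel_path: str) -> Optional[Path]:
    """Return a cached file for rel_path, or None if it must be downloaded.

    Only noarch files fall back to the other arch tree; arch-specific
    packages are never served across architectures.
    """
    direct = cache_root / rel_path
    if direct.is_file():
        return direct
    file_name = rel_path.rsplit("/", 1)[-1]
    if ".noarch." in file_name:
        alt = swap_arch(rel_path)
        if alt is not None and (cache_root / alt).is_file():
            return cache_root / alt
    return None
```

Making the fallback conditional on a configuration switch that is off by default would match the suggestion above of keeping the default behaviour unchanged.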

I have never tried it, but I guess deduplication uses much more CPU and possibly RAM: https://btrfs.wiki.kernel.org/index.php/Deduplication
Miles Reystor 2016-10-05 15:19:24 CEST

CC: (none) => writing.my.life4ever

Marja Van Waes 2016-10-20 18:08:19 CEST

CC: (none) => marja11

Marja Van Waes 2017-12-02 11:50:22 CET

CC: writing.my.life4ever => mageiatools

