Bug 19571 - overflow when querying for size with urpmq
Summary: overflow when querying for size with urpmq
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages (show other bugs)
Version: Cauldron
Hardware: All Linux
Priority: Normal normal
Target Milestone: ---
Assignee: Thierry Vignaud
QA Contact:
URL:
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2016-10-12 14:42 CEST by Giuseppe Ghibò
Modified: 2016-10-19 14:12 CEST (History)
0 users

See Also:
Source RPM: perl-URPM-5.07-2.mga6.src.rpm
CVE:
Status comment:


Attachments
patch using size_t to avoid overflows in RPM querying size (1.37 KB, patch)
2016-10-12 14:43 CEST, Giuseppe Ghibò
Details | Diff
patch to avoid overflow when querying rpm size using unsigned int (916 bytes, patch)
2016-10-12 14:44 CEST, Giuseppe Ghibò
Details | Diff
fix reporting size on big packages (mga#19571) (1.35 KB, patch)
2016-10-13 13:09 CEST, Thierry Vignaud
Details | Diff
complete the patch to get the correct size (2.28 KB, patch)
2016-10-13 23:45 CEST, Giuseppe Ghibò
Details | Diff
attempt for getting the right package size (3.44 KB, patch)
2016-10-13 23:48 CEST, Giuseppe Ghibò
Details | Diff
attempt for getting the right package size (2) (3.67 KB, patch)
2016-10-13 23:51 CEST, Giuseppe Ghibò
Details | Diff
attempt for getting the full largefile support (4.47 KB, patch)
2016-10-13 23:57 CEST, Giuseppe Ghibò
Details | Diff
add support for querying >4gb package size (1.63 KB, patch)
2016-10-15 05:30 CEST, Thierry Vignaud
Details | Diff
64bit support for ->filesize() (1.06 KB, patch)
2016-10-15 05:31 CEST, Thierry Vignaud
Details | Diff
script for testing size (956 bytes, text/plain)
2016-10-15 22:17 CEST, Giuseppe Ghibò
Details
patch to show the right size up to 4GB (1.08 KB, patch)
2016-10-15 22:22 CEST, Giuseppe Ghibò
Details | Diff
patch for getting right size for files >= 4GB (3.13 KB, patch)
2016-10-15 22:33 CEST, Giuseppe Ghibò
Details | Diff
extra patch for the full 64bit support (1.77 KB, patch)
2016-10-15 22:43 CEST, Giuseppe Ghibò
Details | Diff
patch for the full largefile 64bit support using Math::Int64 (12.53 KB, patch)
2016-10-16 15:36 CEST, Giuseppe Ghibò
Details | Diff
patch for perl-URPM.spec (2.03 KB, text/x-rpm-spec)
2016-10-16 15:37 CEST, Giuseppe Ghibò
Details
patch for perl-URPM.spec (752 bytes, patch)
2016-10-16 15:38 CEST, Giuseppe Ghibò
Details | Diff
add support for int64, using Math::Int64 C API (9.02 KB, patch)
2016-10-18 03:48 CEST, Thierry Vignaud
Details | Diff
test (676 bytes, patch)
2016-10-18 03:48 CEST, Thierry Vignaud
Details | Diff
switch to uint64_t for sizes (2.68 KB, patch)
2016-10-18 03:48 CEST, Thierry Vignaud
Details | Diff
(get_int2) try new 64bit tag else old 32bit tag (1.81 KB, patch)
2016-10-18 03:48 CEST, Thierry Vignaud
Details | Diff

Description Giuseppe Ghibò 2016-10-12 14:42:49 CEST
NyB (and I) spotted an overflow when querying big package for size, for instance using this command:

urpmq -i kernel-desktop-4.8.1-1.mga6-debuginfo|grep Size

will give a negative result, e.g.

Size        : -695886512                   Architecture: x86_64

which is probably because it exceed the 2^31-1, which is the MAX_INT value for a signed 32bit integer.

Looking at the code we tracked down the function get_int which is used, and seems defined into perl-URPM.

We also find that RPM also allows query of long file using RPMTAG_LONGSIZE.

We tried to write two patches, either using the unsigned integer, instead of integer, as well as using the size_t (which should be printed insize sprintf()  using %zu) with the RPMTAG_LONGSIZE, but with the rebuilt patched perl-URPM, the same overflow still occurs, so probably there is still some code to fix, but that's a starting point.
Comment 1 Giuseppe Ghibò 2016-10-12 14:43:59 CEST
Created attachment 8522 [details]
patch using size_t to avoid overflows in RPM querying size
Comment 2 Giuseppe Ghibò 2016-10-12 14:44:49 CEST
Created attachment 8523 [details]
patch to avoid overflow when querying rpm size using unsigned int
Giuseppe Ghibò 2016-10-12 19:46:05 CEST

Assignee: bugsquad => thierry.vignaud

Thierry Vignaud 2016-10-13 02:56:02 CEST

Keywords: (none) => PATCH
Status: NEW => ASSIGNED

Comment 3 Thierry Vignaud 2016-10-13 13:09:06 CEST
Created attachment 8525 [details]
fix reporting size on big packages (mga#19571)

Those patch obviously haven't been tested as they don't work.
Also they introduced bad issues as size_t hasn't the same size on 32 & 64bit architectures.

The attached one does work.
But I'll do more testing before pushing it

Attachment 8522 is obsolete: 0 => 1
Attachment 8523 is obsolete: 0 => 1

Comment 4 Thierry Vignaud 2016-10-13 13:13:46 CEST
ah sorry I didn't see you said they were not working but were hints.
Comment 5 Mageia Robot 2016-10-13 17:59:28 CEST
commit 50b2b1b2be0ecb0bcdce376bb586f7f9db5b6592
Author: Thierry Vignaud <thierry.vignaud@...>
Date:   Thu Oct 13 13:05:32 2016 +0200

    fix reporting size on big packages (mga#19571)
---
 Commit Link:
   http://gitweb.mageia.org/software/rpm/perl-URPM/commit/?id=50b2b1b2be0ecb0bcdce376bb586f7f9db5b6592
Comment 6 Thierry Vignaud 2016-10-13 18:00:14 CEST
Fixed in git

Status: ASSIGNED => RESOLVED
Resolution: (none) => FIXED

Comment 7 Giuseppe Ghibò 2016-10-13 23:42:40 CEST
Well, I did the assist as "playmaker" and you scored as "forward player" :-)

But wait to close this bug, sound we caught the "goal post". The patch
you suggests is good but probably still incomplete. In fact with that patch and
the following test:

urpmq -i kernel-desktop-4.8.1-1.mga6-debuginfo

you still get a negative value for Size, which is not what expected. I
tried to fix with the following attach
(URPM-overflow-attempt3-works.patch), and in this case it works. But
IMHO is still incomplete too, as things seems much more complicate
than this. The patch I suggests work only in the interval 2GB <-> 4GB,
in fact with

   urpmq -i kernel-desktop-4.8.1-1.mga6-debuginfo

results:

Size        : 3599080784 

but IMHO still may fail with size bigger than 4GB. What even I'm not
sure is that if our rpm ever supports largefile, i.e. RPMs bigger than
4GB (either as total filesize or as compressed payload size) or what
should be the real beaviour with that kind of RPM. Here it says largefile
it's supported http://www.rpm.org/wiki/DevelDocs/LargeFiles, but I've a
doubt, as when querying for RPMTAG_LONGSIZE (or RPMTAG_LONGSIGSIZE) I
get some weird behaviour, i.e. a size of 0 bytes when the size is <
4GB, and the right size when the size is >= 4GB. So what should be
used? RPMTAG_SIZE or RPMTAG_LONGSIZE? or max(RPMTAG_SIZE, RPMTAG_LONGSIZE)?

Regarding using uint64_t instead of size_t you are right as size_t is
32bit on 32bit platforms and 64bit on 64bit platforms. However
according to the document about largefile "x.open.20Mar96.html" which
was the first standardization for LARGEFILE_SUPPORT it's adviced to
use off64_t if you want to be sure to get the same 64bit result on
both 32 and 64bit platform. Anyway nevermind as this are just
academics, and in the end "uint64_t" or "off64_t" would produce the
same "unsigned long long" 64bit integer on both platforms, which is
what is needed for having enough magnitude for storing size bigger
than 4GB.

What I tried, is to use "unsigned long" which on 64bit platform is
be a 64bit number, so not having problems for files bigger than
4GB. Of course it wouldn't work on 32bit, but once fixed on 64bit
architecture then you could extend to use off64_t (or
uint64_t) instead of "unsigned long". If it fails on 64bit
architecture too, it could be even more difficult to fix for 32bit arch.
Using uint64_t as RETVAL, causes some error, because it seems the perl
typemap conversion doesn't support "uint64_t" or "unsigned long long"
but just the basic typemap defined in /usr/lib/perl5/5.22.2/ExtUtils/typemap. 
However tt exists an extra RPM package called perl-Math-Int64 which
could be used for the purpose. I tried to borrow an extra typemap from
there, but then I don't know to modify properly the .pm and .xs to
glue/include the code (also the rpm package perl-Math-Int64 misses the
needed C includes which should be copied manually from the original
perl-Math-Inte64 tarball).

Anyway, actually I get the following behaviour using these tests:

#!/bin/sh
echo "remote:"
urpmq -i cross-gcc-debuginfo | grep -E 'Name|Size'
urpmq -i kernel-desktop-4.8.1-1.mga6-debuginfo | grep -E 'Name|Size'
urpmq -i libreoffice-debuginfo | grep -E 'Name|Size'

echo
echo "local:"
urpmq -i ./cross-gcc-debuginfo-6.0.0-0.2.mga6.x86_64.rpm | grep -E 'Name|Size' 
urpmq -i ./libreoffice-debuginfo-5.2.2.2-3.mga6.x86_64.rpm | grep -E 'Name|Size'

1) using patch URPM-overflow-attempt3-works.patch

remote:
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64
Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64
Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64

local:
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64
Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64

As you can see the size of cross-gcc-debuginfo is reported to be of
size 0 for both querying the package remote and locally, while indeed
the right size is 8151434666. The size instead of libreoffice-debuginfo and
kernel-desktop-debuginfo is right for both local and remote.

2) using patch URPM-overflow-attempt4-unsignedlong-rpmlongsize-doesntwork.patch

Here we get a different behaviour from remote package and local
package, and only the local query for cross-gcc-debuginfo works, while
fails for libreoffice-debuginfo.

Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64
Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 18446744073013665104         Architecture: x86_64
Name        : libreoffice-debuginfo
Size        : 18446744071862776608         Architecture: x86_64

local:
Name        : cross-gcc-debuginfo
Size        : 8151434666                   Architecture: x86_64
Name        : libreoffice-debuginfo
Size        : 0                            Architecture: x86_64


3) use patch URPM-overflow-attempt4-unsignedlong-rpmsize-doesntwork.patch
remote:

Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64
Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 18446744073013665104         Architecture: x86_64
Name        : libreoffice-debuginfo
Size        : 18446744071862776608         Architecture: x86_64

local:
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64
Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64

here it gives weird numbers, while it's right only for local
libreoffice-debuginfo. Indeed the 18446744073013665104 shown is not
totally random, but corresponds to 2^64 - 18446744073013665104 =
695886512, which is the initial Size, without sign, shown in the initial
(unpatched) attempt, so probably there is still casting problems
somewhere.

What is also not clear is whether once modified the perl-URPM the
urpmi need to be recompiled accordingly too against latest perl-URPM package?

Status: RESOLVED => REOPENED
Resolution: FIXED => (none)

Comment 8 Giuseppe Ghibò 2016-10-13 23:45:05 CEST
Created attachment 8528 [details]
complete the patch to get the correct size

This patch complete yours, and works in the interval 2-4GB, fails beyond 4GB, but doesn't alter the structure size so probably is safe.
Comment 9 Giuseppe Ghibò 2016-10-13 23:48:09 CEST
Created attachment 8529 [details]
attempt for getting the right package size

This patch alters the size of the structure s_Package to make room for 64bit size numbers. This patch works only when the package is queried locally in the interval 2-4GB, but fails remotely, as well as locally for size > 4GB.

Consider as hint.
Comment 10 Giuseppe Ghibò 2016-10-13 23:51:11 CEST
Created attachment 8530 [details]
attempt for getting the right package size (2)

This patch is another attempt for getting the right package size. It query using RPMTAG_LONGSIZE instead of RPMTAG_SIZE. Such patch works only locally for size > 4GB, but fails locally for size between 2 and 4GB, as well remotely shows huge and rong 64bit numbers.

Consider as hint.
Comment 11 Giuseppe Ghibò 2016-10-13 23:57:36 CEST
Created attachment 8531 [details]
attempt for getting the full largefile support

This patch is an hint for getting the full largefile support, so all the filesize/offset works as 64bit. It doesn't work yet, as have the same problems of the previous patches which were using the "unsigned long". It's a suggestionuse the perl-Math-Int64 extra module, but still needs some glude code to be included. Probably it could be borrowed from such perl package only the part for 32 to 64bit integer conversion, without adding another dependency to an extra perl package.

Consider as an hint for getting the full largefile support (i.e. on both 64 and 32 bit platform) with RPM of size > 4GB (either as size or payload size).
Comment 12 Thierry Vignaud 2016-10-14 08:32:42 CEST
(In reply to Giuseppe Ghibò from comment #7)
> Well, I did the assist as "playmaker" and you scored as "forward player" :-)
> 
> But wait to close this bug, sound we caught the "goal post". The patch
> you suggests is good but probably still incomplete. In fact with that patch
> and
> the following test:
> 
> urpmq -i kernel-desktop-4.8.1-1.mga6-debuginfo
> 
> you still get a negative value for Size, which is not what expected.

Because you need to _regenerate_ the synthesis files...
Comment 13 Mageia Robot 2016-10-14 18:07:00 CEST
commit ba454a00703696b8fb2e01a6c3f9fcd0dd237fdb
Author: Thierry Vignaud <thierry.vignaud@...>
Date:   Thu Oct 13 13:05:32 2016 +0200

    fix reporting size on big packages (mga#19571)
---
 Commit Link:
   http://gitweb.mageia.org/software/rpm/perl-URPM/commit/?id=ba454a00703696b8fb2e01a6c3f9fcd0dd237fdb
Comment 14 Thierry Vignaud 2016-10-15 05:30:36 CEST
Created attachment 8537 [details]
add support for querying >4gb package size

This patch fixes:
- generating correct size in synthesis for both >4Gb & <4Gb packages (@info@ line).
- reporting the right size when querying pkg->size

It would need testing on 32bit, though, which could mean having to bind uint64_t in typemap or using https://metacpan.org/pod/Math::Int64#C-API

It doesn't fix filesize as:
- we don't have such big packages for now but it could bloat urpmi memory
  (struct Package is 8bytes bigger, but I'ven't checked for any SV change)
- filesize is smaller, depending on the actual compression ratio

But anyway, the original bug is fixed, thus this BR should be closed.
You'd better clone this one for 4Gb packages support
Comment 15 Thierry Vignaud 2016-10-15 05:31:32 CEST
Created attachment 8538 [details]
64bit support for ->filesize()

Probably the same issue on 32bit when an IV isn't enough and we would need to rely on https://metacpan.org/pod/Math::Int64#C-API
Comment 16 Giuseppe Ghibò 2016-10-15 10:37:57 CEST
OK, now I start understanding what the content of the file is synthesis.hdlist.cz, is the same as the output of "snprintf(buff, sizeof(buff), "%s.%s@%lu@%lu@%s" :-)

I'll post the result of some test, using current version of RPMs, though IMHO it would have a different behaviour when used locally directly in a package (as urpmq -i ./file.rpm) and remotely using the hdlist.

Also what I don't understand is why you don't use at least "unsigned int" instead of "int" as in the patch https://bugs.mageia.org/attachment.cgi?id=8528, for either the "filesize" struct member in the struct s_Package, as well as as the returned argument in the function Pkg_size(pkg) and Pkg_filesize(pkg)? Otherwise the returned argument will be always casted to a "signed integer" which actually is not right in the interval 2<->4GB. That fix would be effortless, as it won't require the struct s_Package to grow by 4 extra bytes. Also in the latest patch https://bugs.mageia.org/attachment.cgi?id=8538&action=diff which should add the full 64bit support, you changed the returned argument of pkg_filesize from int to long but not to "unsigned int" or to "unsigned long"? Is this on purpose? And in the same patch why you touched only the returned argument of function pkg_filesize() but not its twin pkg_size() which is still bind to "signed int" too?
Comment 17 Giuseppe Ghibò 2016-10-15 10:50:31 CEST
Ok, in last question the pkg_size was touched in patch 8537 :-) but remain the question of not using "unsigned long" instead of long.
Comment 18 Giuseppe Ghibò 2016-10-15 11:08:45 CEST
Regarding Math::Int seems it should be used when the RETVAL argument is 64bit, otherwise one can rely to unsigned long, which would produce different behaviour or 32bit and 64bit system, i.e. on 64bit would work without an extra typemap, but on 32bit it would cast to 32bit long (or unsigned long). I wonder if some single 64 function of it can be borrowed without including the whole package.
Comment 19 Giuseppe Ghibò 2016-10-15 22:17:21 CEST
Created attachment 8539 [details]
script for testing size

Script for testing the output on three big rpm files of the distro.
Comment 20 Giuseppe Ghibò 2016-10-15 22:20:43 CEST
I added a script for testing the size. If you use it with current:

urpmi-8.104-1.mga6
perl-URPM-5.08-2.mga6

under these package, which already includes your patch, I get the following output for either using the regenerated hdlists, as well as directly to rpm:

using synthesis
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : -695886512                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : -1846775008                  Architecture: x86_64
---------------
using rpm
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : -695886512                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : -1846775008                  Architecture: x86_64

as you can see the size is still not shown correctly.
Comment 21 Giuseppe Ghibò 2016-10-15 22:22:59 CEST
Created attachment 8541 [details]
patch to show the right size up to 4GB

This patch completes yours and works for sizes up to 4GB, but fails >= 4GB. The size of the structure s_Package is not altered.
Comment 22 Giuseppe Ghibò 2016-10-15 22:27:26 CEST
Using the patch https://bugs.mageia.org/attachment.cgi?id=8541&action=diff the output of the testurpmq.sh is this:

using synthesis
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64
---------------
using rpm
Name        : cross-gcc-debuginfo
Size        : 0                            Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64

as you can see the size is reported correctly for size up to 4GB, but for size >= 4GB, which in the example of the cross-gcc-debuginfo (which has a package size of ~1.5GB, but a payload uncompressed size of ~8GB), the size is reported to be 0.

I would include this patch too for now.
Comment 23 Giuseppe Ghibò 2016-10-15 22:33:07 CEST
Created attachment 8542 [details]
patch for getting right size for files >= 4GB

This patch is the complete patch for getting the right size for RPMs with payload size >= 4GB. It includes the way of querying the LONGSIZE you introduced in your previous attachment plus includes mine.

This patch works on 64bit systems, which report the right size, but probably fails on 32bit systems (I've not tested on a 32bit system), as there is no reason why it should work. Indeed on 32bit systems the output is right for interval 0-4GB, but probably is 0 for size >= 4GB.
Comment 24 Giuseppe Ghibò 2016-10-15 22:35:41 CEST
Using the new patch https://bugs.mageia.org/attachment.cgi?id=8542 would produce the following results with the testurpmq.sh:


using synthesis
Name        : cross-gcc-debuginfo
Size        : 8151434666                   Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64
---------------
using rpm
Name        : cross-gcc-debuginfo
Size        : 8151434666                   Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64

which is always the right size. The test probably would fail for cross-gcc-debuginfo on 32bit platform.
Thierry Vignaud 2016-10-15 22:43:22 CEST

Attachment 8539 mime type: application/x-shellscript => text/plain

Comment 25 Giuseppe Ghibò 2016-10-15 22:43:36 CEST
Created attachment 8543 [details]
extra patch for the full 64bit support

This patch should be added to the previous perl-URPM-overflow-uint64.patch to get the full largefile support on 32bit platform too.

It alters the size of the original struct s_Package, adding extra 4 bytes.

It also includes the changes to be added to the typemap.

The patch is still incomplete as requires to include the following include files to URPM.xs:

#include "ppport.h"
#include "math_int64.h"

as well as the lines with the module:

MODULE = Math::Int64::C_API::Sample   PACKAGE = Math::Int64::C_API::Sample

BOOT:
    PERL_MATH_INT64_LOAD;

which is not yet clear to me where exactly to add in the .xs file, as there are already a MODULE and a BOOT. Merge them?
Comment 26 Thierry Vignaud 2016-10-15 22:46:55 CEST
Comment on attachment 8539 [details]
script for testing size

For testing local big pkgs, I just used the following (which outputted the correct values for my 3 test packages):

$ perl -MURPM -e '$u=URPM->new;$u->parse_rpm($_) foreach(glob("4gb*.rpm")); foreach my $p (@{$u->{depslist}}) { warn $p->filesize, ", ", $p->size, "\n" }'
787118, 5368709120
768826, 5242880000
3296574582, 5242880000
Comment 27 Giuseppe Ghibò 2016-10-15 23:08:09 CEST
Are you sure you are using the same version of perl-URPM as of mine or rather a newer one with the other patches not yet published? 

As with your "magic" script, I'm getting the following output:

rpm -q perl-URPM urpmi
LC_ALL=C LC_NUMERIC=C perl -MURPM -e '$u=URPM->new;$u->parse_rpm($_) foreach(glob("*.rpm")); foreach my $p (@{$u->{depslist}}) { warn $p->filesize, ", ", $p->size, "\n" }'

perl-URPM-5.08-2.mga6
urpmi-8.104-1.mga6

1539207282, 0
476048598, -695886512
551572678, -1846775008

perl-URPM-5.08-2.mga6 and urpmi-8.104-1.mga6 are the version you upgraded yesterday upstream I downloaded from the repository. If I use the version with patch https://bugs.mageia.org/attachment.cgi?id=8541, which I called perl-URPMI-5.0-2.1mga6, I get:

perl-URPM-5.08-2.1.mga6
urpmi-8.104-1.mga6
1539207282, 0
476048598, 3599080784
551572678, 2448192288

and with the latest: https://bugs.mageia.org/attachment.cgi?id=8542 (perl-URPM-5.08-2.2.mga6), I get:

1539207282, 8151434666
476048598, 3599080784
551572678, 2448192288
Comment 28 Giuseppe Ghibò 2016-10-15 23:23:54 CEST
(In reply to Giuseppe Ghibò from comment #25)
> Created attachment 8543 [details]
> extra patch for the full 64bit support
> 
> This patch should be added to the previous perl-URPM-overflow-uint64.patch
> to get the full largefile support on 32bit platform too.
> 
> It alters the size of the original struct s_Package, adding extra 4 bytes.
> 
> It also includes the changes to be added to the typemap.
> 
> The patch is still incomplete as requires to include the following include
> files to URPM.xs:
> 
> #include "ppport.h"
> #include "math_int64.h"
> 

IMHO the current cauldron package perl-Math-Int64 should be modified to includes the C includes and linkable C libraries if needed.
Comment 29 Giuseppe Ghibò 2016-10-16 15:36:10 CEST
Created attachment 8547 [details]
patch for the full largefile 64bit support using Math::Int64

With this patch I tried to complete the previous attempt; it seems to work at least on 64bit.

The files perl_math_int64.{c,h} were borrowed from the examples of the perl-Math-Int64 package.

What I'm still not sure is whether there should be an extra include to 

use Math::Int64

int URPM.pm.
Comment 30 Giuseppe Ghibò 2016-10-16 15:37:07 CEST
Created attachment 8548 [details]
patch for perl-URPM.spec
Comment 31 Giuseppe Ghibò 2016-10-16 15:38:50 CEST
Created attachment 8549 [details]
patch for perl-URPM.spec

Attachment 8548 is obsolete: 0 => 1

Comment 32 Giuseppe Ghibò 2016-10-16 22:02:18 CEST
(In reply to Giuseppe Ghibò from comment #29)
> Created attachment 8547 [details]
> patch for the full largefile 64bit support using Math::Int64
> 
> With this patch I tried to complete the previous attempt; it seems to work
> at least on 64bit.
> 
> The files perl_math_int64.{c,h} were borrowed from the examples of the
> perl-Math-Int64 package.
> 
> What I'm still not sure is whether there should be an extra include to 
> 
> use Math::Int64
> 
> int URPM.pm.

This patch doesn't pass the tests while building on 32bit. Probably still something missed.
Comment 33 Thierry Vignaud 2016-10-18 03:48:14 CEST
Created attachment 8563 [details]
add support for int64, using Math::Int64 C API

I see we want similar pathes, though I didn't care about Devel::PPort

Attachment 8525 is obsolete: 0 => 1
Attachment 8528 is obsolete: 0 => 1
Attachment 8529 is obsolete: 0 => 1
Attachment 8530 is obsolete: 0 => 1
Attachment 8531 is obsolete: 0 => 1
Attachment 8537 is obsolete: 0 => 1
Attachment 8538 is obsolete: 0 => 1
Attachment 8541 is obsolete: 0 => 1
Attachment 8542 is obsolete: 0 => 1
Attachment 8543 is obsolete: 0 => 1
Attachment 8547 is obsolete: 0 => 1

Comment 34 Thierry Vignaud 2016-10-18 03:48:19 CEST
Created attachment 8564 [details]
test
Comment 35 Thierry Vignaud 2016-10-18 03:48:23 CEST
Created attachment 8565 [details]
switch to uint64_t for sizes
Comment 36 Thierry Vignaud 2016-10-18 03:48:36 CEST
Created attachment 8566 [details]
(get_int2) try new 64bit tag else old 32bit tag
Comment 37 Thierry Vignaud 2016-10-18 04:10:25 CEST
(In reply to Giuseppe Ghibò from comment #27)
> Are you sure you are using the same version of perl-URPM as of mine or
> rather a newer one with the other patches not yet published? 

I used the code I released.
Again, you really need to locally regenerate the synthesis with genhdlist2...

(In reply to Giuseppe Ghibò from comment #32)

> This patch doesn't pass the tests while building on 32bit. Probably still
> something missed.

Humm this works for my version...
i see the following differences with mine:
- you include the C file :-(
  (whereas I declared an extra object)
- I'm fallbacking to the native implementation on 64bit
  (but that shouldn't impact 32bit)
- I include intttypes.h
- you didn't updated the typemap
Comment 38 Thierry Vignaud 2016-10-18 04:12:36 CEST
another difference is that I updated MANIFEST
Comment 39 Giuseppe Ghibò 2016-10-18 07:46:59 CEST
(In reply to Thierry Vignaud from comment #37)

> (In reply to Giuseppe Ghibò from comment #27)
> > Are you sure you are using the same version of perl-URPM as of mine or
> > rather a newer one with the other patches not yet published? 
> 
> I used the code I released.
> Again, you really need to locally regenerate the synthesis with genhdlist2...
> 

The synthesis genhdlist2 were automatically regenereted with the testing script I'm using.
 
> (In reply to Giuseppe Ghibò from comment #32)
> 
> > This patch doesn't pass the tests while building on 32bit. Probably still
> > something missed.
> 
> Humm this works for my version...
> i see the following differences with mine:
> - you include the C file :-(
>   (whereas I declared an extra object)
> - I'm fallbacking to the native implementation on 64bit
>   (but that shouldn't impact 32bit)

yep, I commented the  MATH_INT64_NATIVE_IF_AVAILABLE to see whether I could trigger some error on 64bit too.

> - I include intttypes.h

ah!

> - you didn't updated the typemap

Well, I updated the typemap https://bugs.mageia.org/attachment.cgi?id=8547 only for uint64_t but not for int64_t since there were no plan to use any int64_t but only uint64_t in the code.

Anyway let's try with your all patches coalesced together...
Comment 40 Giuseppe Ghibò 2016-10-18 15:57:15 CEST
I tried the 4 patches, and compiles on 64bit but fails tests on 32bit, so probably still something missed.

Test Summary Report
-------------------
t/00prepare.t      (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
t/01setter-getter.t (Wstat: 512 Tests: 3 Failed: 2)
  Failed tests:  2-3
  Non-zero exit status: 2
  Parse errors: Bad plan.  You planned 64 tests but ran 3.
t/parse.t          (Wstat: 65280 Tests: 1 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 41 tests but ran 1.
Files=8, Tests=10129,  3 wallclock secs ( 0.67 usr  0.03 sys +  2.39 cusr  0.11 csys =  3.20 CPU)
Result: FAIL
Failed 3/8 test programs. 3/10129 subtests failed.
Makefile:1023: recipe for target 'test_dynamic' failed
make: *** [test_dynamic] Error 255

I wonder whether it's just some missed casting upgrading 32/64 bit in the code or rather some missed library for Math::Int64, that in 64bit package is not triggered.
Comment 41 Thierry Vignaud 2016-10-18 16:24:29 CEST
It works for me, both in 32 & 64bits.
Are you got Math::Int64 installed?
Comment 42 Giuseppe Ghibò 2016-10-18 16:26:41 CEST
I built with mock the i586 version, Math::Int64 should be installed since I added a BuildRequires: perl-Math-Int64.
Comment 43 Giuseppe Ghibò 2016-10-18 16:31:33 CEST
While for the 64bit it works for me, as I get the following results with the test above:

Name        : cross-gcc-debuginfo                                              
Size        : 8151434666                   Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64
---------------
using rpm
Name        : cross-gcc-debuginfo
Size        : 8151434666                   Architecture: x86_64

Name        : kernel-desktop-4.8.1-1.mga6-debuginfo
Size        : 3599080784                   Architecture: x86_64

Name        : libreoffice-debuginfo
Size        : 2448192288                   Architecture: x86_64

using perl -MURPM -e...
1539207282, 8151434666
476048598, 3599080784
551572678, 2448192288
Comment 44 Giuseppe Ghibò 2016-10-18 17:01:56 CEST
If for you builds the i586, try commit, and we see whether the i586 would build in the BS.
Comment 45 Thierry Vignaud 2016-10-18 17:36:05 CEST
As I told you I already done it locally and it built smoothly (both local build & local iurt, both 32 & 64 bit).
Comment 46 Giuseppe Ghibò 2016-10-18 17:38:45 CEST
ok, then commit.
Comment 47 Mageia Robot 2016-10-18 18:04:25 CEST
commit 64e3770177e8368a555654dee602d26b4c1afdbe
Author: Thierry Vignaud <thierry.vignaud@...>
Date:   Tue Oct 18 03:40:50 2016 +0200

    switch to uint64_t for sizes (mga#19571)
    
    Thus we use 64bit for package size on 32bit too, thanks to Math::Int64
---
 Commit Link:
   http://gitweb.mageia.org/software/rpm/perl-URPM/commit/?id=64e3770177e8368a555654dee602d26b4c1afdbe
Comment 48 Mageia Robot 2016-10-18 18:04:27 CEST
commit dab9efd9109587ef9b130ca1de05ae8533fa7928
Author: Thierry Vignaud <thierry.vignaud@...>
Date:   Tue Oct 18 03:42:32 2016 +0200

    (get_int2) try new 64bit tag else old 32bit tag
    
    thus enabling to report size of >4Gb packages (however insane this is):
    rpmlib uses the old small tag for small packages and the new big tag for
    big packages (mga#19571)
---
 Commit Link:
   http://gitweb.mageia.org/software/rpm/perl-URPM/commit/?id=dab9efd9109587ef9b130ca1de05ae8533fa7928
Comment 49 Thierry Vignaud 2016-10-18 18:05:17 CEST
losing

Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

Comment 50 Thierry Vignaud 2016-10-18 18:23:10 CEST
Humm... in which branch is my "using PRIu64 for printf()" is hiding???
Comment 51 Giuseppe Ghibò 2016-10-18 21:06:31 CEST
In the end it come out a good team work and the result is polished.

BTW, usage of PRIu64 doesn't require -std=c99 in CFLAGS?
Comment 52 Thierry Vignaud 2016-10-18 21:56:35 CEST
Yeah, thx.

Afor PRIu64
You only need to include inttypes.h, ie to have a compiler than can conform to C99.
And gcc-5 defaults to C11 anyway...
Comment 53 Giuseppe Ghibò 2016-10-18 22:37:17 CEST
Just in the case perl-URPM is backported (by someone) to something older... ;-)
Comment 54 Thierry Vignaud 2016-10-19 07:39:55 CEST
PRIu64 is defined in inttypes.h by glibc for years...
And URPM is not something people backports.
Anyway it would just work :-)
Comment 55 Giuseppe Ghibò 2016-10-19 13:28:21 CEST
Great, yes PRIu64 works, thanks.

Though I wonder what is the real way to use "platform independent and portable and standard" support for largefile for both 32 and 64bit. Some kind of standard advice is to use: off_t (and not off64_t); off_t will be automatically typedefined to off64_t, according to the definition of the macro _FILE_OFFSET_BITS, which should be defined at compilation time to 32 or 64 (i,e, gcc -D_FILE_OFFSET_BITS=64 for instance) even on 32bit platforms.

Indeed off_t is a "signed integer" and not "unsigned", as it could be used not only for filesizes but also for file offsets which might be negative. In other words off_t (or the promoted off64_t) coincides with int64_t (and not with uint64_t) and thus when _FILE_OFFSET_BITS=64, the type off_t will have, of course, a sizeof=8 even on 32bit platforms.

Now the problem is how to print it portable in (s)printf()? Now beside %llu and PRIu64 seems there was defined another format specifier: "%j" (see http://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html). Also the re is the %z as the format specifier for "size_t" (which is unsigned); one might test with this little example:

test.c
// test the matrix of format specifiers:
//
// compile with:
//    gcc -Wformat -o test -D_FILE_OFFSET_BITS=64 test.c or
//    linux32 gcc -Wformat -o test -D_FILE_OFFSET_BITS=64 test.c or
//    clang -o test -D_FILE_OFFSET_BITS=64 test.c
#include <stdio.h>
#include <fcntl.h>
#include <features.h>
#include <inttypes.h>

int main()
{
    off_t    x = 1000000000000;
    size_t   y = 4294967295;
    uint64_t z = 2000000000000;

    printf("size(size_t)     = %d\n", sizeof(size_t));    
    printf("size(off_t)      = %d\n", sizeof(off_t));
    printf("sizeof(uint64_t) = %d\n", sizeof(uint64_t));

    printf("file offset bits = %d\n", _FILE_OFFSET_BITS);

    printf("off_t    val = %jd %zd %ld %lld %" PRIi64 " \n", x, x, x, x, x);
    printf("size_t   val = %ju %zu %lu %llu %" PRIu64 " \n", y, y, y, y, y);
    printf("uint64_t val = %ju %zu %lu %llu %" PRIu64 " \n", z, z, z, z, z);
}
   
So %ju and PRIu64 are the right ones for uint64_t and as you said it worked also before the compiler compliance with C99 thanks to inttypes.h inclusion.
Probably %ju (or their signed counterpart %jd) is a little bit more readable than having a string interrupted with a macro in the middle like PRIu64.

What I'm also see here of curious is that (clang shows it) the returned type of "sizeof(something)" which I always assumed that it was "int" since the time of Kerninhan-Ritchie C, so %d, has been promoted to size_t (so unsigned long) and thus we should use %zu when printing the result of a sizeof() is a standard-compliant way.

Anyway nevermind. PRIu64 works fine. The information in this bug could be very useful for some other bug facing the same problems.
Comment 56 Giuseppe Ghibò 2016-10-19 14:12:49 CEST
s/Kerninhan/Kernighan/ (can't be mispelled) :-)

Note You need to log in before you can comment on or make changes to this bug.