Bug 17072 - build system broken in new ways by ARM addition
Summary: build system broken in new ways by ARM addition
Status: RESOLVED FIXED
Alias: None
Product: Infrastructure
Classification: Unclassified
Component: BuildSystem
Version: unspecified
Hardware: arm Linux
Priority: release_blocker
Severity: normal
Target Milestone: Mageia 6
Assignee: Sysadmin Team
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-11-02 21:42 CET by David Walser
Modified: 2017-04-10 15:02 CEST

See Also:
Source RPM:
CVE:
Status comment: Most critical issues fixed, some may remain



Description David Walser 2015-11-02 21:42:44 CET
New issues seen when trying to update the wireshark package.

First build went fine:
http://pkgsubmit.mageia.org/uploads/failure/cauldron/core/release/20151102200948.luigiwalser.valstar.10182

It just failed because the libmajors changed.  So it failed on all three arches, and once I fixed it, it should have let me resubmit.

It didn't, because it said:
Submission errors, aborting:
- wireshark-2.0.0-0.rc2.1.mga6:
 - Newer revisions already exists for cauldron in upload queue: [...]

The path shown after that indicated that the .src.rpm was what it was complaining about.  When the build fails on all arches, the leftover source RPM should not block the next build.

So I bumped the release tag, and then an install_deps issue only on ARM blocked the whole build:
http://pkgsubmit.mageia.org/uploads/failure/cauldron/core/release/20151102202442.luigiwalser.valstar.26674

ARM should be treated as a "secondary arch" and builds only failing there should not block everything.  Once I added an ExclusiveArch tag to get around that for now, then I had to bump the release tag again.
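
To illustrate the "secondary arch" behaviour requested above, here is a minimal sketch. It is not code from iurt or youri; the function names, the result format, and the choice of i586 and x86_64 as the primary arches are assumptions made for the example.

```python
# Hypothetical sketch: none of these names come from iurt or youri.
PRIMARY_ARCHES = {"i586", "x86_64"}  # assumed set of primary arches


def build_is_blocked(results, primary=PRIMARY_ARCHES):
    """Return True only if a primary arch failed or has no result.

    `results` maps an arch name to True (built) or False (failed).
    Failures on any other (secondary) arch are ignored here.
    """
    return any(not results.get(arch, False) for arch in primary)


def failed_secondary_arches(results, primary=PRIMARY_ARCHES):
    """List the non-primary arches that failed, for reporting only."""
    return sorted(a for a, ok in results.items() if a not in primary and not ok)
```

With this policy, a result set such as `{"i586": True, "x86_64": True, "armv7hl": False}` would not block the upload, while the ARM failure could still be reported to the packager.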

Comment 1 David Walser 2015-11-02 22:42:12 CET
I also see that failing builds are not automatically being terminated on all build nodes, consuming needed resources.
Comment 2 Thierry Vignaud 2015-11-04 08:42:46 CET
That last one is a very old bug.

CC: (none) => thierry.vignaud

Thierry Vignaud 2016-05-12 14:58:06 CEST

Hardware: i586 => arm

Comment 3 David Walser 2016-08-09 23:47:51 CEST
Of course, there's also the new issue of not being able to submit builds to nonfree/tainted for packages that also exist in core.  That needs to be fixed.  More importantly, we need to make sure that it doesn't impact building updates for Mageia 6.

Priority: Normal => High

David Walser 2016-09-07 17:08:57 CEST

Priority: High => release_blocker

David Walser 2016-09-08 15:15:50 CEST

Target Milestone: --- => Mageia 6

Samuel Verschelde 2016-09-12 16:29:47 CEST

Status comment: (none) => Need to make sure it doesn't affect building for updates_testing in Mageia 6

Comment 4 Rémi Verschelde 2016-10-17 11:59:28 CEST
The issues in comment 0 and comment 1 are not critical for the release IMO (not release blockers), but the one in comment 3 is, unless we're fine with having to push SRPMs with different versions for updates of core + tainted packages.
Comment 5 David Walser 2016-10-17 12:31:33 CEST
The comment 3 issue is why this is a release blocker, because that's not fine.
Comment 6 Rémi Verschelde 2016-11-23 21:13:30 CET
Ping. Pascal, Olivier, any idea?
Comment 7 Thomas Backlund 2016-11-23 21:15:43 CET
pterjan, blino... would you mind sorting this one out before we release mga6?

CC: (none) => mageia, pterjan, tmb

Comment 8 Thomas Backlund 2016-11-26 15:45:06 CET
Seems to be fixed.
I just submitted drakx-installer-images to nonfree with the same release as in core.

Thanks pterjan

Resolution: (none) => FIXED
Status: NEW => RESOLVED

Comment 9 David GEIGER 2016-11-26 16:01:35 CET
Nope, I think it isn't fixed yet!

drakx-installer-images can be pushed both on core and nonfree because of arch restriction in spec file: "ExclusiveArch: x86_64 %{ix86}"

CC: (none) => geiger.david68210
Status: RESOLVED => REOPENED
Resolution: FIXED => (none)

Comment 10 Thomas Backlund 2016-11-26 16:04:20 CET
Yes, but I couldn't submit them before with the same rel...

Anyway, a new mesa is coming today or so, and then we will know.
Comment 11 Olivier Blin 2016-11-27 23:31:52 CET
This seems to have been fixed by Pascal in this commit:
http://gitweb.mageia.org/software/build-system/iurt/commit/?id=2328f608ff92c8efad17fbc36e0bbcb50376ef21
(and a few others around this one)
Comment 12 Rémi Verschelde 2016-11-27 23:33:18 CET
There were several issues mentioned in comment 0, comment 1 and comment 3.

For clarity, maybe this one can be closed and a new issue could be opened about remaining issues, if any?
Rémi Verschelde 2016-11-27 23:33:36 CET

Status comment: Need to make sure it doesn't affect building for updates_testing in Mageia 6 => Most critical issues fixed, some may remain

Comment 13 Nicolas Lécureuil 2016-11-28 01:03:51 CET
According to Pascal's mail, this should be fixed.

Please reopen if needed.

CC: (none) => mageia
Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

Comment 14 Thomas Backlund 2016-11-28 21:17:54 CET
Unfortunately...

core/release package built/uploaded 1h ago...
(currently building on arm*)

Trying to submit it to tainted gets:

error: Failed to upload svn://svn.mageia.org/svn/packages/cauldron/mesa:
Executing perl -I/usr/share/mga-youri-submit/lib /usr/share/mga-youri-submit/bin/youri-submit --config /etc/youri/submit-todo.conf --define user=tmb --define sid=a6755a68-381c-49eb-ab00-d1ab7fa9e3a8 --define section=tainted/release cauldron /var/lib/schedbot/repsys/srpms/@1070654:mesa-13.0.2-1.mga6.src.rpm (sudo_user tmb)
Initializing repository
Executing /usr/bin/rpmlint -f /usr/share/rpmlint/config /var/lib/schedbot/repsys/srpms/@1070654:mesa-13.0.2-1.mga6.src.rpm
Submission errors, aborting:
- mesa-13.0.2-1.mga6.src:
 - Newer revisions already exists for cauldron in upload queue: /var/lib/schedbot/uploads//todo/cauldron/core/release/20161128191716.tmb.duvel.4624_@1070654:mesa-13.0.2-1.mga6.src.rpm



Of course, this is sort of "correct" if we intend to have arm* as an officially supported platform in mga6, in which case we need the builds to stay in sync...

So what do we want?

Resolution: FIXED => (none)
Status: RESOLVED => REOPENED

Comment 15 David Walser 2016-11-28 21:20:25 CET
ARM was supposed to be a "secondary architecture" and is not supposed to interfere with the normal building operation on i586 and x86_64.  We can't have this issue interfering with building normal updates.

I actually think that for Mageia 6, ARM builds shouldn't be automatically activated at all. Instead, periodically (maybe once a month), a script or similar should build the latest changed packages for ARM and push them all atomically, but it's up to you guys how you want to handle that.
Comment 16 Rémi Verschelde 2016-11-28 23:45:35 CET
Side question (does not answer the ARM topic):
Do we need to prevent pushing the same NEVR to two different sections at the same time while the first one is still in the todo list? Allowing concurrent builds to be pushed to different sections would solve that issue automatically.

Of course we might waste some resources if packagers send broken builds on both core and tainted at the same time (e.g. a badly packaged mesa or ffmpeg), but how often would that happen? Or am I missing something that makes it necessary to prevent such concurrent builds?
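
As a rough sketch of what that would mean for the queue check (illustrative only; this is not how youri-submit is implemented, and the `QueuedSrpm` type, the `conflicts` function and its `per_section` flag are made up for the example):

```python
from typing import NamedTuple


class QueuedSrpm(NamedTuple):
    """Hypothetical record for a source RPM waiting in the todo queue."""
    nevr: str      # e.g. "mesa-13.0.2-1.mga6"
    section: str   # e.g. "core/release"


def conflicts(queue, nevr, section, per_section=True):
    """Return True if a new submission should be refused.

    per_section=False models the behaviour described in this bug: the
    same NEVR anywhere in the queue blocks the submission.
    per_section=True models the proposal: only an entry queued for the
    same section blocks it.
    """
    for entry in queue:
        if entry.nevr != nevr:
            continue
        if not per_section or entry.section == section:
            return True
    return False
```

For example, with `mesa-13.0.2-1.mga6` queued for `core/release`, a submission of the same NEVR to `tainted/release` is refused when `per_section=False` and accepted when `per_section=True`.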
Comment 17 Rémi Verschelde 2016-11-28 23:51:13 CET
(In reply to Thomas Backlund from comment #14)
> core/release package built/uploaded 1h ago...
> (currently building on arm*
> 
> Trying to submit it to tainted gets: [...]

Actually I still get the same submit error even now, when mesa has been properly uploaded for all 4 supported arches.
Comment 18 Pascal Terjan 2016-11-29 00:11:59 CET
(In reply to Thomas Backlund from comment #14)
> Unfortunately...
> 
> core/release package built/uploaded 1h ago...
> (currently building on arm*
> 
> Trying to submit it to tainted gets:
> 
> error: Failed to upload svn://svn.mageia.org/svn/packages/cauldron/mesa:
> Executing perl -I/usr/share/mga-youri-submit/lib
> /usr/share/mga-youri-submit/bin/youri-submit --config
> /etc/youri/submit-todo.conf --define user=tmb --define
> sid=a6755a68-381c-49eb-ab00-d1ab7fa9e3a8 --define section=tainted/release
> cauldron /var/lib/schedbot/repsys/srpms/@1070654:mesa-13.0.2-1.mga6.src.rpm
> (sudo_user tmb)
> Initializing repository
> Executing /usr/bin/rpmlint -f /usr/share/rpmlint/config
> /var/lib/schedbot/repsys/srpms/@1070654:mesa-13.0.2-1.mga6.src.rpm
> Submission errors, aborting:
> - mesa-13.0.2-1.mga6.src:
>  - Newer revisions already exists for cauldron in upload queue:
> /var/lib/schedbot/uploads//todo/cauldron/core/release/20161128191716.tmb.
> duvel.4624_@1070654:mesa-13.0.2-1.mga6.src.rpm
> 
>

Yes, this is expected. The bug that should now be fixed was that submission would still fail after arm finished building (or failed), because the src.rpm remained in the todo queue. The src.rpm does still remain in the queue while the package is being built.
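
In other words, the intended rule is roughly the following. This is a simplified sketch, not the actual iurt code; the state names and the function are invented for illustration.

```python
# Hypothetical state names; the real build system may track this differently.
FINAL_STATES = {"done", "failed"}


def srpm_can_leave_todo(arch_states):
    """Return True once every arch has reached a final state.

    `arch_states` maps an arch name to one of
    "pending", "building", "done" or "failed".
    While any arch is still pending or building, the src.rpm stays
    in the todo queue and keeps blocking a resubmission.
    """
    return bool(arch_states) and all(
        state in FINAL_STATES for state in arch_states.values()
    )
```

So a package that is done on i586 and x86_64 but still building on an ARM node keeps its src.rpm in the queue, which matches the error seen in comment 14.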

The proper fix anyway would be to finish support for submitting to several sections at once (iurt support used to work, but I never started looking at adding support in mgarepo, and iurt may have been broken since).

> Of course, this is sort of "correct" if we intend to have arm* as an
> officially supported platform in mga6 in which case we need the builds to
> stay in sync...
> 
> So what do we want ?

That is a question I asked on IRC last week. We probably don't want to fully support armv5tl due to low expected usage.

Supporting armv7hl would be nice but we probably don't want to delay a firefox security update because it takes more than a day to build on arm. Also, we can't ask QA to test extra architectures...
Comment 19 David Walser 2017-01-16 19:32:30 CET
Are there any outstanding issues here other than what I reported in Bug 20098?

See Also: (none) => https://bugs.mageia.org/show_bug.cgi?id=20098

Comment 20 Rémi Verschelde 2017-03-06 10:49:11 CET
So what's the status on this issue? The Mageia 6 release is getting near, so if there is something that we *must* fix before the release, it should be done now.

(In reply to Pascal Terjan from comment #18)
> (In reply to Thomas Backlund from comment #14)
> > Of course, this is sort of "correct" if we intend to have arm* as an
> > officially supported platform in mga6 in which case we need the builds to
> > stay in sync...
> > 
> > So what do we want ?
> 
> That is a question I asked on IRC last week, we probably don't want to fully
> support armv5tl due to low expected usage.
> 
> Supporting armv7hl would be nice but we probably don't want to delay a
> firefox security update because it takes more than a day to build on arm.
> Also, we can't ask QA to test extra architectures...

This seems to be the most critical point to me in this issue. We can't have the slow ARM builds delay critical security updates.

Do we have a plan? Who will handle the necessary changes?
Comment 21 Rémi Verschelde 2017-04-04 10:18:46 CEST
So, I guess the plan is status quo? Or do we want to change something before the release?
Comment 22 Nicolas Lécureuil 2017-04-04 10:31:57 CEST
I think we should just dedicate something like 2 ARM machines to the mga6 BS.
This will fix this issue for mga6 updates.
Comment 23 Rémi Verschelde 2017-04-10 12:33:41 CEST
Alright, closing as the most pressing issues were fixed. If we do as suggested in comment 22, it should be working fine for Mageia 6.

Status: REOPENED => RESOLVED
Resolution: (none) => FIXED

Comment 24 David Walser 2017-04-10 15:02:19 CEST
We just need to make sure that packages that need to build on core and tainted don't get held up by ARM.
