Fresh network install, selecting all package categories. Package selection finishes, the installation panel comes up and shows a package count and about 5-7 detail lines. At that point X shuts down and you are flipped back to tty1 with the system fully shut down (so there is no response on tty2 for "bug").
tty1 shows:
  warning: /etc/fstab created as /etc/fstab.rpmnew
  exited abnormally :-( received signal 4
  (usual shutdown messages)
The last thing on tty3 is:
  filesystem not installed, Generating 12 missing indexes, please wait...
The last thing on tty4 is:
  Traps: runinstall2[325] trap invalid opcode ip:xxx sp:xxx error:0 in libpthread.so.0[xxx+17000]
This has been happening for at least a week (that was the first time I had tried a fresh install for a while), but I wanted to see whether the mass rebuild would fix it.
I confirm the problem. I get more or less the same messages, in particular "filesystem not installed, Generating 12 missing indexes" and "exited abnormally :-( received signal 4". I have tried several times over the past ten days or so. I also thought the problem came from the mass rebuild, but it seems to have another cause.
CC: (none) => paiiou
To be more specific: I have had this kind of problem since 16 September. Up to 15 September I got the message with systemd.
Frank, is this the same as Bug 14101 that you reported?
The packaging issues in bug #14101 are fixed. Let's focus on the crash here. What's your CPU? I saw this with an Intel E8400:
traps: runinstall2[24495] trap invalid opcode ip:7ff1d351b192 sp:7fffca665b98 error:0 in libpthread.so.0[7ff1d350f000+17000]
CC: (none) => thierry.vignaud, tmb
Source RPM: (none) => glibc
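For anyone comparing trap lines between machines, here is a minimal sketch of how the kernel message quoted above can be decoded. It assumes the bracketed part after the library name is the start address and size of the mapping that contains the faulting instruction pointer, so that `ip - start` gives an offset that is comparable across runs. The helper name `trap_offset` is made up for this illustration.

```python
import re

# Matches the "trap invalid opcode" line format quoted in this report.
TRAP_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\] trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def trap_offset(line):
    """Return (library, offset of ip inside the mapping, ip-is-inside flag).

    Returns None when the line does not look like a trap message.
    """
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    return m.group("lib"), offset, 0 <= offset < size


line = (
    "traps: runinstall2[24495] trap invalid opcode ip:7ff1d351b192 "
    "sp:7fffca665b98 error:0 in libpthread.so.0[7ff1d350f000+17000]"
)
lib, offset, inside = trap_offset(line)
print(lib, hex(offset), inside)  # libpthread.so.0 0xc192 True
```

The offset (0xc192 for the E8400 trace) is what one would look up in a disassembly of the same libpthread build. The absolute addresses differ from run to run and are not directly comparable.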
My CPU: Athlon XP 2500+
@Georges: what arch ? i586 or x86_64 ?
(In reply to Thomas Backlund from comment #6)
> @Georges:
>
> what arch ? i586 or x86_64 ?
i586
@Georges: and what trap message do you get ?
(In reply to Thomas Backlund from comment #8)
> @Georges: and what trap message do you get ?
I tried a new install. Local mirror, synchronised this morning. Boot.iso: 6 Oct. Personalized desktop; deselected all packages, then selected "With X server", without suggested packages. The installation seems to begin, then there is a black screen with:
  Warning /etc/fstab created as fstab.new
  exited abnormally :-( --received signal 4
With Alt+F3:
  filesystem not installed, generating 12 missing indexes
@tmb: I tried using --disable-lock-elision in glibc, but it didn't help. But...
Created attachment 5483 [details]
GDB trace with symbols (Illegal instruction in ELIDE_UNLOCK)
Trace obtained using:
  CLEAN=1 drakx/tools/drakx-in-chroot /mageia/unstable/x86_64/ /T --useless_thing_accepted --flang fr --keyboard fr --lang fr --gdb
("--useless_thing_accepted --flang fr --keyboard fr --lang fr" is only there to speed up testing.)
So it does crash in the elision code even though I have an E8400...
Priority: Normal => release_blocker
Assignee: bugsquad => tmb
Summary: Install fails immediately during package install => Install crashes immediately during package install (SIGILL, Illegal instruction in __GI___pthread_rwlock_unlock() -> ELIDE_UNLOCK() )
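The "received signal 4" shown on tty1 and the SIGILL in the new summary are the same thing. A small sketch to check the mapping, assuming a Linux host (signal numbers are platform-defined); `signal_name` is a hypothetical helper, not part of the installer:

```python
import signal


def signal_name(number):
    """Map a signal number to its symbolic name, or None if it is unknown."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return None


print(signal_name(4))
```

On Linux this prints SIGILL, which matches the "trap invalid opcode" line on tty4.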
Ah, that's interesting... I used the upstream fix to disable elision: http://svnweb.mageia.org/packages?view=revision&revision=731421
Dropping "--enable-lock-elision" means the same as "--disable-lock-elision". So disabling elision is actually exposing another bug :/ And it was only supposed to trigger on Haswell-level hardware... I guess this means the elision code is not properly #ifdef'ed out when it gets disabled...
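To check whether a given test machine is "Haswell level" in the sense that matters here, one can look for the Intel TSX flags in /proc/cpuinfo. This is a minimal sketch, assuming that Linux reports TSX support as the `hle` and `rtm` entries on the `flags` line. The function name `tsx_flags` and the sample text are made up for illustration.

```python
def tsx_flags(cpuinfo_text):
    """Return the TSX-related flags ('hle', 'rtm') found in cpuinfo text."""
    for row in cpuinfo_text.splitlines():
        key, sep, value = row.partition(":")
        if sep and key.strip() == "flags":
            # Only the first "flags" line is inspected; all cores normally agree.
            return sorted(set(value.split()) & {"hle", "rtm"})
    return []


# Hypothetical sample, not taken from any machine in this report.
sample = "model name\t: Hypothetical CPU\nflags\t\t: fpu sse2 hle rtm avx2\n"
print(tsx_flags(sample))  # ['hle', 'rtm']
```

On a real system the input would come from `open("/proc/cpuinfo").read()`. An empty result means the CPU does not advertise TSX, so an elision instruction reached on such a machine would be expected to fault as an invalid opcode.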
OK, so at least ELIDE_UNLOCK is not properly protected when elision is disabled. I've reported it upstream and made a patch, in glibc-2.20-8.mga5, that ensures none of the adaptive elision call sites in rwlocks are triggered. I'll rebuild stage2 when the new glibc is available.
I just did it.
Created attachment 5484 [details]
GDB trace with symbols (Illegal instruction in ELIDE_UNLOCK)
Now it deadlocks.
Crap. A new fix is building with a more minimal approach: I now only touch the ELIDE_UNLOCK path and add the same elide check that the other ELIDE_* defines use. If this does not work either, we can roll back to the older microcode and enable elision again for beta1 until upstream gets a proper fix.
stage2 rebuilt with glibc-2.20-9
Created attachment 5485 [details]
GDB trace with symbols (deadlock)
With -9.mga5, it looks like it deadlocks.
Hm, so it looks like upstream is right: it will be a pain to disable elision, as it's not really tested that way :/ I know they had to enable it again on at least s390 for reasons like this... I'll nuke the Haswell-specific microcodes from the tarball and re-enable elision for now...
OK:
microcode-0.20140913-2.mga5 dropped the problematic firmwares
glibc-2.20-10.mga5 has lock elision enabled
drakx-installer-* rebuilt with latest glibc & co
Let's hope it will behave better.
Created attachment 5486 [details]
GDB trace with symbols (Illegal instruction in pthread_rwlock_unlock)
Well, not better...
Hm, I'm starting to think this might be an rpm bug, most likely in the rpmlog part, exposed by the new glibc. If you compare the backtraces with and without elision, they are exactly the same... And the last commit to rpmlog (Feb 19, 2013) isn't really reassuring :)
http://rpm.org/gitweb?p=rpm.git;a=commit;h=96e0cdf34b1d4b40d6565d396016f74446bd4b5f
Maybe we should ask Panu for input.
(In reply to Georges Eckenschwiller from comment #2)
> I can specify that I had this kind of problem from 16 sept.
> Until September 15th I had the message with systemd.
What changed on September 15th or 16th?
"Generating 12 missing indexes" is a normal message from rpmlib that it creates indexes in /var/lib/rpm. The issue with filesystem package will be dealt later once the crash issue is fixed. Let's focus on the crash issue here.
(In reply to Thierry Vignaud from comment #4)
> The packaging issue in bug #14101 are fixed.
> Let's focus on the crash here.
>
> What's your CPU?
> I saw this with Intel E8400 :
> traps: runinstall2[24495] trap invalid opcode ip:7ff1d351b192
> sp:7fffca665b98 error:0 in libpthread.so.0[7ff1d350f000+17000]
Sorry to be late back to the party. This laptop is an Asus A54C, but the Asus site for the spec sheet doesn't work. It's fairly old, and although it is a 64-bit machine, it lacks the hardware characteristics for defining 64-bit VMs in VBox. I don't have anything on it I can boot at the moment to run HardDrake, but the sticker on the case says Intel Pentium inside.
Confirming now on a different machine with an Intel i5 M 560.
CC: (none) => doktor5000
CC: (none) => pterjan
Thomas, we are still crashing in ELIDE_UNLOCK() with an illegal instruction...
I've reverted http://rpm.org/gitweb?p=rpm.git;a=commitdiff;h=96e0cdf3 and that enables me to run the installer.
(In reply to Thierry Vignaud from comment #27)
> thomas we are still crashing in ELIDE_UNLOCK() with illegal instruction...
Yeah, both with and without elision we die in thread lock management... I have not yet managed to figure out why or where...
(In reply to Thierry Vignaud from comment #28)
> I've reverted http://rpm.org/gitweb?p=rpm.git;a=commitdiff;h=96e0cdf3
> it enables me to run the installer
Nice, so at least we may be able to get beta1 out with that workaround in place. I will try to find some time to figure this out.
For me, the problem is solved. Thanks.
Confirming the fix. I'll leave this open for the eventual rpm fix.
Thanks to Panu, we now have a real fix instead of just a workaround. Fixed in URPM-5.01
Status: NEW => RESOLVED
Resolution: (none) => FIXED
Source RPM: glibc => glibc, rpm, perl-URPM
Source RPM: glibc, rpm, perl-URPM => glibc