Bug 23049

Summary: Celestia crashes as soon as it launches
Product: Mageia
Reporter: Len Lawrence <tarazed25>
Component: RPM Packages
Assignee: Shlomi Fish <shlomif>
Status: RESOLVED WONTFIX
QA Contact:
Severity: normal
Priority: Normal
CC: marja11
Version: 6
Target Milestone: ---
Hardware: All
OS: Linux
Whiteboard:
Source RPM: celestia-1.6.1-18.mga6.src.rpm
CVE:
Status comment:
Attachments: gdb output from celestia session

Description Len Lawrence 2018-05-18 10:49:55 CEST
Description of problem:
Invoking celestia from the menus or the command line causes the Celestia logo to be displayed momentarily before the crash.  When launched from the command line it prints:
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
*** Cannot find the double-buffered visual.
*** Trying single-buffered visual.
*** No appropriate OpenGL-capable visual found.

There is nothing relevant in dmesg or journalctl that I can find.
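
The "libGL error:" lines are printed by Mesa's libGL, and swrast is Mesa's software rasterizer, so on an nvidia system they may hint that the process picked up Mesa's libGL rather than NVIDIA's. A minimal sketch for checking this follows. The helper names has_swrast_failure and gl_libs_for are hypothetical, made up for illustration, and the result is only a hint, not a diagnosis.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical and are not part of
# celestia, Mesa or Mageia.

# Succeeds if the captured output ($1) contains the swrast load failure
# quoted in this report.
has_swrast_failure() {
    printf '%s\n' "$1" | grep -q 'libGL error: failed to load driver: swrast'
}

# Prints the libGL libraries that the dynamic linker resolves for a binary ($1).
gl_libs_for() {
    ldd "$1" 2>/dev/null | grep -i 'libGL'
}

# Example use on an affected system (commented out so that sourcing this
# file does not launch the application):
#   output=$(celestia 2>&1)
#   has_swrast_failure "$output" && gl_libs_for "$(command -v celestia)"
```

If the path printed by gl_libs_for belongs to a Mesa package (rpm -qf on that path would show the owner), that would be consistent with the wrong GL library being selected, though other causes remain possible.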

Version-Release number of selected component (if applicable):
celestia-1.6.1-18

How reproducible:
It happens every time.  This has been a persistent problem throughout the development of Mageia 6.

Steps to Reproduce:
1. Run from the command line: $ celestia
2. Observe the momentary appearance of the logo.
3. Observe the failure messages quoted above.
Comment 1 Len Lawrence 2018-05-18 11:01:55 CEST
It runs fine in Cauldron as far as launch anyway.  That is version 1.6.1-19, so I guess all we need is a backport.
Comment 2 Marja Van Waes 2018-05-19 08:49:39 CEST
Assigning to the registered maintainer.

Assignee: bugsquad => shlomif
CC: (none) => marja11

Comment 3 Shlomi Fish 2018-05-19 10:19:00 CEST
(In reply to Len Lawrence from comment #1)
> It runs fine in Cauldron as far as launch anyway.  That is version 1.6.1-19,
> so I guess all we need is a backport.

The problem is that the Cauldron version was updated to 1.7.0-pre git and had many other changes by joequant and others, so I am unwilling to backport it as is.
Comment 4 Shlomi Fish 2018-05-19 10:35:06 CEST
Running celestia from the cmd line in an mga 6 x86-64 updates-testing vbox VM works perfectly fine here.
Comment 5 Shlomi Fish 2018-05-19 10:48:25 CEST
Len: does it also happen for a newly created user running IceWM?

Status: NEW => ASSIGNED

Comment 6 Len Lawrence 2018-05-19 11:09:08 CEST
Re comments 4 & 5:
Thanks for the quick response Shlomi.
I had not tried vbox.  I shall try IceWM, with as barebones an environment as possible, when I have time.
Re comment 3:
Backporting is no big issue for me - lately I had only invoked it as a test of graphics in QA.  Other users can go virtual if necessary or switch to Cauldron or Mageia 7 when it arrives.

Later.
Comment 7 Len Lawrence 2018-05-19 11:33:29 CEST
Tried the IceWM new user test and celestia still crashes.
Comment 8 Shlomi Fish 2018-05-19 12:39:46 CEST
(In reply to Len Lawrence from comment #7)
> Tried the IceWM new user test and celestia still crashes.

I see. Can you get us a gdb stack trace? Also, which hardware and drivers are you using?
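
For reference, a stack trace does not necessarily require rebuilding the package: gdb can run the installed binary in batch mode, and the output becomes more readable once the matching debuginfo packages are installed. The following is a minimal sketch, assuming gdb is installed. The function name gdb_exit_backtrace and the GDB override variable are hypothetical and exist only for illustration. If the program exits on its own instead of dying on a signal, a plain "bt" after "run" reports no stack, so the sketch stops at exit() first.

```shell
#!/bin/sh
# Sketch only: gdb_exit_backtrace and the GDB variable are hypothetical names.

# Runs the binary given as $1 under gdb in batch mode, stops when the
# program calls exit(), and prints a backtrace from that point.
# Setting GDB=echo prints the arguments instead of running gdb (dry run).
gdb_exit_backtrace() {
    "${GDB:-gdb}" -batch \
        -ex 'set breakpoint pending on' \
        -ex 'break exit' \
        -ex 'run' \
        -ex 'bt' \
        --args "$1"
}

# Example use (commented out so that sourcing this file does not start gdb):
#   gdb_exit_backtrace /usr/bin/celestia > celestia-gdb.txt 2>&1
```

For a real crash (for example SIGSEGV) the "break exit" line is unnecessary, because gdb stops at the faulting instruction and "bt" shows the stack there.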
Comment 9 Len Lawrence 2018-05-20 01:50:32 CEST
This is the system the bug was reported on:

System:    Host: difda Kernel: 4.14.40-desktop-1.mga6 x86_64 (64 bit)
           Desktop: MATE 1.18.0  Distro: Mageia 6 mga6
CPU:       Quad core Intel Core i7-4790 (-HT-MCP-) speed/max: 3812/4000 MHz
Machine:   Device: desktop Mobo: MSI model: Z97-G43 (MS-7816) v: 3.0
           UEFI: American Megatrends v: V17.8 date: 12/24/2014
Graphics:  Card: NVIDIA GM204 [GeForce GTX 970]
           Display Server: Mageia X.org 119.5 drivers: nvidia,v4l
           Resolution: 3840x2160@30.00hz
           GLX Renderer: GeForce GTX 970/PCIe/SSE2
           GLX Version: 4.6.0 NVIDIA 390.59
RAM:       31.37 GB

gdb stack trace - hmm.  That means pulling the source and recompiling doesn't it?
I would have no idea how to specify the compile command.

mgarepo has delivered revision 1230563.

I am looking at celestia.spec and the man page for gcc.  -ggdb stands out.
Line 80 starts with:
%configure2_5x --with-gtk \

Is it sufficient to add -ggdb there?

Sorry to be such a pest.
Comment 10 Len Lawrence 2018-05-20 08:25:21 CEST
Continued experimenting and found that -g is all that is needed.  Added the missing development packages for celestia and lua5.2 and rebuilt the application.  Installed the rpm and ran celestia under gdb.  This time there were no apparent problems: celestia launched OK, and I selected Saturn, centred on it and hit Goto.  That worked fine.  Appending part of the gdb output.

So where do we go from here?
Comment 11 Len Lawrence 2018-05-20 08:32:13 CEST
Created attachment 10174 [details]
gdb output from celestia session

On exit there was a notice about missing debuginfo files:

[Inferior 1 (process 29269) exited normally]
Missing separate debuginfos, use: debuginfo-install gvfs-1.32.1-1.mga6.x86_64 lib64atk1.0_0-2.24.0-1.mga6.x86_64 lib64blkid1-2.28.2-2.1.mga6.x86_64 .........
Comment 12 Len Lawrence 2018-05-20 08:58:56 CEST
Rebuilt celestia without debugging, reinstalled and found that it launched without any problems in Mageia 6.

The only difference between this and the release version is the presence of the two development packages.  ??
Comment 13 Len Lawrence 2018-05-20 09:11:24 CEST
Back to square one; installed the official release and ran it under gdb.
Session extract follows:

Reading symbols from /home/lcl/celestia...(no debugging symbols found)...done.
(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/celestia 
Missing separate debuginfos, use: debuginfo-install glibc-2.22-28.mga6.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffe5c62700 (LWP 24725)]
[New Thread 0x7fffe5461700 (LWP 24726)]
[New Thread 0x7fffe4c60700 (LWP 24727)]
[New Thread 0x7fffe445f700 (LWP 24728)]
[New Thread 0x7fffe3c5e700 (LWP 24729)]
[New Thread 0x7fffe345d700 (LWP 24730)]
[New Thread 0x7fffe2c5c700 (LWP 24731)]
[New Thread 0x7fffe245b700 (LWP 24732)]
libGL error: No matching fbConfigs or visuals found
[Thread 0x7fffe245b700 (LWP 24732) exited]
[Thread 0x7fffe2c5c700 (LWP 24731) exited]
[Thread 0x7fffe345d700 (LWP 24730) exited]
[Thread 0x7fffe3c5e700 (LWP 24729) exited]
[Thread 0x7fffe445f700 (LWP 24728) exited]
[Thread 0x7fffe4c60700 (LWP 24727) exited]
[Thread 0x7fffe5461700 (LWP 24726) exited]
[Thread 0x7fffe5c62700 (LWP 24725) exited]
libGL error: failed to load driver: swrast
*** Cannot find the double-buffered visual.
*** Trying single-buffered visual.
*** No appropriate OpenGL-capable visual found.
[Inferior 1 (process 24720) exited with code 01]
Missing separate debuginfos, use: debuginfo-install lib64atk1.0_0-2.24.0-1.mga6.
Comment 14 Len Lawrence 2018-05-20 10:29:22 CEST
Re comment 12:

Ignore "The only difference between this ......."  That remark does not make sense.
Comment 15 Len Lawrence 2018-05-20 23:58:35 CEST
One last thought.  Could there be a mismatch in mesa support?  There have been some updated mesa packages in QA lately which I think I have installed and this would mean that the release version would differ from my local build if mesa is important.  Does swrast come from mesa?  There is something like a 12KB difference in size for the two RPMs.  This may require a test on another machine, pre mesa updates.
Comment 16 Shlomi Fish 2018-05-22 12:05:02 CEST
(In reply to Len Lawrence from comment #15)
> One last thought.  Could there be a mismatch in mesa support?  There have
> been some updated mesa packages in QA lately which I think I have installed
> and this would mean that the release version would differ from my local
> build if mesa is important.  Does swrast come from mesa?  There is something
> like a 12KB difference in size for the two RPMs.  This may require a test on
> another machine, pre mesa updates.

I am tempted to blame the gpu drivers here. Which ones are you using?
Comment 17 Len Lawrence 2018-05-22 20:04:09 CEST
How would I identify them?  nvidia for a start, but what else?
Comment 18 Shlomi Fish 2018-05-22 21:36:49 CEST
(In reply to Len Lawrence from comment #17)
> How would I identify them?  nvidia for a start, but what else?

see https://askubuntu.com/questions/23238/how-can-i-find-what-video-driver-is-in-use-on-my-system
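
In short, the kernel driver bound to the graphics card can be read from the output of "lspci -k". A minimal sketch follows. The function name gpu_driver_in_use is hypothetical, and the filter assumes the usual lspci layout in which a device line is followed by indented detail lines.

```shell
#!/bin/sh
# Sketch only: gpu_driver_in_use is a hypothetical helper name.

# Reads "lspci -k" output on stdin and prints the kernel driver in use
# for every VGA or 3D controller found.
gpu_driver_in_use() {
    awk '
        /VGA compatible controller|3D controller/ { gpu = 1; next }
        /^[^ \t]/ { gpu = 0 }   # any other device line ends the GPU block
        gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use: /, "")
            print
        }
    '
}

# Example use (commented out so that sourcing this file runs nothing):
#   lspci -k | gpu_driver_in_use
#   glxinfo | grep 'OpenGL vendor string'   # userspace GL vendor, if glxinfo is installed
```

The kernel driver and the userspace GL library are separate pieces, so it is worth checking both: the first command reports the kernel module, the second what the GL client library reports.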
Comment 19 Len Lawrence 2018-05-23 00:27:45 CEST
That comes up with nvidia, which we already knew about.  And the modules loaded are nvidia and nvidia_modeset.  The problem has persisted over a number of versions up to the current 390.59.

A rebuild on Mageia 6 with older mesa software may still be worth trying.  
$ urpmf /bin/celestia
    http://distrib-coffee.ipsl.jussieu.fr/pub/linux/Mageia/distrib/6/x86_64/media/core/release/media_info/20170714-192548-files.xml.lzma
celestia:/usr/bin/celestia                                                     
    http://distrib-coffee.ipsl.jussieu.fr/pub/linux/Mageia/distrib/6/i586/media/core/release/media_info/20170714-192023-files.xml.lzma
celestia:/usr/bin/celestia                                                     

Don't know if that is significant.  All I know is that the problem disappears with my recent rebuild.
Comment 20 Shlomi Fish 2018-05-23 09:31:11 CEST
(In reply to Len Lawrence from comment #19)
> That comes up with nvidia, which we already knew about.  And the modules
> loaded are nvidia and nvidia_modeset.  The problem has persisted over a
> number of versions up to the current 390.59.
> 
> A rebuild on Mageia 6 with older mesa software may still be worth trying.  
> $ urpmf /bin/celestia
>    
> http://distrib-coffee.ipsl.jussieu.fr/pub/linux/Mageia/distrib/6/x86_64/
> media/core/release/media_info/20170714-192548-files.xml.lzma
> celestia:/usr/bin/celestia                                                  
> 
>    
> http://distrib-coffee.ipsl.jussieu.fr/pub/linux/Mageia/distrib/6/i586/media/
> core/release/media_info/20170714-192023-files.xml.lzma
> celestia:/usr/bin/celestia                                                  
> 
> 
> Don't know if that is significant.  All I know is that the problem
> disappears with my recent rebuild.

Dear Len,

Frankly, your comments are hard to follow. Perhaps prepare a document on a paste site or similar that gathers all the information you have, instead of scattering bits and fragments across your comments. Furthermore, note that the proprietary nvidia drivers that you are using may be to blame for the problem - I nicknamed them "hang-vidia" - see https://www.youtube.com/watch?v=IVpOyKCNZYw .
Comment 21 Len Lawrence 2018-05-23 10:00:57 CEST
Thanks Shlomi.  I think I shall drop this given that I can  rebuild celestia if I need it on mga6 and don't have much time to spare because of QA work.  And I don't even know what a paste site is.

Apologies if I sounded incoherent - the truth is I have no idea how to investigate problems like these.  It is mostly shooting in the dark, trying to guess at causes.  You could well be right about the nvidia driver.

We should close this bug, status WONTFIX maybe.

Thanks for your attention.
Comment 22 Shlomi Fish 2018-05-23 10:15:10 CEST
(In reply to Len Lawrence from comment #21)
> Thanks Shlomi.  I think I shall drop this given that I can  rebuild celestia
> if I need it on mga6 and don't have much time to spare because of QA work. 
> And I don't even know what a paste site is.
> 

see http://paste.debian.net/ and https://en.wikipedia.org/wiki/Pastebin .

> Apologies if I sounded incoherent - the truth is I have no idea how to
> investigate problems like these.  It is mostly shooting in the dark, trying
> to guess at causes.  You could well be right about the nvidia driver.
> 
> We should close this bug, status WONTFIX maybe.
> 

OK, will do.

> Thanks for your attention.

Resolution: (none) => WONTFIX
Status: ASSIGNED => RESOLVED