Phosh often crashes with SIGSEGV when Librem 5 is connected to an external display

Hi,

I am currently trying to make the Librem 5 work reliably with an external display. There appear to be numerous problems which can make this feature somewhat unusable. Other problems I have stumbled upon are described in "Librem 5: Unexpected hard resets when on high load" and "How to preconfigure default settings for external display?".

This time I noticed that Phosh often crashes with SIGSEGV when an external display is connected. As this seems like (mostly) a software problem, I am posting it in this forum branch. Here are some logs taken with journalctl -S 18:00 -U 18:44 --grep 'phoc|phosh':

Jun 21 18:06:53 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:09:09 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:09:10 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:09:10 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:17:04 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:17:05 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:17:06 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:17:06 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:30:30 pureos kernel: etnaviv-gpu 38000000.gpu: offending task: phoc (/usr/bin/phoc -S -C /usr/share/phosh/phoc.ini -E bash -lc 'gnome-session --disable-acceleration-check --session=phosh --builtin')
Jun 21 18:38:57 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:38:57 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:38:57 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:38:57 pureos phosh[1343]: phosh_monitor_is_configured: assertion 'PHOSH_IS_MONITOR (self)' failed
Jun 21 18:42:37 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:42:37 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:42:37 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:42:39 pureos phosh[1343]: phosh_monitor_is_configured: assertion 'PHOSH_IS_MONITOR (self)' failed
Jun 21 18:42:49 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:42:49 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:42:49 pureos phoc[1024]: invalid unclassed pointer in cast to 'PhocOutput'
Jun 21 18:42:50 pureos phosh[1343]: phosh_monitor_is_configured: assertion 'PHOSH_IS_MONITOR (self)' failed
Jun 21 18:42:55 pureos systemd[1]: phosh.service: Main process exited, code=killed, status=11/SEGV
Jun 21 18:42:55 pureos systemd[1]: phosh.service: Failed with result 'signal'.
Jun 21 18:43:00 pureos systemd[1]: phosh.service: Scheduled restart job, restart counter is at 1.
Jun 21 18:43:00 pureos systemd[1]: Stopped Phosh, a shell for mobile phones.
Jun 21 18:43:01 pureos systemd[1]: Started Phosh, a shell for mobile phones.

It happens on PureOS Byzantium. Is there anyone else who experiences this? Is it a bug in Phosh which appears indiscriminately, or is it triggered by an unfortunate combination of hardware?

I have also tried running memtester to check the RAM, but it reports that everything is fine.

3 Likes

Is it a bug in Phosh which appears indiscriminately, or is it triggered by an unfortunate combination of hardware?

If there is a SIGSEGV crash then there is no excuse; that should not happen on any hardware. Please try to hold on to the setup you have and note precisely how you can reproduce the crash, so that the bug(s) can be found and fixed.

I think one thing you could do to get more info about what is happening is to run phosh through the gdb debugger and make it print a stacktrace when the crash happens.

I am not sure exactly how, but I think it could be done either by starting phosh via gdb from the command line, or else by editing the script (or the .service file) from which the phosh executable is started.

Then, instead of only the line saying phosh.service: Main process exited, code=killed, status=11/SEGV you would get a whole stacktrace showing at which place in the source code the crash happened.
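
As a rough illustration of the gdb idea, here is a minimal sketch that attaches gdb to an already running process instead of starting phosh under it. It assumes gdb is installed, that you are allowed to attach to the process (same user or root), and that you run it from a separate session such as ssh. The function name attach_for_backtrace and the DRY_RUN variable are made up for this example. Without debug symbols the backtrace will mostly show raw addresses, and gdb may also stop on some other signal before the SIGSEGV arrives.

```shell
# Hypothetical helper: attach gdb to a running process, let it continue,
# and print a backtrace of all threads once it stops on a signal
# (for example SIGSEGV). Set DRY_RUN=1 to only print the command.
attach_for_backtrace() {
    pid="$1"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "gdb -batch -p $pid -ex continue -ex 'thread apply all bt'"
    else
        # -batch: exit after the commands have run
        # -p:     attach to the given process ID
        # -ex:    run a gdb command; 'continue' blocks until the process stops
        gdb -batch -p "$pid" -ex continue -ex 'thread apply all bt'
    fi
}
```

It could then be used as attach_for_backtrace "$(pgrep -o -x phosh)" 2>&1 | tee phosh-backtrace.txt, which keeps a copy of the output in a file (the file name is arbitrary).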

Pinging @guido.gunther in case he has time to help.

3 Likes

Well, my question was more about whether I am the only one suffering, but this is also valid. :slightly_smiling_face:

I have just tried running systemctl restart phosh, but it did not start again, resulting in a black screen with a glowing backlight. So I would appreciate any help with starting phosh manually. Maybe there is a way to do it from tmux over ssh?

Also, is there a way to put an edited systemd unit file somewhere so that it does not survive a reboot? It would be a disaster if I wrote a broken phosh unit file and had absolutely no user interface after a reboot.

1 Like

It would be better to check with the latest version of Phosh. In the case of Crimson you could try the backports.

2 Likes

I hesitate to flash my Librem 5 to Crimson, at least until it is out of testing, because this phone is my daily driver. Also, I don’t expect a stable release of Crimson for at least 3 months. Maybe the stable release will be next year, who knows. In any case, the current stable is Byzantium, and thus I feel that is where the bug should be tracked down and fixed.

But I agree, testing it with newer versions of Phosh would be interesting.

3 Likes

I am not sure whether it is related, but I see various GUI glitches from time to time even without an external display connected. Here are some screenshots I managed to take. Yes, the glitches are visible in screenshots and in Phosh’s window previews.


2 Likes

Sometimes just installing GNU gdb on the system magically fixes a SIGSEGV issue. You can try installing GNU gdb.

1 Like

Well, gdb was already installed. :slightly_smiling_face:

Surprisingly, the issue was in a malfunctioning display… In two malfunctioning old displays I tried to connect the phone to, actually. :sweat_smile:

Interestingly, both displays worked fine with the same USB hub and other devices. But one of these displays (with “Manufactured in 2007” written on its back) has finally died, i.e. it shows no image while still being detected as if it were working properly. I bought a replacement, and to my surprise the new display works fine with my phone when connected through the USB hub.

I think this old malfunctioning display served well as a fuzzing tool for phosh. :laughing:

By the way, the USB hub is “Hoco HB37 6-in-1 Multiport Adapter”.

2 Likes

Makes sense, as the log showed an issue in Phoc and Etnaviv which causes Phosh to crash.
Glad you fixed it.

1 Like

Only if the bug actually gets fixed. :wink: Now that the old display has died, it may be difficult for a developer to reproduce the problem, and hence I wonder whether it will get fixed. Time will tell.

2 Likes

Is it fixed in Byz?

1 Like