I have been using a Librem 5 for a long time. Since the beginning I have experienced unexpected freezes and hard reboots whenever the phone was tasked with something heavy (like accidentally opening the unusable Firefox while doing something else). I initially attributed this to the small amount of RAM and out-of-memory conditions, or maybe to a faulty MicroSD card, where I had placed an encrypted swap in addition to the ZRAM configured by default.
Some time ago a Linux kernel update arrived which dramatically improved performance and made these hard resets much rarer. As mentioned in this post, the kernel is now set “never to swap out a page that was touched in the last 1 second”. Maybe the nonstop swapping out of RAM pages and swapping them back in a split second later was what triggered these resets.
Nevertheless, I recently stumbled upon a way to reproducibly trigger a hardware reset. I was moving the Flatpak installation path to a MicroSD card as described here. I started removing apps from the “system” “installation” one by one and installing them to a new “installation” on the MicroSD. Each time I ran flatpak install ... there was a hard reset after some work was done.
By the way, the same behavior also occurred before the kernel update and before I added the new Flatpak “installation”. But back then I did not pay attention, because resets happened all the time.
After a few such reboots I created a cgroup, set read and write limits on the block device mounted from the MicroSD through the io.max mechanism, and started flatpak install ... again inside this cgroup. I also temporarily disabled swap on the MicroSD. This time it took quite a while, but it finally completed without hard resets.
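A minimal sketch of this kind of throttling, assuming cgroup v2 is mounted at /sys/fs/cgroup and the io controller is enabled for child groups. The function names, the cgroup name and the limit values are hypothetical and not the settings actually used in this post:

```shell
# Build one io.max line: "MAJ:MIN rbps=<bytes/s> wbps=<bytes/s>".
io_max_line() {
    printf '%s rbps=%s wbps=%s\n' "$1" "$2" "$3"
}

# Create a cgroup and cap read/write bandwidth for one block device.
# Must run as root. $1 = cgroup directory, $2 = whole-disk device node,
# $3 = read limit, $4 = write limit (both in bytes per second).
apply_io_limit() {
    cg=$1; dev=$2; rbps=$3; wbps=$4
    # lsblk prints the device numbers that io.max expects as its key.
    majmin=$(lsblk -dno MAJ:MIN "$dev" | tr -d ' ')
    mkdir -p "$cg"
    io_max_line "$majmin" "$rbps" "$wbps" > "$cg/io.max"
}
```

As root, something like apply_io_limit /sys/fs/cgroup/sd-throttle /dev/mmcblk0 5242880 5242880 would cap the (assumed) MicroSD device at about 5 MB/s in each direction. Writing the shell's PID into /sys/fs/cgroup/sd-throttle/cgroup.procs then makes everything started from that shell, including flatpak install ..., subject to the limit.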
Has anyone else experienced something like that? It is long past the warranty period, so I just want to know whether this is a universal problem with Librem 5 or maybe it is just a single device with faulty hardware.
Also, I am not sure whether the hard resets are really linked to high IO on the MicroSD. Maybe it is a high load on something else. Determining the exact cause will hopefully help me find out what exactly I should limit to completely avoid unexpected hard resets.
And the last question: does anyone know how to determine which hardware batch my Librem 5 is from? I am still unsure.
Have you looked at the log files when there are hard resets?
When you “temporarily disabled swap on the microSD”, did you disable swap completely? Did you also disable ZRAM (e.g. with “sudo swapoff -a”, which would disable /dev/zram too)?
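To make the distinction concrete, here is a small sketch that lists only the active swap areas that are not ZRAM devices, so that they can be turned off one by one while ZRAM stays enabled. It assumes the usual /proc/swaps layout (one header line, device path in the first column); the function name is made up for illustration:

```shell
# Print active swap areas that are not zram devices.
# $1 = file in /proc/swaps format (defaults to /proc/swaps itself).
non_zram_swaps() {
    awk 'NR > 1 && $1 !~ /^\/dev\/zram/ { print $1 }' "${1:-/proc/swaps}"
}
```

Running sudo swapoff on each path printed by non_zram_swaps disables only those areas, whereas sudo swapoff -a disables every active swap area, ZRAM included.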
There are a lot of redpine_91x errors throughout the logs: 5663 of them in six and a half hours (approx. 14.5 per minute).
20:30:37 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:37 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:46 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:46 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:46 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:47 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:47 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:48 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:48 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:54 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:54 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:57 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:57 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:59 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:30:59 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:31:00 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:31:08 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
20:31:08 pureos kernel: redpine_91x: Packet Dropped as Key ID not matched with both current and previous Key ID
There were also 9 instances of the following error:
22:21:46 pureos kernel: redpine_91x: Packet Dropped as RX PN is less than last received PN
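The rate quoted above follows from 5663 / (6.5 × 60) ≈ 14.5. A minimal sketch for reproducing such a count and rate, assuming the kernel messages are piped in as text (for example from journalctl -k); both function names are hypothetical helpers:

```shell
# Count the lines on standard input that contain the fixed string $1.
count_matches() {
    grep -cF -- "$1"
}

# Events per minute: $1 = number of events, $2 = time span in hours.
per_minute() {
    awk -v n="$1" -v h="$2" 'BEGIN { printf "%.1f\n", n / (h * 60) }'
}
```

For example, journalctl -k | count_matches 'redpine_91x: Packet Dropped as Key ID' gives the number of such messages in the selected log, and per_minute 5663 6.5 prints 14.5.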
There were 6 boots that were not preceded by the expected pureos systemd-journald[472]: Journal stopped message. 4 of them had messages like this right before (or several seconds before) the sudden reboot.
It looks like you’ve got 6GB of ZRAM. That’s probably (assuming you have a non-Liberty version) about 2 times the actual RAM. If the stuff in RAM is not very compressible, that can result in over-commitment, which could lead to a panic. I’m assuming Purism set up the ZRAM. However, I will note that on most systems the default ZRAM is 1/2 the actual RAM total rather than 2 times the RAM total. [Edit: Are you sure you didn’t touch ZRAM? I should note that according to the defaults in the merge request “Enable zram using systemd-zram-generator (!312)” (Librem5 / librem5-base on GitLab), when Purism enabled ZRAM it was also set to 1/2 the actual RAM.]
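A quick way to check that ratio on a device, sketched under the assumption that the ZRAM device is zram0 and that its configured size can be read in bytes from /sys/block/zram0/disksize. The function name is hypothetical:

```shell
# Print configured zram size divided by total RAM.
# $1 = zram disksize file (bytes), $2 = meminfo file (MemTotal in kB).
zram_to_ram_ratio() {
    disksize_file=${1:-/sys/block/zram0/disksize}
    meminfo_file=${2:-/proc/meminfo}
    zram_bytes=$(cat "$disksize_file")
    mem_kib=$(awk '/^MemTotal:/ { print $2 }' "$meminfo_file")
    awk -v z="$zram_bytes" -v m="$mem_kib" 'BEGIN { printf "%.2f\n", z / (m * 1024) }'
}
```

Called without arguments it reads the live system. A result near 0.50 would match the 1/2-of-RAM default mentioned above, while a result near 2.00 would match the 2-times situation.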
Also: Are you sure you didn’t temporarily disable the ZRAM? You said that you “temporarily disabled swap on the MicroSD”. If you disabled swap with a “sudo swapoff -a” instead of directly specifying the device … that would have disabled ZRAM.
In regard to logs, I was mainly looking for kernel panics or OOM killer messages. It sounds like you didn’t encounter kernel panics in your most recent freezes. I will say that, other than with broken hardware (e.g. RAM), I haven’t had a kernel panic outside of OOM issues, so a kernel panic is something that should generally be tracked down.
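A minimal sketch of that kind of log search, assuming a persistent journal so that the previous boot is still available. The patterns are the usual kernel wordings ("Out of memory", "invoked oom-killer", "Kernel panic"), and the function name is made up for illustration:

```shell
# Keep only the lines on standard input that look like OOM killer
# activity or a kernel panic (case-insensitive match).
oom_or_panic_lines() {
    grep -iE 'out of memory|oom-killer|kernel panic'
}
```

For example, journalctl -k -b -1 | oom_or_panic_lines filters the kernel messages of the previous boot. An empty result does not rule out a panic, since the last messages before a hard reset may never reach the disk.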
My main speculation is that the way flatpak deals with packages results in a problem with ZRAM over-commitment. Flatpak packages are compressed and, as a result, will not compress further when in memory. What I don’t know is whether flatpak will try to load a full compressed package into memory (I doubt it, but I don’t know the inner workings of ostree). This would result in issues with ZRAM on.
I’ve had very bad luck with the OOM killer. I’ve had it take out crucial systems even when there was plenty of swap left (e.g. 25GB). And one other time, while it wasn’t a kernel panic, I had my system freeze to the point where not even REISUB did anything.
Yes, at least none were recorded in the logs. It is possible that there was a kernel panic which did not get a chance to be written to the log file on the filesystem.
Yes, I also think that swap on an SD card is unreliable. When I added it, the phone was mostly unusable for me because of low RAM, which supposedly provoked an unending cycle of swapping out and back in. It was an act of despair. The new kernel update helped a lot, so I’ll try disabling swap on the SD card for some time and see whether there are any new hard resets. Maybe there was memory corruption before I disabled swap on the SD card.
Also, I have just tried another thing which previously always triggered hard resets after a short time: connecting a display through a hub. Of course, I did this after disabling swap on the SD card in /etc/fstab and rebooting. Now it is miraculously working as intended. It is something I had been trying and failing to do for years.