Librem 14 sudden crash when unplugged

Can you tell me your firmware version?

sudo dmidecode | grep -m 1 Version:

Version: PureBoot-Release-17.1

Thanks for the advice. I will check this log immediately after rebooting the next time this happens.

Incidentally, I’ve found the fans have calmed down a lot since I ended the tracker-extract search indexing process. However, this was only shown as using 8% of the CPU so the issue will probably return.

Can you also tell me the version of the EC firmware that you have:

Open a terminal in the directory where you downloaded the files (or move the files to your home folder and open a terminal there), then extract the downloaded archive:

  • Run the terminal command: gzip -d purism_ectool.gz

Make the update tool purism_ectool executable:

  • Run the command: sudo chmod +x purism_ectool

sudo ./purism_ectool info

It should read: YYYY-MM-DD_commit-hash

Example:

version: 2021-06-04_ef9fd3c

It would seem it is an older version than the example you gave. Here is my output:

version: 2021-05-25_3b5ef1e
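Since the version string is just a build date plus a commit hash, comparing two EC builds comes down to comparing the dates. Here is a minimal sketch of that check; the function names ec_version_date and ec_is_older are made up for illustration and are not part of purism_ectool:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of purism_ectool.

# Print the YYYY-MM-DD part of a "version: YYYY-MM-DD_commit-hash" line.
ec_version_date() {
    printf '%s\n' "$1" |
        sed -n 's/.*version: \([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)_.*/\1/p'
}

# Succeed (exit 0) if the first version string is older than the second.
# With the dashes removed, the dates compare correctly as integers.
ec_is_older() {
    a=$(ec_version_date "$1" | sed 's/-//g')
    b=$(ec_version_date "$2" | sed 's/-//g')
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}
```

For example, ec_is_older 'version: 2021-05-25_3b5ef1e' 'version: 2021-06-04_ef9fd3c' succeeds, which matches the observation that the 05-25 build is the older one.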

On the Librem 5 it looks like there is no /var/log/syslog file. I just posted a separate question about that:

I’ve had the inverse of this happen twice now: plugging the USB-C charger into the right-side USB-C port while the charger was not plugged into the wall resulted in an immediate crash. Running PureBoot 18 and EC version 2021-05-25_3b5ef1e.

Do you know how to get rid of this annoying stuff? I have the same problem, too.

Well, looks like I’m at least not alone with this =). I just received my shiny new L14 today and am dipping my toes into having a full Linux workhorse (fed up with Apple, and Windows is even worse). The problem is that it has already rebooted randomly twice in the 6 hours I’ve had it. It is quite warm today (around 30°C/86°F), but I wasn’t doing anything that would put even a remote load on the system. Sadly I’m writing this from my Mac (again…), as I’m now worried that the L14 will reboot in the middle of me writing this.

I just did a PureBoot BIOS update to v18 (the latest one) due to the headphone jack issue, but the first crash happened on the version it was delivered with, second one with the new version, so that doesn’t seem to make a difference.

Hope this gets fixed soon, can’t really rely on a machine if it randomly turns off without any warning and I would most certainly lose work & data.

PS: Don’t worry, not giving up that quickly on the L14, but a non-working headphone jack and random crashes/shutdowns are probably not to be considered the “ideal” experience for people to get into Purism/Linux

PPS: That’s actually interesting: I had it charged to about 86% at first (before turning it on the first time), but then used it unplugged ever since. So yes, the crashes happened while unplugged. I’ll now keep it plugged in and report back if it crashes again! I will also do the BIOS/EC version extraction @joao.azevedo suggested and report back.

Update: Erm, I’m actually unable to boot. For some reason the L14 BIOS/system clock (or whatever clock the Librem Key and PureBoot use) was reset to 0 (i.e. 1/1/1970), which causes the key verification to fail. It says something along the lines of “invalid signature, key was created in the future”. Of course, it was created today =), but the system thinks it’s 1970 now, so it will only boot into the recovery shell. I tried turning HOTP verification off, which works, so I can get into the system, but now without the proper checks, which kinda defeats the purpose…

You can use the date command to reset the clock to the correct time:

From the recovery shell do:

date -s '2021-07-13 16:53:00'

Adjust the date, hour, minutes, and seconds to the current time in your timezone, and then run the command:

hwclock -w

And then reboot
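If you want to check quickly whether the clock has fallen back to the epoch before touching anything, a sketch like the one below could help. It assumes the shell's date supports +%s; the name clock_looks_reset and the 2021-01-01 cutoff are arbitrary choices for illustration:

```shell
#!/bin/sh
# Sketch only: the function name and the cutoff are illustrative assumptions.

# 2021-01-01 00:00:00 UTC as seconds since the epoch.
MIN_SANE_EPOCH=1609459200

# Succeed (exit 0) if the given epoch time, or the current time when no
# argument is passed, is earlier than the cutoff.
clock_looks_reset() {
    now=${1:-$(date +%s)}
    [ "$now" -lt "$MIN_SANE_EPOCH" ]
}
```

Usage would be something like: clock_looks_reset && echo "clock looks reset, set it with date -s and hwclock -w".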

thx @joao.azevedo that fixed it, now able to boot with the librem key again!

Little update: I just had another crash after more than a day without any. Most of that time was spent on the power brick; maybe 1-2 hours ago I unplugged it, and then it crashed in spectacular fashion, have a look:

Press the power button for a few seconds, the device turns off, then start it again and hope nothing broke =)… so far it didn’t. When I plugged it back in (before restarting it, now knowing this is definitely the problem) it started charging at 51%. The ambient temperature is definitely not an issue today, as it is cool.

So based on my experience so far I can confirm that this happens only when unplugged (like OP stated in the title)

@joao.azevedo I did run the purism_ectool, here’s the output:

board: purism/librem_14 version: 2021-05-25_3b5ef1e

Firmware version is: Version: PureBoot-Release-18

Here is what I found right before the crash in /var/log/syslog:

`Jul 25 00:42:24 user kernel: [173043.591395] audit: type=1326 audit(1627166544.181:6743): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:24 user kernel: [173043.591398] audit: type=1326 audit(1627166544.181:6744): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.606651] binder_linux: 51921 RLIMIT_NICE not set
Jul 25 00:42:44 user kernel: [173063.606993] audit: type=1326 audit(1627166564.198:6745): auid=4294967295 uid=1000 gid=1000 ses=4294967295 subj==unconfined pid=50484 comm="Thread-40" exe="/system/bin/app_process64" sig=0 arch=c000003e syscall=141 compat=0 ip=0x7f7ed85446a7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.606995] audit: type=1326 audit(1627166564.198:6746): auid=4294967295 uid=1000 gid=1000 ses=4294967295 subj==unconfined pid=50484 comm="Thread-39" exe="/system/bin/app_process64" sig=0 arch=c000003e syscall=141 compat=0 ip=0x7f7ed85446a7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.607591] audit: type=1326 audit(1627166564.198:6747): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.607593] audit: type=1326 audit(1627166564.198:6748): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:47 user kernel: [173066.607437] audit: type=1326 audit(1627166567.198:6749): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:47 user kernel: [173066.607466] audit: type=1326 audit(1627166567.198:6750): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:43:04 user kernel: [173083.620268] binder_linux: 51921 RLIMIT_NICE not set
Jul 25 00:43:14 user kernel: [173093.632211] binder_linux: 51921 RLIMIT_NICE not set

---CRASH---

Jul 25 00:56:43 user kernel: [ 0.000000] Linux version 5.10.0-7-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.40-1 (2021-05-28)`

not sure if it is of any help… no idea what RLIMIT_NICE not set means =)
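For anyone who wants to pull the lines just before each boot out of a log without scrolling, here is a rough sketch. It assumes the first line of every boot contains the kernel "Linux version" banner at timestamp 0.000000, as in the excerpt above; the function name lines_before_boot is made up:

```shell
#!/bin/sh
# Sketch only: assumes each boot starts with the "Linux version" banner.

# Print the last N lines logged before each kernel boot banner.
# $1 = log file, $2 = number of lines to keep (default 10, must be >= 1).
lines_before_boot() {
    awk -v n="${2:-10}" '
        /kernel: \[ *0\.000000\] Linux version/ {
            # Boot banner reached: dump the ring buffer of earlier lines.
            start = (count > n) ? count - n : 0
            for (i = start; i < count; i++) print buf[i % n]
            print "--- boot ---"
            count = 0
            next
        }
        # Any other line: store it in a ring buffer of size n.
        { buf[count % n] = $0; count++ }
    ' "$1"
}
```

Something like lines_before_boot /var/log/syslog 20 would then show the 20 lines preceding every boot, including the boots that followed a crash. Note that a clean shutdown also ends with a boot banner, so not every block is a crash.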

Btw, I also had to use this command, which I found somewhere in the forums, to make the laptop charge at all. Otherwise it would just draw power from the plug without charging (i.e. the battery would stay at whatever percentage it had when plugged in): echo '70' | sudo tee -a /sys/class/power_supply/BAT0/charge_control_start_threshold
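To see what the thresholds are currently set to before changing them, a small read-only sketch like this might help. It assumes the battery exposes charge_control_end_threshold next to the start threshold, which may not be the case on every kernel; show_charge_thresholds is a made-up name:

```shell
#!/bin/sh
# Sketch only: the end-threshold attribute is assumed, not confirmed here.

# Print the charge thresholds found in a power_supply battery directory.
# $1 = battery directory (defaults to /sys/class/power_supply/BAT0).
show_charge_thresholds() {
    bat_dir=${1:-/sys/class/power_supply/BAT0}
    for name in charge_control_start_threshold charge_control_end_threshold; do
        if [ -r "$bat_dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$bat_dir/$name")"
        else
            printf '%s=unavailable\n' "$name"
        fi
    done
}
```

Running show_charge_thresholds with no argument only reads from sysfs and needs no root.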

I also see that there is a new EC version here: https://source.puri.sm/firmware/releases/-/commit/a27df1bb0220e33c918034a1dc0312c5f080e194#6d777e3978752766b2b98e999f4714dd6f8ab934

But how do I update it? Is there anywhere I can find instructions?

I do see that the ectool does have a flash function, but don’t want to brick anything, so help would be appreciated=):

SUBCOMMANDS:
    console
    fan
    flash
    flash_backup
    help            Prints this message or the help of the given subcommand(s)
    info
    keymap
    led_color
    led_mode
    led_save
    led_value
    matrix
    print

EDIT: Well, I went the danger route and did what’s discussed in the thread “[librem 14] how to update EC firmware to ec-2021-06-04_ef9fd3c”.
I simply ran sudo ./purism_ectool flash_backup ec-2021-06-04_ef9fd3c.rom, and it didn’t brick my L14 either.

But maybe don’t try this at home: if it’s not urgent, wait for an official “release” and instructions. I need to travel Monday and can’t really rely on a machine that crashes when unplugged, so I’m hoping the EC update will help. Don’t know yet, will report back!

Now sudo ./purism_ectool info shows:

board: purism/librem_14 version: 2021-06-04_ef9fd3c

UPDATE: Unfortunately bad news, laptop still crashes after EC update, although I managed to run it down to 15% this time, not sure if the charge percentage is relevant though.

That EC update does not address the crash when unplugging, or the crash when below 50% battery and under heavy load. The update for that is forthcoming.

Try switching from PureBoot to coreboot.
I had another, screen-related issue: my screen was blinking from time to time,
quite intensively with a compositing desktop environment like Plasma.
I switched to coreboot and have had no issues in the last 48 hours.

This issue is not related to the EC but to the way PureBoot initialises Linux memory; it breaks something in the Intel graphics area…

Both coreboot and PureBoot use the exact same code to init the display. I’ve never seen anything like that with GNOME or XFCE; I haven’t used Plasma.

And I posted this message with an uptime of 4 hours on battery :wink: and under quite a heavy load, so the battery is almost out of juice…

Not really.
Actually, a Linux kernel booted directly differs slightly from a kexec'ed one.
The first difference is intel_iommu=on on the command line.
kexec isn’t perfect: it skips some initialization steps
and can ignore or wrongly interpret the memory map (Intel graphics don’t have their own memory; they take it from system RAM).
The MTRR registers also differ.
Just try coreboot and you will see…

Thank you for the information. I am new to the Purism community.
Can you tell me where this update is normally announced (via forum/email)?

Looking forward to getting the update and getting rid of these sudden crashes.

Thank you!

@MrChromebox
Aug 19 00:10:13 Librem14 kernel: [ 1.123935] DMAR: DRHD: handling fault status reg 3
Aug 19 00:10:13 Librem14 kernel: [ 1.123939] DMAR: [DMA Read] Request device [00:02.0] PASID ffffffff fault addr 22c4134000 [fault reason 06] PTE Read access is not set
Aug 19 00:10:13 Librem14 kernel: [ 1.123943] DMAR: DRHD: handling fault status reg 3
Aug 19 00:10:13 Librem14 kernel: [ 1.123945] DMAR: [DMA Read] Request device [00:02.0] PASID ffffffff fault addr 6a6c349000 [fault reason 06] PTE Read access is not set
Aug 19 00:10:13 Librem14 kernel: [ 1.140991] DMAR: DRHD: handling fault status reg 3
Aug 19 00:10:13 Librem14 kernel: [ 1.140993] DMAR: [DMA Read] Request device [00:02.0] PASID ffffffff fault addr 22c4134000 [fault reason 06] PTE Read access is not set
Aug 19 00:10:13 Librem14 kernel: [ 1.157737] DMAR: DRHD: handling fault status reg 3

Aug 19 00:10:13 Librem14 kernel: [ 1.459247] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459259] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459267] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459286] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459313] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080

That is actually the reason for the crashes…
Later you can find something like this:
Aug 19 00:32:44 Librem14 kernel: [ 295.817882] i915 0000:00:02.0: [drm] ERROR CPU pipe A FIFO underrun

And later, if you are lucky:
Aug 19 00:35:32 Librem14 kernel: [ 463.878542] flashrom[4014]: segfault at 0 ip 00005587180eba2b sp 00007ffe1be2df80 error 4 in flashrom[5587180d8000+1e000]
Aug 19 00:35:32 Librem14 kernel: [ 463.878561] Code: 0f b7 c6 c7 44 24 0c 00 00 00 00 89 44 24 20 8b 4c 24 20 39 4c 24 0c 0f 83 2c 02 00 00 48 8d 43 04 48 39 c5 0f 86 1f 02 00 00 <0f> b6 43 01 3c 03 76 08 48 01 d8 48 39 c5 77 0c 48 8d 35 3b 9b 01

There is a problem with i915 (integrated Intel graphics): it does not have its own memory, but takes it from RAM.
A badly set DMAR can cause the graphics memory area to be allocated in user space,
and if a user (or a userland app) writes data there, boom, you have a system crash.

That’s all, folks.
It’s a problem with the Linux kernel (probably kexec, IOMMU, or i915): something is not being properly passed to the kexec'ed kernel.
So we have:
a memory region overlap that can cause memory corruption (in the best case in userspace, where a user program will lose data or crash; in the worse case the kernel will crash).
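If you want to check whether your own log shows the same pattern, a rough sketch like this counts the DMAR faults per requesting device and the i915 pipe fault errors. The patterns are taken from the log lines quoted above and may need adjusting for other kernel versions; both function names are made up:

```shell
#!/bin/sh
# Sketch only: patterns follow the log excerpts above, names are made up.

# Count DMAR fault lines per requesting PCI device.
# $1 = log file; prints "<device> <count>" for each device seen.
dmar_faults_by_device() {
    sed -n 's/.*DMAR: \[DMA [A-Za-z]*] Request device \[\([0-9a-f:.]*\)].*/\1/p' "$1" |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Count i915 "Fault errors on pipe" lines. $1 = log file.
i915_pipe_faults() {
    grep -c 'i915 .*Fault errors on pipe' "$1"
}
```

On the excerpt above, the only device reported by the first function would be 00:02.0, which is the integrated GPU that the i915 lines also refer to.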

@NineX: Thx a lot for your thorough analysis. Hope we get a fix soon.
My L14 crashes a lot when attached to the right USB-C port… It would be nice if I could operate my L14 over USB-C (charging, display, etc.)