Librem 14 sudden crash when unplugged

Well, looks like I’m at least not alone with this =)… Just received my shiny new L14 today and am dipping my toes into having a full Linux workhorse (fed up with Apple… and Windows is even worse). Problem is, it has already rebooted randomly twice in the 6 hours I’ve had it. It is quite warm today (maybe around 30°C/86°F), but I wasn’t doing anything that would put even remotely significant load on the system. Sadly, I’m writing this from my Mac (again…) as I’m now worried that the L14 will reboot in the middle of me writing this…

I just did a PureBoot BIOS update to v18 (the latest one) due to the headphone jack issue, but the first crash happened on the version it was delivered with and the second one on the new version, so that doesn’t seem to make a difference.

Hope this gets fixed soon. I can’t really rely on a machine that randomly turns off without any warning, as I would most certainly lose work & data.

PS: Don’t worry, not giving up that quickly on the L14, but a non-working headphone jack and random crashes/shutdowns are probably not to be considered the “ideal” experience for people to get into Purism/Linux

PPS: That’s actually interesting: I had it charged to about 86% at first (before turning it on the first time), but then used it unplugged ever since. So yes, the crashes happened unplugged. I’ll now keep it plugged in and will report back if it crashes again! Will also do the BIOS/EC version extraction @joao.azevedo suggested and report back.

Update: Erm, actually unable to boot. For some reason the L14 BIOS/system clock (or whatever clock the LibremKey & PureBoot use) was reset to 0 (i.e. 1/1/1970), which then causes the key verification to fail. It says something along the lines of “invalid signature, key was created in the future”. Of course it was created today =), but the system thinks it’s 1970 now… It will only boot into the recovery shell. I tried turning HOTP verification off, which works, so I can get into the system, but now without the proper checks, which kinda defeats the purpose…

You can use the date command to reset the clock to the correct time:

From the recovery shell do:

date -s '2021-07-13 16:53:00'

(adjust the date, hours, minutes and seconds to your current local time)

and then run the command:

hwclock -w

And then reboot
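
For anyone hitting the same 1970 reset: here is a minimal sketch, assuming a POSIX shell where `date +%s` is available, of checking whether the clock looks reset before running the commands above. The function name `clock_looks_reset` and the 2021-01-01 cutoff are just illustrative choices of mine, not anything PureBoot provides.

```shell
# Hypothetical helper: succeeds if the given epoch timestamp is implausibly old.
# The cutoff (1609459200 = 2021-01-01 00:00:00 UTC) is an assumption; pick any
# date that is safely before your key was created.
clock_looks_reset() {
    cutoff=1609459200
    [ "$1" -lt "$cutoff" ]
}
```

You could call it as `clock_looks_reset "$(date +%s)" && echo "clock was reset, fix it with date -s and hwclock -w"`.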

thx @joao.azevedo that fixed it, now able to boot with the librem key again!

Little update: just had another crash after more than a day without any, although most of that time was spent ON the power brick. I unplugged it maybe 1-2 hours ago, and then it crashed in spectacular fashion, have a look:

Press the power button for a few seconds, the device turns off, then start it again and hope nothing broke =)… so far nothing did. When I plugged it back in (before restarting it, now knowing this is def. the problem) it started charging at 51%. Environment temp. is def. not an issue today as it is cool.

So based on my experience so far I can confirm that this happens only when unplugged (like OP stated in the title)

@joao.azevedo I did run the purism_ectool, here’s the output:

board: purism/librem_14 version: 2021-05-25_3b5ef1e

Firmware version is: Version: PureBoot-Release-18

Here is what I found right before the crash in /var/log/syslog:

`Jul 25 00:42:24 user kernel: [173043.591395] audit: type=1326 audit(1627166544.181:6743): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:24 user kernel: [173043.591398] audit: type=1326 audit(1627166544.181:6744): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.606651] binder_linux: 51921 RLIMIT_NICE not set
Jul 25 00:42:44 user kernel: [173063.606993] audit: type=1326 audit(1627166564.198:6745): auid=4294967295 uid=1000 gid=1000 ses=4294967295 subj==unconfined pid=50484 comm="Thread-40" exe="/system/bin/app_process64" sig=0 arch=c000003e syscall=141 compat=0 ip=0x7f7ed85446a7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.606995] audit: type=1326 audit(1627166564.198:6746): auid=4294967295 uid=1000 gid=1000 ses=4294967295 subj==unconfined pid=50484 comm="Thread-39" exe="/system/bin/app_process64" sig=0 arch=c000003e syscall=141 compat=0 ip=0x7f7ed85446a7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.607591] audit: type=1326 audit(1627166564.198:6747): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:44 user kernel: [173063.607593] audit: type=1326 audit(1627166564.198:6748): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:47 user kernel: [173066.607437] audit: type=1326 audit(1627166567.198:6749): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:42:47 user kernel: [173066.607466] audit: type=1326 audit(1627166567.198:6750): auid=4294967295 uid=100000 gid=100000 ses=4294967295 subj==unconfined pid=50409 comm="netd" exe="/system/bin/netd" sig=0 arch=c000003e syscall=93 compat=0 ip=0x7fc01b217ac7 code=0x7ffc0000
Jul 25 00:43:04 user kernel: [173083.620268] binder_linux: 51921 RLIMIT_NICE not set
Jul 25 00:43:14 user kernel: [173093.632211] binder_linux: 51921 RLIMIT_NICE not set

—CRASH—

Jul 25 00:56:43 user kernel: [ 0.000000] Linux version 5.10.0-7-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.40-1 (2021-05-28)`

not sure if it is of any help… no idea what RLIMIT_NICE not set means =)
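
In case it helps others collect the same evidence, here is a rough sketch of pulling the last few lines logged before each boot out of a syslog file. It assumes a POSIX shell with awk and that every boot starts with a `[ 0.000000] Linux version` kernel line, as in the excerpt above; the function name `last_lines_before_boot` is made up for this example.

```shell
# Hypothetical helper: print the N lines (default 5) that precede each kernel
# boot banner in FILE. After an unclean shutdown, those are the last messages
# that made it to disk before the crash.
last_lines_before_boot() {
    file=$1
    n=${2:-5}
    awk -v n="$n" '
        /\[ *0\.000000\] Linux version/ {
            # Replay the ring buffer, oldest line first.
            first = (seen > n) ? seen - n : 0
            for (i = first; i < seen; i++) print ring[i % n]
            print "--- boot detected ---"
            seen = 0
            next
        }
        # Keep only the most recent n lines in a ring buffer.
        { ring[seen % n] = $0; seen++ }
    ' "$file"
}
```

For example, `last_lines_before_boot /var/log/syslog 10` would show the ten lines before each boot. Note that a clean reboot also produces a boot banner, so not every hit is a crash.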

btw, I also had to use this command I found somewhere in the forums to make the laptop charge at all; otherwise it would just pull power from the plug but not charge (i.e. the battery would stay at whatever % charge it was at when plugged in): echo '70' | sudo tee -a /sys/class/power_supply/BAT0/charge_control_start_threshold
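
To check what the battery and the threshold currently report before changing anything, something like the following sketch may help. It assumes the standard Linux power_supply sysfs attribute names; whether each attribute actually exists depends on the kernel driver in use, so missing ones are reported rather than treated as errors. The function name `show_battery_state` is hypothetical.

```shell
# Hypothetical helper: print a few battery attributes from a power_supply
# sysfs directory (defaults to BAT0, the path used in the command above).
show_battery_state() {
    dir=${1:-/sys/class/power_supply/BAT0}
    for attr in status capacity charge_control_start_threshold charge_control_end_threshold; do
        if [ -r "$dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        else
            # Attribute not exposed by this driver/kernel.
            printf '%s=(not available)\n' "$attr"
        fi
    done
}
```

Running `show_battery_state` with no argument reads from BAT0 and needs no root, since it only reads.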

I also see that there is a new EC version here: https://source.puri.sm/firmware/releases/-/commit/a27df1bb0220e33c918034a1dc0312c5f080e194#6d777e3978752766b2b98e999f4714dd6f8ab934

but how do I update it? Is there anywhere I can find instructions?

I do see that the ectool does have a flash function, but don’t want to brick anything, so help would be appreciated=):

SUBCOMMANDS:
    console
    fan
    flash
    flash_backup
    help            Prints this message or the help of the given subcommand(s)
    info
    keymap
    led_color
    led_mode
    led_save
    led_value
    matrix
    print

EDIT: Well, went the danger-route and did what’s discussed here: [librem 14] how to update EC firmware to ec-2021-06-04_ef9fd3c
It didn’t brick my L14 either; I simply ran sudo ./purism_ectool flash_backup ec-2021-06-04_ef9fd3c.rom

But maybe don’t try this at home if it’s not urgent; wait for the official “release” and instructions. I need to travel Monday and can’t really rely on a machine that crashes when unplugged, so I’m hoping the EC update will help. Don’t know yet, will report back!

but now sudo ./purism_ectool info shows

board: purism/librem_14 version: 2021-06-04_ef9fd3c

UPDATE: Unfortunately bad news, laptop still crashes after EC update, although I managed to run it down to 15% this time, not sure if the charge percentage is relevant though.

That EC update does not address the crash when unplugging / crash when below 50% battery and under heavy load. The update for that is forthcoming.

Try switching from PureBoot to coreboot.
I had another, screen-related issue: my screen was blinking from time to time.
It was quite intense with a compositing desktop environment like Plasma.
I switched to coreboot. No issues in the last 48 hours.

This issue is not related to the EC but to the way PureBoot initialises Linux memory. It breaks something in the Intel graphics area…

Both coreboot and PureBoot use the exact same code to init the display. I’ve never seen anything like that with GNOME or XFCE; haven’t used Plasma.

And I posted this message with an uptime of 4 hours on battery :wink: and under quite heavy load, so the battery is almost out of juice…

Not really.
Actually, a Linux kernel booted directly differs slightly from a kexec’ed one.
The first difference is intel_iommu=on on the command line.
kexec isn’t perfect: it skips some of the initialization process
and can ignore or wrongly interpret the memory map (Intel graphics don’t have their own memory, they steal it from RAM).
The MTRR registers also differ.
Just try coreboot and you will see…

Thank you for the information. I am new to the Purism community.
Can you tell me where such updates normally get announced (via forum/email)?

Looking forward to getting the update and getting rid of these sudden crashes.

Thank you!

@MrChromebox
Aug 19 00:10:13 Librem14 kernel: [ 1.123935] DMAR: DRHD: handling fault status reg 3
Aug 19 00:10:13 Librem14 kernel: [ 1.123939] DMAR: [DMA Read] Request device [00:02.0] PASID ffffffff fault addr 22c4134000 [fault reason 06] PTE Read access is not set
Aug 19 00:10:13 Librem14 kernel: [ 1.123943] DMAR: DRHD: handling fault status reg 3
Aug 19 00:10:13 Librem14 kernel: [ 1.123945] DMAR: [DMA Read] Request device [00:02.0] PASID ffffffff fault addr 6a6c349000 [fault reason 06] PTE Read access is not set
Aug 19 00:10:13 Librem14 kernel: [ 1.140991] DMAR: DRHD: handling fault status reg 3
Aug 19 00:10:13 Librem14 kernel: [ 1.140993] DMAR: [DMA Read] Request device [00:02.0] PASID ffffffff fault addr 22c4134000 [fault reason 06] PTE Read access is not set
Aug 19 00:10:13 Librem14 kernel: [ 1.157737] DMAR: DRHD: handling fault status reg 3

Aug 19 00:10:13 Librem14 kernel: [ 1.459247] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459259] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459267] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459286] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080
Aug 19 00:10:13 Librem14 kernel: [ 1.459313] i915 0000:00:02.0: [drm] ERROR Fault errors on pipe A: 0x00000080

That is actually the reason for the crashes…
Later you can find something like this:
Aug 19 00:32:44 Librem14 kernel: [ 295.817882] i915 0000:00:02.0: [drm] ERROR CPU pipe A FIFO underrun

and later, if you are lucky:
Aug 19 00:35:32 Librem14 kernel: [ 463.878542] flashrom[4014]: segfault at 0 ip 00005587180eba2b sp 00007ffe1be2df80 error 4 in flashrom[5587180d8000+1e000]
Aug 19 00:35:32 Librem14 kernel: [ 463.878561] Code: 0f b7 c6 c7 44 24 0c 00 00 00 00 89 44 24 20 8b 4c 24 20 39 4c 24 0c 0f 83 2c 02 00 00 48 8d 43 04 48 39 c5 0f 86 1f 02 00 00 <0f> b6 43 01 3c 03 76 08 48 01 d8 48 39 c5 77 0c 48 8d 35 3b 9b 01

There is a problem with i915 (integrated Intel graphics): it does not have its own memory but steals it from RAM.
A badly set up DMAR can cause the graphics memory area to be allocated in user space,
and a user (or a userland app) can write data there. Boom, you have a system crash.

That’s all, folks.
It’s a problem with the Linux kernel (probably kexec, IOMMU or i915): something is not being properly passed to the kexec’ed kernel.
So we have:
a memory region overlap that can cause memory corruption (in the best case in userspace, where a user program will lose data or crash; in the worse case the kernel will crash).
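
If you want to check whether your own logs show the same pattern, here is a small sketch that counts the three kinds of messages quoted above in a kernel log file. It assumes a POSIX shell with awk and that the integrated GPU sits at PCI address 00:02.0, as in the log lines above; the function name `summarize_gfx_faults` and the output labels are made up for illustration.

```shell
# Hypothetical helper: count DMAR faults for the iGPU and i915 display errors
# in a kernel log file such as /var/log/kern.log.
summarize_gfx_faults() {
    awk '
        # DMAR faults whose requesting device is the integrated GPU.
        /DMAR: \[DMA (Read|Write)\]/ && /\[00:02\.0\]/ { dmar++ }
        # i915 display pipe fault reports.
        /i915 .*Fault errors on pipe/ { pipe_faults++ }
        # i915 FIFO underruns.
        /i915 .*FIFO underrun/ { underruns++ }
        END {
            printf "dmar_faults_igpu=%d\n", dmar
            printf "i915_pipe_faults=%d\n", pipe_faults
            printf "i915_fifo_underruns=%d\n", underruns
        }
    ' "$1"
}
```

Usage would be something like `summarize_gfx_faults /var/log/kern.log`. Non-zero counts only show that the messages are present; they do not by themselves prove they cause the crash.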

@NineX: Thx a lot for your thorough analysis. Hope we get a fix soon.
My L14 crashes a lot when attached to the right USB-C port… It would be nice if I could operate my L14 over USB-C (charging, display, etc.)

If you are using PureBoot, reflash it with coreboot/SeaBIOS (the standard BIOS).
The issue should go away.
Switching between PureBoot and coreboot/SeaBIOS, back and forth, is an easy and safe operation (like a BIOS upgrade).

I finally got around to updating the EC using these instructions:

On first appearances this seems to have solved the issue, but I will update here if it happens again and under what conditions.

Thanks to the team at Purism for putting out EC updates with detailed instructions.

unfortunately this happens on my Librem 15v3 with Coreboot, so there’s not much I can do :man_shrugging:t4:

Well, this topic is about the L14 crashing, and the L15v3 is slightly different hardware.

Can you please open a separate topic and
change the kernel command line:
delete: quiet
add: verbose
change loglevel to 5
Then wait for the problem to appear and grab for me /var/log/kern.log, /var/log/messages, /var/log/syslog
and .xsession-errors from your home directory,
please.
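
The command-line changes requested above could be sketched as a small string transformation. This is only an illustration in POSIX shell: the function name `debug_cmdline` is made up, and it simply applies the three edits exactly as listed (it does not check whether the kernel actually acts on `verbose`). On a Debian-based system booting through GRUB, the resulting string would typically go into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub followed by update-grub, but that part is an assumption about your setup.

```shell
# Hypothetical helper: take a kernel command line as one string, drop "quiet",
# drop any existing verbose/loglevel=... words, then append
# "verbose loglevel=5" as requested in the post above.
debug_cmdline() {
    out=""
    # Intentionally unquoted: split the command line on whitespace.
    for word in $1; do
        case $word in
            quiet|verbose|loglevel=*) continue ;;
        esac
        out="$out $word"
    done
    out="$out verbose loglevel=5"
    # Strip the leading space left by the concatenation.
    printf '%s\n' "${out# }"
}
```

For example, `debug_cmdline "$(cat /proc/cmdline)"` prints what the edited version of the currently running command line would look like, without changing anything.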

Unfortunately the issue is still around :frowning: I thought things were fixed as the laptop was happily functioning below charge levels where it had previously crashed, but it now has the same previous behavior when reaching lower battery levels of around 20%.

I still have this issue as well on my Librem 14. Just observed a crash at around 40% charge. The laptop booted up just fine after the crash. Have not observed this issue while plugged in.

board: purism/librem_14
version: 2021-08-03_05d9990

@avieth
What kind of BIOS do you have?
PureBoot or SeaBIOS?

CoreBoot/SeaBIOS, no modifications from the factory setup.