Librem 14 sudden crash when unplugged

Same modules, bundled.

I don’t wish to jump to conclusions, but maybe the memory modules are actually a factor here?
If they draw too much power, they can become unstable (it’s a guess, since I have a totally different brand of memory, and I can’t trigger any crash other than the laptop just switching off at around 6% battery…).
The only way to prove it is to actually swap the modules for a different brand and observe the hardware behavior…
I am out of ideas now.

Or it’s a combination of the overall power being drawn from the board by RAM + SSD.
I have a smaller and slower SSD…
so my power consumption is lower.

Handle 0x0009, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: None
	Maximum Capacity: 64 GB
	Error Information Handle: Not Provided
	Number Of Devices: 2

Handle 0x000A, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x0009
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 16 GB
	Form Factor: SODIMM
	Set: None
	Locator: Channel-0-DIMM-0
	Bank Locator: BANK 0
	Type: DDR4
	Type Detail: Unknown Synchronous
	Speed: 2133 MT/s
	Manufacturer: Corsair
	Serial Number: 00000000
	Asset Tag: Channel-0-DIMM-0-AssetTag
	Part Number: CMSO16GX4M1A2133C15
	Rank: 2
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: 1.2 V
	Maximum Voltage: 1.2 V
	Configured Voltage: 1.2 V

@drs 2133 MT/s, so a slower bus…
Not a big difference. However, @pini, can you show your dmidecode output?


mine

Handle 0x000A, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x0009
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: SODIMM
	Set: None
	Locator: Channel-0-DIMM-0
	Bank Locator: BANK 0
	Type: DDR4
	Type Detail: Unknown Synchronous
	Speed: 2667 MT/s
	Manufacturer: Samsung
	Serial Number: 018801e0
	Asset Tag: Channel-0-DIMM-0-AssetTag
	Part Number: M471A4G43MB1-CTD
	Rank: 2
	Configured Memory Speed: 2667 MT/s
	Minimum Voltage: 1.2 V
	Maximum Voltage: 1.2 V
	Configured Voltage: 1.2 V

Handle 0x000B, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x0009
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: SODIMM
	Set: None
	Locator: Channel-1-DIMM-0
	Bank Locator: BANK 0
	Type: DDR4
	Type Detail: Unknown Synchronous
	Speed: 2667 MT/s
	Manufacturer: Samsung
	Serial Number: 01982098
	Asset Tag: Channel-1-DIMM-0-AssetTag
	Part Number: M471A4G43MB1-CTD
	Rank: 2
	Configured Memory Speed: 2667 MT/s
	Minimum Voltage: 1.2 V
	Maximum Voltage: 1.2 V
	Configured Voltage: 1.2 V

Here it is (one RAM stick ATM):

Handle 0x000A, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x0009
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 16 GB
	Form Factor: SODIMM
	Set: None
	Locator: Channel-1-DIMM-0
	Bank Locator: BANK 0
	Type: DDR4
	Type Detail: Unknown Synchronous
	Speed: 2667 MT/s
	Manufacturer: Crucial
	Serial Number: e5e0a9f4
	Asset Tag: Channel-1-DIMM-0-AssetTag
	Part Number: CT16G4SFRA266.M16FRS
	Rank: 2
	Configured Memory Speed: 2667 MT/s
	Minimum Voltage: 1.2 V
	Maximum Voltage: 1.2 V
	Configured Voltage: 1.2 V
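
For anyone who wants to post the same data in a more compact form, here is a minimal sketch that condenses the "Memory Device" (DMI type 17) records shown above into one line per module. The function name summarize_dimms is made up for this example; the field names are the ones visible in the dumps in this thread.

```shell
#!/bin/sh
# Sketch: summarize dmidecode "Memory Device" records, one line per record.
# summarize_dimms is a hypothetical helper name, not part of dmidecode.
# Usage (needs root to read the DMI tables):
#   sudo dmidecode --type 17 | summarize_dimms
summarize_dimms() {
    awk -F': ' '
        function emit() {
            printf "%s: %s %s %s @ %s\n", f["Locator"], f["Size"], \
                f["Manufacturer"], f["Part Number"], f["Speed"]
            split("", f)    # clear the field table for the next record
        }
        /^Memory Device/ { in_dev = 1; next }
        # A blank line or the next "Handle" header ends the current record.
        in_dev && (/^[ \t]*$/ || /^Handle /) { emit(); in_dev = 0; next }
        in_dev {
            key = $1
            sub(/^[ \t]+/, "", key)    # strip the leading indentation
            f[key] = $2
        }
        END { if (in_dev) emit() }
    '
}
```

Empty slots are printed too, with whatever dmidecode reports in their Size field.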

I am searching because I remember @nicole.faerber once posted how to switch the CPU to a lower-wattage mode. It will decrease power consumption, as it lowers some frequencies on the board.
When I find it I will post it, and I will ask one of the affected users to switch his CPU to the lower power mode (set a power cap) and test the stability of the system.
I know it’s not a solution, but I am interested in whether the instabilities are connected with a power regulator (there are many of them on the board).

Everyone affected, could you try (as root)

echo 10000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
echo 15000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_1_power_limit_uw

and then do your test run?
That will reduce the CPU power limits by 5 W each.
The defaults are

echo 15000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
echo 20000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_1_power_limit_uw
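
To check what is currently set before and after a test run, something like the following sketch can help. It wraps the same sysfs files as the commands above; the function names rapl_show and rapl_set and the RAPL_DIR variable are made up for this example. It reads each constraint's name from the constraint_N_name file instead of assuming which one is the long-term and which the short-term limit. Values written this way are not persistent and are gone after a reboot.

```shell
#!/bin/sh
# Sketch: show and set the two package power limits used in this thread.
# RAPL_DIR, rapl_show and rapl_set are hypothetical names for this example.
RAPL_DIR="${RAPL_DIR:-/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0}"

# Print both constraints with their kernel-reported names, in whole watts.
rapl_show() {
    for n in 0 1; do
        name=$(cat "$RAPL_DIR/constraint_${n}_name")
        uw=$(cat "$RAPL_DIR/constraint_${n}_power_limit_uw")
        echo "constraint_${n} (${name}): $((uw / 1000000)) W"
    done
}

# Set both limits; arguments are watts. Must be run as root.
#   $1 = constraint_0 limit, $2 = constraint_1 limit
rapl_set() {
    echo "$(( $1 * 1000000 ))" > "$RAPL_DIR/constraint_0_power_limit_uw"
    echo "$(( $2 * 1000000 ))" > "$RAPL_DIR/constraint_1_power_limit_uw"
}
```

With these, rapl_set 10 15 applies the reduced limits suggested above and rapl_set 15 20 restores the values quoted as defaults.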

I don’t know if that helps, but here are the specs of my RAM setup 2x32GB.

# dmidecode 3.3

Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.

Handle 0x0009, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: None
	Maximum Capacity: 64 GB
	Error Information Handle: Not Provided
	Number Of Devices: 2

Handle 0x000A, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x0009
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: SODIMM
	Set: None
	Locator: Channel-0-DIMM-0
	Bank Locator: BANK 0
	Type: DDR4
	Type Detail: Unknown Synchronous
	Speed: 2667 MT/s
	Manufacturer: Samsung
	Serial Number: 019820d8
	Asset Tag: Channel-0-DIMM-0-AssetTag
	Part Number: M471A4G43MB1-CTD
	Rank: 2
	Configured Memory Speed: 2667 MT/s
	Minimum Voltage: 1.2 V
	Maximum Voltage: 1.2 V
	Configured Voltage: 1.2 V

Handle 0x000B, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x0009
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: SODIMM
	Set: None
	Locator: Channel-1-DIMM-0
	Bank Locator: BANK 0
	Type: DDR4
	Type Detail: Unknown Synchronous
	Speed: 2667 MT/s
	Manufacturer: Samsung
	Serial Number: 0198202a
	Asset Tag: Channel-1-DIMM-0-AssetTag
	Part Number: M471A4G43MB1-CTD
	Rank: 2
	Configured Memory Speed: 2667 MT/s
	Minimum Voltage: 1.2 V
	Maximum Voltage: 1.2 V
	Configured Voltage: 1.2 V

OK so it’s not memory model related.

OK @gam, your setup is the closest to mine,
so let’s try to find a pattern.
Describe what exactly you are doing when the laptop crashes:
what programs you are running, what the system load is, and what kind of external hardware you are using.
What does your crash look like? Try to be as precise as possible,
because I will try to replay the same scenario (or one as similar as possible) and make my laptop crash in order to find a pattern…

So, it has been a while since my last comment on this thread. Here is my current status.

Again, a big thank you to @NineX for his commitment to help us all with this issue.
What I can say so far is that reducing the CPU power limits by 5 W helps to avoid crashes.

echo 10000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
echo 15000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_1_power_limit_uw

For example, I watched two YouTube videos and did some programming on the side.
My L14 did not crash when the battery level dropped under the 20% mark. But at around 5% the laptop still switched off. Better than before, but not really satisfying.

So, I gave PopOS 21.04 a shot. With similar stress tests the L14 works great. Furthermore, when you reach a critical battery level, the OS starts to notify you that you should consider plugging in the power supply. But there is no sudden hard crash.

So what I can say is that the power management of the latest PureOS somehow seems to have issues handling low battery levels. PopOS is an alternative, but may not be a solution for people who want to stay on PureOS.

I have to admit that I really like PureOS, but as long as this battery issue is not solved, I will have to use PopOS.

That’s all, folks. Thank you for all the support so far.

A hard switch-off at around 5% is not a “crash”.
It’s the normal behavior of every system that I know.
To be more precise: if you switch off a couple of settings in Windows or macOS, you will get the same result on any hardware.

So please do not mix up the two things.

First, we had crashes like those documented above: the ones with graphical “fireworks”, total freezes, or reboots…
Those are the ones I am interested in, because the only thing I was able to find, reproduce, and help fix was a memory corruption in PureBoot, and that was PureBoot-specific.
I did not manage to crash coreboot/SeaBIOS (the regular BIOS), nor to reproduce any of the crashes reported.
And that is actually the case I am seeking to solve.

Second, we have the system just switching off without warning when the battery is critical.
That case is just a matter of tuning settings, which each individual can do according to their own preferences (I agree the current defaults are not the best). However, we can’t protect users from their own actions (the battery indicator in PureOS goes red at around 10-15% [I am not 110% sure], and the system shows a notification about the battery running low…).
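
As a starting point for that tuning, here is a minimal sketch for inspecting the low-battery policy, assuming the system uses UPower (as GNOME-based desktops usually do) and that its configuration lives at /etc/UPower/UPower.conf. Neither assumption is confirmed in this thread for PureOS. The function name upower_policy and the UPOWER_CONF variable are made up for this example; the key names are standard UPower.conf settings.

```shell
#!/bin/sh
# Sketch: print the UPower settings that decide when the OS warns and
# when it takes its critical-battery action.
# Assumption: the config is at /etc/UPower/UPower.conf on this system.
UPOWER_CONF="${UPOWER_CONF:-/etc/UPower/UPower.conf}"

# upower_policy is a hypothetical helper name for this example.
upower_policy() {
    grep -E '^(UsePercentageForPolicy|PercentageLow|PercentageCritical|PercentageAction|CriticalPowerAction)=' "$UPOWER_CONF"
}
```

If PercentageAction turns out to be lower than the level at which the laptop cuts out, raising it in that file (and restarting the upower service) should make the OS act before the hardware does.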

Unfortunately these settings are not available under Qubes. Any way to use intel-rapl power caps under Qubes?


Then I guess this rules out Pureboot or EC having issues with battery management, doesn’t it?

Yes, it looks like it. Though my L14 runs on Seaboot and the latest EC.

I don’t know. I had the same issues yesterday. It always crashed after around 10-20 minutes on battery, although the battery was over 20%. Then I did a PureBoot and EC firmware update yesterday evening, and today it ran on battery until it was under 10%, completely without any crashes…

But I also only noticed the issue after upgrading the RAM. Unfortunately I didn’t worry about it at first, and I haven’t used the Librem 14 much in the last few weeks since changing the RAM. So I will see if it really works reliably now.

It happened again today at 18% battery. But it seems to be better than before the update.

Hey I am just posting here to let others know that this problem seems completely resolved by the most recent EC updates. I hadn’t used my L14 for a long while because of my annoyance with this issue, but it looks like Purism pulled through and fixed things through the EC update process.

The laptop had previously been crashing completely, graphical fireworks or freezing with looped audio at battery levels anywhere between 30%-70%, effectively making it unusable. Since I updated the EC I have been testing for two days and none of these problems have happened. The battery runs all the way down to 5%. I’m not sure if it still does a hard shutdown at low levels because I plugged it back in (I was in the middle of something).

I’ll do some more testing over the coming days and if any issues persist I will post another comment here. Thanks to Purism for getting this update out. I just wish the laptops had been more thoroughly tested before shipping to avoid these issues.

Thank you very much for this valuable information. I think it is important for new customers who buy an L14.

Because of these hard shutdowns I switched to PopOS, even though I was generally happy with PureOS.

Yes, the end-of-October EC/PureBoot update now seems to have fixed it for me too (at least well enough that I can live with it). It is now reliable down to around 10% every time. But then it still crashes somewhere between 5 and 10%. So if I set the OS to shut down the laptop or go to sleep at 5%, it still crashes before that.

Same with my L14. At around 10% or less it might crash… I hope the next EC update will fix this issue.

There was an EC update released last week (v1.6). I’ve flashed it and sadly it brought the problem back as it was before… At around 50% I can reliably crash the Librem 14 with some CPU load. Something like stress --cpu 2 --timeout 30, using only two threads, works pretty reliably… With a power supply connected it doesn’t crash. Weirdly, the fan also worked a lot less during the stress command when plugged in, even at full CPU utilization, compared to the three times I tried on battery. @nicole.faerber it seems like you’re working on the EC firmware. Can you have a look at this problem? And is it safe for me to re-flash the old firmware (v1.5) over the new one?
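
To make such reproductions easier to compare between machines, here is a minimal sketch that logs the battery level once per second while the load command runs, so the last line in the log shows the level at which the laptop died. It assumes the battery is exposed as /sys/class/power_supply/BAT0 (the name may differ on your system). The function names battery_snapshot and log_while_stressing and the BAT_DIR variable are made up for this example.

```shell
#!/bin/sh
# Sketch: record battery level and status while a load command runs.
# Assumption: the battery shows up as BAT0; adjust BAT_DIR if it does not.
BAT_DIR="${BAT_DIR:-/sys/class/power_supply/BAT0}"

# Print one timestamped line with the current charge and status.
battery_snapshot() {
    printf '%s capacity=%s%% status=%s\n' \
        "$(date +%H:%M:%S)" \
        "$(cat "$BAT_DIR/capacity")" \
        "$(cat "$BAT_DIR/status")"
}

# $1 = log file; the remaining arguments are the load command to run.
log_while_stressing() {
    log=$1
    shift
    "$@" &
    load_pid=$!
    # Keep logging for as long as the load command is still running.
    while kill -0 "$load_pid" 2>/dev/null; do
        battery_snapshot >> "$log"
        sync      # flush to disk so the last line survives a power-off
        sleep 1
    done
    wait "$load_pid"
}
```

For the case described above, the call would be: log_while_stressing crash.log stress --cpu 2 --timeout 30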