Librem 14 sudden crash when unplugged

Hey I am just posting here to let others know that this problem seems completely resolved by the most recent EC updates. I hadn’t used my L14 for a long while because of my annoyance with this issue, but it looks like Purism pulled through and fixed things through the EC update process.

The laptop had previously been crashing completely, showing graphical fireworks, or freezing with looped audio at battery levels anywhere between 30% and 70%, effectively making it unusable. Since I updated the EC I have been testing for two days and none of these problems have happened. The battery runs all the way down to 5%. I’m not sure if it still does a hard shutdown at low levels because I plugged it back in (I was in the middle of something).

I’ll do some more testing over the coming days and if any issues persist I will post another comment here. Thanks to Purism for getting this update out. I just wish the laptops had been more thoroughly tested before shipping to avoid these issues.

Thank you very much for this valuable information. I think it is important for new customers who buy an L14.

Because of these hard shutdowns I switched to PopOS, even though I was happy in general with PureOS.

Yes, the end-of-October EC/PureBoot update now seems to have fixed it for me too (at least so that I can live with it). It is now reliable down to around 10% every time, but it still crashes somewhere between 5% and 10%. So if I set the OS to shut down the laptop or go to sleep at 5%, it still crashes before that.

Same with my L14. Around 10% or less it might crash… Hopefully the next EC update will fix this issue.

There was an EC update release last week (v1.6). I’ve flashed it and sadly it brought the problem back as it was before… At around 50% I can reliably crash the Librem 14 with some CPU load. Something like stress --cpu 2 --timeout 30, using only two threads, works pretty reliably… With a power supply connected it doesn’t crash. Weirdly, the fan also ran a lot less during the stress command when plugged in, even at full CPU utilization, compared to the three times I tried on battery. @nicole.faerber it seems like you’re working on the EC firmware. Can you have a look at this problem? And is it safe for me to re-flash the old firmware (v1.5) over the new one?

Oh dear… so we reverted a change in the EC firmware in this new release that was done in the previous release, more exactly the setting of PL4 of the Intel CPU based on the charger state.

Let’s take a step back for a second. A laptop is a pretty complex system when looking at it from a power consumption standpoint. There are a number of components in it that can draw various amounts of power at different times and depending on their use - the main CPU is of course one (15W TDP), the DDR4 RAM, SSD(s), the LCD backlight, WiFi/BT, SD card reader etc. And then there are the external USB ports - we have four, two type-A and two type-C ports, which can draw significant power if populated. The total power consumption of the device can thus vary a lot.

Coming back to the issue: what we tried to do is to set the Intel CPU’s PL4 to some sane value when in battery-only mode. The battery has a limited power budget which is lower than the total maximum power that can be consumed by the system. In order not to overload the battery we clamped down the Intel CPU’s PL4 to 20W and allowed it to go up to a much higher value when on AC.

Since about the same time we introduced this change we have observed two things: 1. the sudden power down seemed to be gone (and is now reappearing), but worse, 2. we also saw quite a number of dying main boards; you have probably read about this already. On the broken main boards, in most cases charging does not work anymore. The boards will still work from battery but will not charge. In the path of the charging current is a main power switching IC that has to regulate the current from the DC charger input to the whole system, i.e. all power supplies plus charging the battery.

These currents are hard to measure; you would have to break up traces etc. to measure individual current flows. My current working hypothesis is that we set the max power when connected to the charger too high and overloaded this switch when the system was under higher load (plus charging), and thus caused the dying main boards.

By reverting this change we may now have some default PL4 (which is badly documented in the Intel docs) that is higher than the battery value we set in the previous version, and thus the sudden power off happens again. But I hope that the dying main boards do not happen anymore! It may sound a bit crude, but for now I prefer the system to shut down rather than cause permanent hardware damage.

If during the next couple of weeks we do not see the dying mainboard problems happening anymore we will need to revisit the PL4 settings or other means again.

In the meantime you could also try to limit the CPU power consumption while using it from battery using the PL1 and PL2 settings within Linux:

/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_1_power_limit_uw

These two values are the long-term (0 = PL1) and short-term (1 = PL2) power budgets of the Intel CPU, in µW. The TDP of the 10710U is 15W; PL2 can be something like PL2 = PL1 + (PL1 / 3).
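As a quick worked example of that rule of thumb, here is a minimal sketch that converts a PL1 in watts into a PL2 in µW, the unit the powercap files expect. The function name pl2_from_pl1_uw is made up for illustration, and integer division rounds down.

```shell
#!/bin/sh
# Rule of thumb from the post: PL2 = PL1 + (PL1 / 3).
# Takes PL1 in watts, prints the matching PL2 in microwatts.
# pl2_from_pl1_uw is a hypothetical helper name, not part of any tool.
pl2_from_pl1_uw() {
    pl1_uw=$(( $1 * 1000000 ))
    echo $(( pl1_uw + pl1_uw / 3 ))
}

pl2_from_pl1_uw 15   # prints 20000000, i.e. 20 W
```

For the 15W TDP this gives the 20W short-term budget; for a reduced PL1 of 10W it gives roughly 13.3W.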

With that you can limit the Intel CPU’s short- and long-term average power consumption. In the L14 the default PL1 is 15W and PL2 is 20W; bringing these down by a few watts should help already, e.g. 10/15W:

echo 10000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
echo 15000000 > /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_1_power_limit_uw
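A small wrapper can make this easier to repeat. The following is only a sketch: show_limits and set_limits are hypothetical helper names, RAPL_DIR is an assumed variable that defaults to the path above, and writing the files normally requires root. Values written this way are not expected to survive a reboot, so they would have to be reapplied.

```shell
#!/bin/sh
# RAPL_DIR (assumed name) defaults to the package-0 powercap directory from the post.
RAPL_DIR="${RAPL_DIR:-/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0}"

# show_limits (hypothetical helper): print the current PL1/PL2 values in whole watts.
show_limits() {
    for n in 0 1; do
        f="$RAPL_DIR/constraint_${n}_power_limit_uw"
        [ -r "$f" ] || { echo "constraint $n: not readable"; continue; }
        echo "constraint $n: $(( $(cat "$f") / 1000000 )) W"
    done
}

# set_limits PL1_WATTS PL2_WATTS (hypothetical helper): write both limits in µW.
# On a real system this needs root.
set_limits() {
    echo $(( $1 * 1000000 )) > "$RAPL_DIR/constraint_0_power_limit_uw" &&
    echo $(( $2 * 1000000 )) > "$RAPL_DIR/constraint_1_power_limit_uw"
}

show_limits
```

Running set_limits 10 15 as root would be equivalent to the two echo commands above, and show_limits afterwards confirms what the kernel accepted.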

Please accept my apologies for this inconvenience, we are working on it and will implement better / safe defaults as soon as we have narrowed down the reason for the dying main boards, which right now is my primary concern.

Cheers
nicole


Sorry for the long delay. I read your update at the time and really appreciated the amount of background information and technical detail! This is something you won’t ever get from other companies, and to me it makes it OK to wait a little until everything works completely.
I didn’t have time to test the manual configuration back then (I needed the laptop to just work and used a USB-C power bank). And then you published v1.7, which resolved the issues, so there was no need anymore :slight_smile:

EC Version v1.7 all in all worked pretty flawlessly.

Unfortunately last weekend I’ve updated to v1.9 (and PureBoot 21) which again brought some problems. The laptop now doesn’t charge anymore while it’s turned on (over USB-C, I haven’t tried using the DC jack, but expect it to be the same). The LED turns green when the charger is connected and it doesn’t lose any battery charge, but it won’t recharge (or only around 1% per hour). But when I turn the laptop off and reconnect the charger, it will charge.

Also I had one sudden power off so far with 1.9, but this time it was while it was connected to my USB-C to DP docking station and thus also to power (it was also connected to a USB-A HUB + 5GbE adapter and keyboard and mouse, also HDMI was connected).
I haven’t experienced this with 1.7. But unlike with 1.6 I couldn’t reproduce this so far.

It could also have something to do with my USB equipment. I’ve experienced before that the USB-C 5GbE adapter led to a sudden power off or system freeze (one of my USB-C hubs did too). I thought it was a driver/software problem (or a hardware problem of the hub), but maybe it was also related to power management. My PC (running the same Debian version) hasn’t had such a problem with these adapters, but it was also a bit picky about how the 5GbE adapter was connected. So far, using a different hub and the USB-A port instead of the USB-C port of the laptop has helped mitigate these problems.

I think there’s been a lot of progress in this regard and things look a lot more mature now, IMO.

Just to add some of my experience, which will hopefully lead to improvements in these cases as well:

On EC v1.9 with everything else updated, I’ve also experienced an instant power off while on battery under 50% (47% if I recall correctly).
I had a 4TB 2.5" USB3 HDD connected to the right-side USB-A port and was running a continuous read test from an 860 Evo 2TB SATA SSD in an M.2 to USB enclosure (left USB-A port).
I then plugged in a USB-C charger, booted it up, and it seemed OK afterwards.

Another case regarding v1.9: while playing around with the librem-control app (which is still in beta), setting charge limits, starting/stopping charging, changing WiFi LED settings (which didn’t seem to work) and temporarily having “purism_ectool console” running, I got a hard lock/freeze. This also happened in previous EC revisions and it’s quite a problem.
The keyboard didn’t register any keys (including the power button: holding it for more than 4s doesn’t do anything), the power LED stays on, and I have to open up the underside panel (not a quick job, watch your nails) and then unscrew all the battery screws so I can unplug the battery connector, because the top-left tab of the battery blocks the connector from being fully disconnected. I then lost the system time, since my L14 (2021) didn’t come with an RTC CMOS battery.

Afterwards, I reduced PL2 to 17W. If I run into a similar scenario again, I’ll report back.

Oh dear… this TI charge controller is driving me mad…

So… regarding the power offs, yes, I am sorry, I figured that out. I wanted to protect the battery from deep discharge, so I query the battery flags and if they signal “stop discharge” I would cut power. But it turns out that these flags are not reliable :frowning: They trigger too often or prematurely, so I removed that again and it should work better now.

The charge controller though is another beast and I honestly do not know what is going on there. Just last week I first noticed too that the charge current would drop dramatically during charging, in my case almost reliably when starting to charge below 40%. It would start with 2A, which it should, but some time later drop to like 100mA or so. I have no idea why this happens. What helps here is simply to stop charging and restarting it again, then it usually continues at 2A until the end threshold is reached.

I am working on a new branch right now:

This implements the fixes for the power off but, more importantly, also implements a new charging strategy. If this tests well, we will then charge at different currents depending on the charge level - starting slow at low %, increasing toward the middle and then decreasing again towards the upper percentages. This should on average speed up charging while still reducing wear on the battery. I hope that this also helps with the charge hiccups some see.
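To make the shape of such a strategy concrete, here is a rough sketch. It is not the EC firmware’s code: the function name, the 20% and 70% thresholds and the 1000 mA steps are all invented for illustration. Only the 2A figure comes from this thread.

```shell
#!/bin/sh
# Illustrative only: maps a battery percentage to a charge current in mA.
# charge_current_ma, the thresholds and the 1000 mA values are hypothetical.
charge_current_ma() {
    pct=$1
    if [ "$pct" -lt 20 ]; then
        echo 1000      # start slow at a low charge level
    elif [ "$pct" -lt 70 ]; then
        echo 2000      # faster through the middle range
    else
        echo 1000      # taper off toward the top
    fi
}

for p in 10 50 90; do
    echo "$p% -> $(charge_current_ma "$p") mA"
done
```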

In this branch I have also implemented even more safeguards against false battery readings. The batteries implement a “Smart Battery System” (SBS) controller chip which reports the battery state and also design data, like charge voltage, charge current, design capacity etc. It turns out that sometimes batteries report wrong values! Most often I found the charge current to be 0mA - WTH!? This of course also causes totally absurd charging conditions. In this branch I implemented some safe defaults in case the battery reports nonsense. The safe default is not fast, but it is safe and should work well enough.
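The idea of falling back to a safe default can be sketched as follows. Again, this is not the firmware’s implementation: the function name, the 500 mA default and the 4000 mA plausibility ceiling are assumptions chosen for the example.

```shell
#!/bin/sh
# Illustrative only: sanity-check a charge current reported by the battery.
# All names and numbers below are hypothetical.
SAFE_DEFAULT_MA=500     # slow but safe fallback
MAX_PLAUSIBLE_MA=4000   # anything above this is treated as a bad reading

sane_charge_current_ma() {
    reported=$1
    # Reject empty or non-numeric readings outright.
    case "$reported" in
        ''|*[!0-9]*) echo "$SAFE_DEFAULT_MA"; return ;;
    esac
    # Reject the 0mA case described above and implausibly large values.
    if [ "$reported" -eq 0 ] || [ "$reported" -gt "$MAX_PLAUSIBLE_MA" ]; then
        echo "$SAFE_DEFAULT_MA"
    else
        echo "$reported"
    fi
}

sane_charge_current_ma 0      # falls back to the default
sane_charge_current_ma 2000   # plausible reading is passed through
```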

The battery information is, BTW, read once each time the EC powers up. This happens either when plugging in the charger or, when the charger is not connected, when pressing the power button. So if you suspect wrong battery information, just power off, disconnect the charger and reconnect / power on again. It also seems that the less charge the battery has, the less reliable the readings are.

Finally regarding the ectool console, yes, this can, for reasons I do not know yet, hang the EC. It usually does not happen if you have the console running for just some seconds, but the longer it runs the more likely the hang gets. If that happens the only remedy is to unplug the charger and the main battery :frowning:

So… if you feel a bit adventurous please give this post-1.9 branch a try, it could fix quite some of the problems.

Cheers
nicole


Thanks for your hard work on this!

One update from me. I just had another crash (this time more like a freeze, but with a black screen with some white lines on it). It was not connected to the charger this time. Battery charge was around 40-50%.
Also, for me it has not charged at all while powered on since 1.9, not even for a few minutes after it’s connected to the charger. I’ve even checked the charge_now parameter in /sys/class/power_supply/BAT0. It’s not changing; it only changes when the laptop is turned off. Could there be a problem with me re-flashing 1.7?

If you tell me how to flash/compile this new version (haven’t really found instructions besides “make BOARD=/ flash_internal”) and that it’s already safe for the hardware, I’m also willing to try the new version.

Edit: It does charge a bit, when turned on, but the display is off and it’s just idle.
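For anyone who wants to repeat the charge_now check, here is a minimal sketch that prints a few attributes from the battery’s sysfs directory. battery_snapshot and BAT_DIR are names made up for this example, and it assumes the battery exposes status, capacity and charge_now (some batteries report energy_now instead of charge_now).

```shell
#!/bin/sh
# BAT_DIR (assumed name) defaults to the battery directory mentioned above.
BAT_DIR="${BAT_DIR:-/sys/class/power_supply/BAT0}"

# battery_snapshot (hypothetical helper): print selected attributes, one per line.
battery_snapshot() {
    for attr in status capacity charge_now; do
        if [ -r "$BAT_DIR/$attr" ]; then
            echo "$attr=$(cat "$BAT_DIR/$attr")"
        else
            echo "$attr=unavailable"
        fi
    done
}

battery_snapshot
# To log once a minute while the charger is connected:
#   while true; do date; battery_snapshot; sleep 60; done
```

If charge_now stays constant over several samples while status claims the battery is charging, that matches the behaviour described in this post.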

I am getting to the bottom of this… and I have committed some more updates to the aforementioned post-1.9 branch.

First there was still some power off code in the EC that obviously gets triggered prematurely. The batteries seem to be notoriously unreliable about these flags which is why I have now completely removed checking for them :frowning: So sudden power off should not happen anymore :crossed_fingers:

The charging problem is an interesting one connected to the power budget, which is probably also the reason why I did not catch this earlier, since I am testing different power scenarios than others, it seems.

So what is happening, I think, is that charging at higher currents can exceed the total power budget of the AC adapter if in parallel some other high power consumers (like high CPU load etc.) draw power. If that happens the charge controller falls back to some pretty weird state which causes this super slow, if at all, charging - which clearly is not correct. So in my current work branch I am doing two things now:

  1. set more conservative charging currents again
  2. reduce Intel package PL4 to the battery / DC lower Watt level while charging to prevent the Intel package from high power draw (and set it to AC level again once charging has finished)
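The second point boils down to a small decision rule, sketched below. This is not the EC’s code: select_pl4_w is a hypothetical name, the 20W battery level is the value mentioned earlier in this thread, and the 40W AC level is purely a placeholder because the real AC value is not stated here.

```shell
#!/bin/sh
# Illustrative only: choose a PL4 level from charger and charging state.
PL4_BATTERY_W=20   # battery-mode value mentioned earlier in the thread
PL4_AC_W=40        # hypothetical placeholder, the real AC value is not given

# select_pl4_w AC_PRESENT CHARGING   (each argument is 0 or 1)
select_pl4_w() {
    ac_present=$1
    charging=$2
    if [ "$ac_present" -eq 0 ] || [ "$charging" -eq 1 ]; then
        # On battery, or on AC while the battery is still charging:
        # keep the package at the lower level.
        echo "$PL4_BATTERY_W"
    else
        # On AC with charging finished: allow the higher level.
        echo "$PL4_AC_W"
    fi
}

select_pl4_w 1 1   # on AC and charging: lower level
select_pl4_w 1 0   # on AC, charging finished: higher level
```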

This should prevent the charger from browning out, and charging should, hopefully, work normally again. I will test this more, now that I know what I need to be watching for.

I hope this works out better now and stable enough to make a new release soonish.

Cheers
nicole


For me ec_v1.9 works really well: faster charging when the laptop is on and better battery status info. The only problem I have with v1.9 is that the laptop turns off when it reaches 10%.

@nicole.faerber do some Librem 14s not come with a built-in RTC CMOS battery?

Yes, the initial batch we shipped came without. This should not make any difference usually since the main battery should never fully drain and keep the CMOS RAM intact. But to be really sure we added it in the later batches.

Cheers
nicole


How can I tell if my Librem 14 has a separate battery for the RTC?


Can we get the battery specs, and where to connect it?


I believe it’s a CR2025 (diameter 20mm), see pictures:


Sorry for the picture quality, that’s my skill level w/ Millipixels on L5, w/ a halogen desk lamp in the evening.

For connecting to the L14 MB, see this picture:


Hi Nicole

I was wondering if you have some new insights concerning the sudden shutdowns? I updated the EC Firmware to latest versions, but my L14 still keeps crashing when the Battery gets in the range of 20% - 30%.
I really hope you can nail this problem as soon as possible, because this behavior is quite annoying…

Thank you in advance for your commitment to fix this issue.

Best regards,
Giacomo

Hello all, I’m a bit confused about how to test out the post-1.9 branch mentioned above. Would I be compiling the firmware from source in order to do so? I’d rather install it in the expected way. I don’t see a .rom file for it.

Nevermind, just found the newer ec_v1.10. Didn’t realize there was a recent release.

@nicole.faerber Hi, first of all thanks for all your effort and trying to get this fixed!
Unfortunately I have to report, that my laptop again crashed at around 30-40% with EC version 1.11 today. I’ve installed the new EC version a few days ago.