I’m seeing this as well on my 13v2 since updating coreboot. In addition, the CPUs appear to reach critical temperatures under high load, forcing an immediate power off. Disabling Intel turbo-boost seems to help with this, as does forcing the powersave cpufreq profile. But I didn’t have to take any of these measures before the coreboot update, so I’m hoping there’s a fix.
Seeing more fan use, higher CPU temps on my 15v3 after running this latest coreboot script too.
Also, and probably related: when I close it for the night I wake up and the battery’s dead. Either it’s not sleeping properly or when it’s sleeping it’s using much more power than it used to.
Thanks again for all your persistent work on this @kakaroto!
Unfortunately I have to confirm the issues on my brand new 13v2 with TPM after coreboot update. I installed Qubes 4.0-rc4.
- fans behaving erratically
- sudden power-off on high CPU load
Before installing Qubes and updating coreboot with default PureOS installation I had one instance of:
- closing lid and witnessing it draining battery over night
At one point I was able to reliably reproduce issue #2 by booting into Qubes, starting Terminal in an existing Fedora-based qube and running yum install curl. That was enough to power off my notebook 2 times in a row. The third time I was able to bring the “sensors viewer” app up, decreased the update interval to 1s and observed the CPU temperatures. The numbers were jumping wildly between 70 and 100 degrees Celsius (according to the sensors viewer). I switched to sensor type coretemp-0 and observed all three sensors before doing another yum install, and as soon as all three sensors (Package id 0, Core 0 and Core 1) hit 100 my laptop powered off. Interestingly enough, before that happened some of them were flirting with 100, but the notebook shut down only when all three got to exactly 100.
I don’t think it is physically possible for that temperature to go from, say, 80 to 100 in a matter of a second. There must be something wrong with the readings, and that would also explain #1, because the fans simply switch on/off based on those wild values.
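One way to check whether the readings really jump that fast is to log the raw values outside the sensors viewer. The following is a minimal sketch, assuming the kernel exposes the sensors through the usual hwmon layout (`temp*_input` files in millidegrees Celsius, with optional `temp*_label` files); the function names are made up for illustration.

```shell
#!/bin/sh
# Convert a hwmon reading in millidegrees Celsius to whole degrees.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Print one line per temp*_input file found in the given hwmon directory.
# The directory is a parameter so the function can be pointed at any sensor.
print_temps() {
    dir=$1
    for f in "$dir"/temp*_input; do
        [ -r "$f" ] || continue
        label_file="${f%_input}_label"
        if [ -r "$label_file" ]; then
            label=$(cat "$label_file")
        else
            label=$(basename "$f")
        fi
        echo "$label: $(millideg_to_c "$(cat "$f")") C"
    done
}

# Example: sample every hwmon device once per second until interrupted.
# while true; do
#     for d in /sys/class/hwmon/hwmon*; do print_temps "$d"; done
#     sleep 1
# done
```

Logging to a file while reproducing the load would show whether the jump from 80 to 100 happens between two consecutive samples or builds up over several.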
One relevant change in the new coreboot 4.7 is that we enabled Intel speedstep where before it was disabled. While it’s pretty clear that might impact CPU scaling, it’s still a bit unclear why enabling that would cause CPUs to heat up so rapidly.
I’m still investigating here, but installed a simple Gnome CPU Power Manager extension in my panel so I could watch it and change the profiles. I noticed that a heavy compilation that in the past would cause my system to power off with this latest BIOS worked ok when I throttled so max CPU was at 80%. I’m wondering whether before the update the CPU was always set to powersave and/or turboboost was always disabled before.
I’m going to experiment with a few different CPU profiles and see if I can figure out what combination triggers the issue.
Update: I ran the same test only this time I let the CPU scale up to 100% and the only difference was I disabled Turboboost. I was able to compile the same large software package without any problems this time as well.
If you look into the relationship between SpeedStep and Turboboost you’ll see a number of people talking about how SpeedStep must be enabled for Turboboost to be enabled. I suspect what we are seeing here is that Turboboost is now enabled where in the past it wasn’t, which is causing the CPU during load to spike beyond its normal 100%. That spike up from the max of 2.5 GHz to 3.1 GHz under load is causing the CPU to overheat and then power off.
If you are having this issue, try disabling Turboboost:
sudo bash -c "echo -n 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
And see if things improve.
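To confirm the setting took effect, or to undo it later, the same file can be read back. This is a small sketch; the helper names and the overridable `NO_TURBO` variable are illustrative, and the path only exists when the intel_pstate driver is in use.

```shell
#!/bin/sh
# Path from the command above; overridable so the helpers can be tried on a scratch file.
NO_TURBO=${NO_TURBO:-/sys/devices/system/cpu/intel_pstate/no_turbo}

# Report whether Turbo Boost is currently allowed (no_turbo: 1 = disabled, 0 = enabled).
turbo_status() {
    case "$(cat "$1")" in
        1) echo "turbo disabled" ;;
        0) echo "turbo enabled" ;;
        *) echo "unknown" ;;
    esac
}

# Write 1 (disable turbo) or 0 (re-enable turbo); needs root on the real sysfs file.
set_no_turbo() {
    printf '%s' "$2" > "$1"
}

# Show the current state if the control file is present on this machine.
if [ -r "$NO_TURBO" ]; then
    turbo_status "$NO_TURBO"
fi
```

The setting does not survive a reboot, so the status check is also a quick way to see whether it needs to be applied again.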
I suspect, even more now with others reporting the same thing, that it is a temperature sampling issue. I dual boot PureOS and Arch/Antergos, and running TLP (power management) on the latter. When running on battery (Arch), the CPU scaling governor keeps the processor’s frequency well below 1 GHz. However, connecting the laptop to its charger increases the minimum frequency to roughly 2 GHz. This in itself isn’t the problem, and the initial fan spin up resulting from this increase is quite quickly calmed, but stressing the CPU from here on will increase the likelihood of peaks/erroneous data.
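The governor and frequency limits described here can be read straight from sysfs, which makes it easy to compare battery and charger behaviour. A sketch, assuming the standard cpufreq layout under `/sys/devices/system/cpu/cpu0/cpufreq` with values in kHz; the function names are made up.

```shell
#!/bin/sh
# Convert a cpufreq value in kHz to MHz.
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# Print the governor plus min/max scaling frequency for one cpufreq directory.
show_cpufreq() {
    dir=$1
    echo "governor: $(cat "$dir/scaling_governor")"
    echo "min: $(khz_to_mhz "$(cat "$dir/scaling_min_freq")") MHz"
    echo "max: $(khz_to_mhz "$(cat "$dir/scaling_max_freq")") MHz"
}

# Only query the real files if this machine exposes them.
CPUFREQ=/sys/devices/system/cpu/cpu0/cpufreq
if [ -r "$CPUFREQ/scaling_governor" ]; then
    show_cpufreq "$CPUFREQ"
fi
```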
So, yes, this part is probably related to the Turboboost-part. I’m guessing that it’s a simple division error, in which a batch of temperature data is divided by old (read: lower clock speed) processor data. Less aggressive scaling will then simply remove most of the trouble.
However, what might be most convenient is to set the fan’s (or processor’s) temperature reference to the sensor called “pch_skylake-virtual-0” instead of the two “coretemp-isa-0000” core_0 and core_1 values. From what I’ve read, people seem to suggest it’s a real sensor, but it averages perfectly between the two core temperatures without peaks/errors.
I don’t know if this is possible in userspace, as I haven’t found any information on how to set it (as stated in my previous reply: I haven’t even found the fan), but setting it up this way might be preferable in the short run.
Yes, I had an issue. I followed the directions from a fresh iso of Kali and when it asked me which laptop I have, it would not let me choose Number 4. I am reinstalling PureOS to see if it changes from there.
[Edit]: I can confirm that this works best from PureOS.
Or I can confirm that it does not work that well with Kali.
That command would have worked if you were on PureOS, I assume. It probably didn’t because you’re not using PureOS and your distribution hasn’t updated its git version yet?
Technically, you didn’t need to update to git version 2.16.1; the coreboot image that was built is still valid. The only problems are that you couldn’t be 100% sure that the image is safe (there is no reason for it not to be), that it would have shown as version “4.7.12-g83ed1b0” instead of “4.7.12-g83ed1b0f01” in the coreboot debug log, and that you would have had to run the flashing command manually instead of the build script flashing it for you.
Updating git was better though because you can have a guarantee that the image has the same hash as everyone else’s and you can flash it safely.
@Freeforall @hazybluedot @mpc @darwin: This shutdown issue is indeed problematic. I hadn’t experienced it before (I don’t compile on or use my Librems much other than for actual coreboot development/testing), but I left @Kyle_Rankin to look into it today and he says disabling Turbo Boost fixes the problem. The reason we enabled SpeedStep is that during IOMMU testing the Qubes installer was complaining about “kernel modules not loading”. It was the xen-acpi-processor module that failed to load, and after debugging it I found that it couldn’t find the P-state and C-state information in ACPI. The SpeedStep option is what makes coreboot write that information into ACPI. However, it’s possible that the information coreboot writes is not accurate, or that the mere fact that it’s there (turboboost) is what’s causing this. Kyle and I are both looking into this, so hopefully we’ll have a fix soon. For now, disable turboboost like Kyle suggested.
Thanks for reporting these issues!
Please re-download the build script and re-run it. The 4.7-Purism-3 version should fix the sudden shutdowns.
A quick summary:
- The FSP 1.1 (previous version) allowed temperatures to increase slowly (over dozens of seconds) when we stress test the CPU
- The FSP 2.0 (current version) seems to cause the CPU temperature to go from cold to 100C within a few milliseconds!
- The TCC (Thermal Control Circuit) which regulates the frequency/voltage/temperature (or whatever it does…) gets activated at 100C
- The CPU will shut down when it reaches 100C
- The CPU shuts down at the same time as the TCC activates, which means the TCC is useless
- The current fix is to set the TCC activation temperature to 95C. The CPU will then reach 95C, maybe even 96C, but never go above that, which prevents overheating shutdowns
- There is still a bigger issue here of “why does the CPU temperature go from 50C to 100C in less than a second now with FSP 2.0 rather than 50C to 80C within a minute like before with FSP 1.1?”. The same Turboboost 3GHz frequency is used in both cases, but the behavior is different
- Next week we’ll research this issue, see if the FSP 2 changed some configuration defaults or something which needs to be properly set to prevent the problem, and hopefully we’ll have a fix for you soon
- In the meantime, the 4.7-Purism-3 version will at least prevent your PC from shutting down every time the CPU is used.
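For anyone who wants to check which activation temperature their machine ends up with: on many Intel CPUs the TCC activation point is derived from the `MSR_TEMPERATURE_TARGET` register (0x1a2), with TjMax in bits 23:16 and an activation offset in bits 29:24. The sketch below only decodes a raw value. It assumes that this bit layout applies to the CPU in question and that the value comes from something like `rdmsr 0x1a2` (msr-tools, with the `msr` kernel module loaded). The function name and the sample value are made up for illustration and were not read from a Librem.

```shell
#!/bin/sh
# Decode a raw MSR_TEMPERATURE_TARGET value (hex digits, without the 0x prefix).
# Assumed layout: bits 23:16 = TjMax in C, bits 29:24 = TCC activation offset in C.
tcc_activation_temp() {
    raw=$(( 0x$1 ))
    tjmax=$(( (raw >> 16) & 0xff ))
    offset=$(( (raw >> 24) & 0x3f ))
    echo $(( tjmax - offset ))
}

# Hypothetical sample: TjMax 100 (0x64) with an offset of 5 gives 95.
tcc_activation_temp 5640000
```

With an offset of 0 the activation temperature equals TjMax, which would match the situation described above where the TCC only activates at the same point the CPU shuts down.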
Have a nice weekend!
I’m having the same shutdown issues with my brand new Librem 13v3. But… I don’t know how to update the BIOS. I was instructed to choose 13v2 at the menu, and defaults for other prompts, but it just fails with “No files to extract” after answering the “1 - Extract from the current machine” prompt. I don’t want to completely brick the machine, so what should I try next?
To give an update, I’m currently running with no_turbo and everything seems stable, though obviously I’d like to get the new BIOS installed and get the full CPU power.
I figured the monitor part out and made a topic documenting it in case it’s useful for others:
On a maybe related note? I usually run plugged into my monitor via HDMI at 2560x1440 and while this was working perfectly with my Librem 15v3, on my Librem 13v3 it blanks out for a second or two every now and again. Seems to be every 10 minutes or so at best, and for a bit there it was blanking every couple of minutes. No idea if it’s related, but it sure seems suspect.
We’ll be emailing everyone that has received a laptop in the past week or so to update with these steps below but here they are as outlined/authored by @Kyle_Rankin:
We have discovered a regression in the latest coreboot BIOS that was
installed on your laptop. This regression causes the CPU to heat up faster
and higher than it should and under high load it can cause the system to
spontaneously power off. Fixing this issue will require that you flash your
BIOS with our latest version of coreboot. All of these steps will require
that you run commands in a terminal while connected to the internet (or download the files separately).
First, temporarily disable the TurboBoost feature in your CPU. This will
prevent the laptop from shutting down while it is building the coreboot image:
sudo bash -c "echo -n 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
Next, update coreboot:
Download the build script:
mkdir building-coreboot && cd building-coreboot && wget https://code.puri.sm/kakaroto/coreboot-files/raw/master/build_coreboot.sh
Install the required dependencies:
sudo apt-get update
sudo apt-get install git build-essential bison flex m4 zlib1g-dev gnat libpci-dev libusb-dev libusb-1.0-0-dev dmidecode bsdiff python2.7
Run the script on your Librem machine:
chmod +x build_coreboot.sh && ./build_coreboot.sh
Follow the instructions on the screen, and BE SURE to select your
correct Librem laptop revision (Librem 13v2 or Librem 15v3, select Librem
13v2 if you have a Librem 13v3), and give it time to build the image.
Beyond selecting your specific laptop revision you can select the default
choices for the rest of the script.
Once done, if everything went according to plan, it will ask you if you
want to flash the newly built image.
Make sure you are not running on low battery and select Yes.
Reboot your machine once the flashing process is done.
Once your machine is rebooted, you should no longer have any spontaneous
power off issues.
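After rebooting, one way to confirm that the new image is actually running is to read the firmware version string from DMI. This is a sketch under two assumptions: that `dmidecode -s bios-version` reports the coreboot version on these machines, and that the fixed build identifies itself as 4.7-Purism-3, the version named earlier in this thread. The helper name is made up.

```shell
#!/bin/sh
# Succeeds if the reported version string contains the expected release name.
is_expected_version() {
    case "$1" in
        *"$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the laptop itself (needs root):
#   is_expected_version "$(sudo dmidecode -s bios-version)" "4.7-Purism-3" \
#       && echo "fixed image is running"
```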
Indeed, those are the instructions I followed, but after I chose 13v2 at the menu (for my 13v3), and then the defaults for other prompts, it just fails with “No files to extract” after answering the “1 - Extract from the current machine” prompt.
For now, I’ve just put the no_turbo setting into my /etc/rc.local and I’m “stable” at least. Looking forward to a fix in the coming week. =)
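For reference, an `/etc/rc.local` along these lines would apply the setting at every boot. This is a sketch, assuming the distribution still runs `/etc/rc.local` at startup and that the file is executable; the writability check is only a guard for kernels that are not using intel_pstate.

```shell
#!/bin/sh
# /etc/rc.local: runs as root at the end of boot on systems that still honour it.
NO_TURBO=/sys/devices/system/cpu/intel_pstate/no_turbo

# Disable Turbo Boost only if the intel_pstate control file exists and is writable.
if [ -w "$NO_TURBO" ]; then
    echo 1 > "$NO_TURBO"
fi

exit 0
```

Once a fixed coreboot image is flashed, removing these lines restores full turbo frequencies on the next boot.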
I can confirm that, I have the same problem.
Same problem here. Did the coreboot update a few days ago and got the overtemp shutdown on my 13v2. The laptop shut off 3 times doing various levels of work. I’m working on applying the update now.
On a side note, what is a 13v3? I’m not seeing anything about that on the site and I just got my 13 a few weeks ago. What changed? Is that just with the TPM installed?
UPDATE: Disregard. This was just a failure on my internet. Re-ran everything successfully with a 13v2 (about 4 weeks old with TPM).
- gmp-6.1.2.tar.xz (downloading from https://ftpmirror.gnu.org/gmp/gmp-6.1.2.tar.xz)… 0%… Failed to download gmp-6.1.2.tar.xz.
Makefile:26: recipe for target ‘build_gcc’ failed
make: *** [build_gcc] Error 1
Makefile:48: recipe for target ‘build-i386’ failed
make: *** [build-i386] Error 2
util/crossgcc/Makefile.inc:46: recipe for target ‘crossgcc-i386’ failed
make: *** [crossgcc-i386] Error 2
In the event that we don’t yet have access to the internet, could one download the needed updates, software etc onto a USB and then install from the key?
I refer to this of course:
“You should first install some of the dependencies needed to build coreboot, with this command:
apt install git build-essential bison flex m4 zlib1g-dev gnat libpci-dev libusb-dev libusb-1.0-0-dev dmidecode bsdiff”
Thanks a bunch for any help, much appreciated
I ask because when I try to enter the command line into the terminal it tells me I don’t have permission. The exact message from the home terminal is
“E: Could not open lock file /var/lib/dpkg/lock - open (/var/lib/dpkg/), are you root?” I assume this refers to my not being connected to the internet?
On a side note, my terminal seems to have a configuration issue that is detected and shown whenever the terminal is opened. It says the issue is not major and gives a link to GitHub on how to repair it, or just ignore it and use the terminal. I gather that has to do with modifications made to the hardware? At any rate, the terminal works; I just thought I’d put that out there, but imagine it’s been handled already in another thread.
Same here, my Librem is a 13v2, script fails / ends with “No files to extract”.
You will definitely need to be connected to the internet for this script to work. Connect the laptop to the internet, try again and see if it works now. It could also be that one of the upstream files failed to download last time.
When I tested this on Friday everything worked, and I’m testing it right now and so far so good but if I get a failed download I will report back.