Is anyone else experiencing freezing issues with Librem 15 v3?

I’m running stock PureOS, and try to pull in updated packages every day. Within the last week, I’ve started having system freezes that require a hard reboot.

This seems to largely happen when using Firefox. I’ve updated to Firefox 55, and although it helps performance overall, there are still times when the simple act of watching an HTML5 video in-browser causes the hangup again.

I wish I knew of a way to trace the issue, but the nature of the system freeze leaves everything inaccessible, leaving me with only guesses as to probable causes. It would seem that this could be caused by a memory leak in Firefox 55, but I’m curious as to whether some other issue in PureOS may be compounding the problem into what it is.

I’m running on Wayland by default, if that helps any. Does anyone have any advice for working around this?

Edit: I’ve decided to disable all Firefox extensions, and will test for the next 24 hours to see if that has any effect. It could also be that a bad GNOME Shell extension is causing the entire shell to hang without properly crashing. If the problem continues, I’ll look into running on X11, and then simply booting into an older version of the Linux kernel.

Edit 2: The problem persisted even when running a GNOME session on X. Some part of me suspects this may be related to booting into the latest kernel version for PureOS, which is from trunk.

Is it normal for Firefox to bring four CPUs to over 30% each? This is running F55 with one tab open, watching an HTML5 video.

I have had a hard freeze on PureOS during an upgrade IIRC. I switched to my Arch Samsung 960 EVO SSD (it had previously been in my 15 rev2) and had a hard freeze when I had lots of raster images loaded in QGIS and firefox also open with many tabs.

I installed Debian 9 on my laptop when it arrived, and it’s now hung unexpectedly at least 4 times in the last few weeks. I had one just now; before that was 2 days ago, and again 5 days before that.

Today, I walked away from my desk to talk to someone, came back, and it had hung. It wasn’t doing any heavy processing. A couple of Chromium windows open with maybe 20 tabs in total, and Thunderbird running.

I’d also love to know how I can debug this issue. The only thing I see at the moment is the following in /var/log/messages:

Could it be related to networking? An SSL error occurred prior to freezing (although I don’t know how long before…)

I noticed a Firefox freeze yesterday which required a hard reboot. Only the Firefox window froze; the rest of the desktop was fine. I rebooted to install a recent update.

So far so good.

I am running firefox 55.0.3.

In a recent pm from the good people at Purism, I was informed that PureBrowser has got an update which fixes the previous issues with certain sites being blocked. PureBrowser is apparently Firefox with a few add-ons. I am going to try it again.

Hi folks,

I’ve had similar issues; so far I’ve managed to get freezes in different places:

  • memtest with smp forced and all cpu in parallel (but it didn’t freeze in 10 hours running sequentially one cpu at a time)
  • playing a video while doing some other processing (e.g. running an update in the background)
  • firefox (well, purebrowser) sometimes when opening new tabs as that uses both some CPU and moves things on the screen and accesses the disk…

As mentioned here, it didn’t freeze with memtest running sequentially, so I’ve tried the same on Linux (disabling all cores but one), for example as root (this does not persist across reboots):
# for cpu in /sys/devices/system/cpu/cpu{1,2,3}/online; do echo 0 > $cpu; done
This is a shame but it’s been flawless since.
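
For anyone who wants to flip between single-core and multi-core while testing, here is a minimal sketch of the same sysfs trick wrapped in functions. The names `set_secondary_cores`, `list_online_cores` and the `CPU_SYSFS_ROOT` override are made up for illustration; only the `/sys/devices/system/cpu/cpuN/online` files come from the command above. It needs root on a real system.

```shell
#!/bin/sh
# Sketch: take secondary CPU cores offline or bring them back via sysfs.
# CPU_SYSFS_ROOT is a hypothetical override so the functions can be tried
# against a scratch directory instead of the live /sys tree.
CPU_SYSFS_ROOT="${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}"

# set_secondary_cores STATE   (STATE is 0 for offline, 1 for online)
set_secondary_cores() {
    state="$1"
    case "$state" in
        0|1) ;;
        *) echo "usage: set_secondary_cores 0|1" >&2; return 2 ;;
    esac
    for f in "$CPU_SYSFS_ROOT"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue
        # Leave the boot CPU alone, as the original one-liner does.
        case "$f" in
            */cpu0/online) continue ;;
        esac
        echo "$state" > "$f"
    done
}

# list_online_cores prints the kernel's own summary, e.g. "0" or "0-3".
list_online_cores() {
    cat "$CPU_SYSFS_ROOT/online"
}
```

Running `set_secondary_cores 0` as root should be equivalent to the loop above, and `set_secondary_cores 1` undoes it without a reboot.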

I’m still looking for better reproducers. It happens fairly often, and it might be something to do with power management/power state changes (it happens when load picks up) coupled with either disk or graphics. Hopefully I can figure out a reliable reproducer so we can confirm we have the same issue… Sometimes everything freezes, sometimes just an app, etc.
At least single core is stable enough and very usable for daily stuff, just a shame.

@Asmadeus I just replied to @mladen concerning your issue, so I expect he has forwarded my answer to you already, but I just wanted to add something: memtest crashing with SMP forced is “normal”, it happens for everyone and on every PC. I think it’s simply a bug in memtest that was never fixed, since memtest is abandoned and the latest version is from 4 years ago.

Oh, wasn’t sure about that, thanks for the info!

So that restores the probability of a kernel/driver issue again, which is good.

I haven’t had much luck with reproducing voluntarily (although I didn’t find much for gpu testing and whatever I used for disk load might not be very close); so I’ve just restored back to 4 cores and will see if I can get a few proper crash dumps to see if I get cpu stuck in similar places… Hopefully will be able to trigger the crash somehow (sysrq didn’t work iirc, maybe ssh?)

So:

  • Little progress on a stable reproducer so far, my best pattern is playing a couple of videos (start mpv with a loop on two files, and mute to not turn crazy, then shove these to a different workspace) ; then “do stuff” (browsing the web or whatever) – usually crashes in 15-20 minutes.
  • (I’m not sure it’s related but there’s some tearing on the display that I had never noticed with other Intel GPUs; others tell me it’s common on these chips… You can see it from time to time when playing videos, or when rapidly changing workspaces (you need a WM with no transitions).)
  • I’ve had mpv, purebrowser, X… segfault at different places, while things are pretty stable with a single core, so I’d assume some instruction returns garbage at some point. Unfortunately I didn’t configure systemd-journald to be persistent, so the coredumps are gone with every full crash.
  • sysrqs do not appear to work once everything is stuck, shame!
  • ssh does work, but triggering a crash from ssh did not (it did freeze the ssh session as if it had worked, but the screen didn’t reset to let me enter the LUKS passphrase). No passphrase, no dump, no cookie – I’m considering having it dump in plaintext, but the passphrase is going to be in the dump itself… I think I’ll just take whatever traces I can from ssh over the weekend, when I’ll have something more comfortable than a phone to ssh from and diagnose.
  • For what it’s worth, I tried feeding i915 the firmware it was asking for (skl_dmc_ver1_26.bin) with no change, so that’s not it; back to blob-free now. I also see the frequencies changing all the time, I wonder if forcing the performance governor would help… Something to try.
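
On the lost coredumps and logs: a minimal sketch of making the journal persistent with a journald drop-in, so kernel messages survive a hard reset. `Storage=persistent` is a documented journald.conf setting; the function name, the drop-in file name `persistent.conf`, and the directory argument are only illustrative assumptions.

```shell
#!/bin/sh
# Sketch: write a journald drop-in that keeps the journal on disk.
# The directory argument exists so the function can be pointed at a
# scratch location first; the default is the usual drop-in directory.
write_persistent_journal_dropin() {
    dir="${1:-/etc/systemd/journald.conf.d}"
    mkdir -p "$dir" || return 1
    # [Journal] is the section name journald.conf uses.
    printf '[Journal]\nStorage=persistent\n' > "$dir/persistent.conf"
}
```

After running it as root and restarting systemd-journald (or rebooting), `journalctl -k -b -1` should show the kernel messages from the previous boot, which is usually the boot that froze.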

Still room for progress, more in a couple of days!

It’s difficult to get proper traces but I’m getting more and more convinced it’s a problem with i915 and power state changes. I still don’t have a proper reproducer but it has to be something that uses the GPU; neither playing low-resolution videos in a way that keeps the CPU load low nor loading the CPU artificially triggers freezes, from what I could see… Talk about picky!
So, for now, forcing the CPU governors to performance seems to work around the hangs - it’s not exactly practical when on battery, but it should be good for anyone who can afford being glued to the wall.
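
In case it helps others try the same workaround, here is a rough sketch of forcing the governor through the standard cpufreq sysfs files. The function names and the `CPUFREQ_ROOT` override are hypothetical; which governors are actually accepted depends on the cpufreq driver in use (see `scaling_available_governors` next to each `scaling_governor` file). Needs root, and does not persist across reboots.

```shell
#!/bin/sh
# Sketch: set and inspect the cpufreq governor for every CPU.
# CPUFREQ_ROOT is a hypothetical override for trying this on a scratch tree.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu}"

# set_governor [NAME]   (defaults to "performance")
set_governor() {
    gov="${1:-performance}"
    for f in "$CPUFREQ_ROOT"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue
        echo "$gov" > "$f"
    done
}

# show_governors prints one "cpuN: governor" line per CPU.
show_governors() {
    for f in "$CPUFREQ_ROOT"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue
        cpu="${f%/cpufreq/scaling_governor}"
        cpu="${cpu##*/}"
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}
```

`set_governor performance` followed by `show_governors` gives a quick check that the setting stuck; `set_governor powersave` (or whatever the previous value was) reverts it when going back on battery.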

The Arch wiki suggests booting with i915.enable_rc6=0, which is the same idea (it is a toggle for RC6, the GPU’s power-saving “render C-state 6”). That would be a good middle ground, except that it does not seem to work… sigh
There’s another parameter to play with (intel_idle.max_cstate that can take multiple more aggressive values), I guess I’ll try that next. I really wish I could find a solid reproducer though!
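
When juggling boot parameters like these it is easy to lose track of what the running kernel was actually booted with. A small sketch, assuming nothing beyond `/proc/cmdline`, that prints the value of a given `name=value` parameter; the function name `cmdline_value` is made up, and quoted values containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: look up a name=value parameter on the kernel command line.
# cmdline_value NAME [FILE]   (FILE defaults to /proc/cmdline)
# Prints the value and returns 0 if found, returns non-zero otherwise.
cmdline_value() {
    name="$1"
    file="${2:-/proc/cmdline}"
    # Split the command line into one word per line, then scan it in a
    # subshell so that "exit" only ends the scan, not the calling shell.
    tr -s ' \t' '\n\n' < "$file" | (
        while IFS= read -r word; do
            case "$word" in
                "$name"=*) printf '%s\n' "${word#*=}"; exit 0 ;;
            esac
        done
        exit 1
    )
}
```

For example, `cmdline_value i915.enable_rc6` or `cmdline_value intel_idle.max_cstate` after a reboot confirms whether the bootloader change took effect. On Debian-style setups the parameters are usually added to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` followed by `update-grub`; whether stock PureOS is set up that way is an assumption here.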

Have you tried to see if there is a bug in the driver that was already fixed? What kernel are you on? Is there a newer kernel available? If not, did you try an older kernel, as it might be a new bug instead?
I think that’s the best bet for tracking/fixing this. I’d assume everyone would be complaining about it if it were widespread; since it’s not, it might be related to your specific configuration.
Also, what distro are you using, and did you try installing a different distro to see if the issue happens there as well?

I’m running on stock PureOS like the original poster, so kernel 4.12.0, mainly because that’s what I thought would have been tested more extensively.

I’ll try different kernels, with and without the DMC blob, over the next weekend, starting with newer ones. I do not believe other distros would help (aside from having different kernels, I guess). I have seen people complain about i915 on Skylake on different forums/bugzillas though, so I think it is actually a fairly common issue, despite my not having encountered such a problem on an i5 Skylake NUC before…

The specific hardware is: Librem 15v3 with 8GB of RAM (thoroughly checked), Samsung SM961/PM961 500GB NVMe drive (but I can reproduce problems after preloading videos in RAM and not accessing the drive much), with wireless on and camera off. I’m pretty sure it’s graphics stuff though, so only related to the CPU (i7-6500U). We’re at least 4 or 5 people reporting the problem, even if the others are less vocal, so definitely something fishy somewhere :slight_smile:

I am also experiencing periodic system freezes that require restart using the power button.
It is not firefox related. I have had it happen even when I only use the terminal.

I have found that the last message on the system log is:

Key repeat discarded, Wayland compositor doesn't seem to be processing events fast enough!

repeated over and over.

I’m using kernel 4.12.0-2-amd64

I guess your terminal or wayland compositor might be using opengl / graphics acceleration so it’s hard to tell. I’ve never had a crash here with plain ol’ X11, wmii and urxvt. It only starts when I add mpv or firefox or something that will be using more graphical features.

@kakaroto: Tried today’s master (4.14-rc2) and an old 4.4.88 (latest stable in the 4.4 branch); neither seems to change much.
I wanted to try further but can’t boot into 3.10 on this system as the default formatting for pureOS is ext4 with some non-backwards-compatible flags… I’ll try with some live OS maybe later but not convinced at this point, the librem 15v3 didn’t exist back then and 3.10 will likely just not have skylake igpu support.
The “good” thing is that both had neat crashes in the i915 driver, I’ll just post to intel-gfx@lists.freedesktop.org with these traces and see if I can get anything out of it. I also noticed some “strict debug” options while compiling, I’ll turn them on now, maybe that’ll get me neater stacks.

Hummm… I didn’t experience this problem on my librems, but I use them mostly for testing/debugging, not for everyday-use, so that’s probably why I didn’t trigger the error.
Do you have the crash log from the i915 driver? It might not be the driver itself, but a combo between the driver and wayland. I wonder how easy it would be to switch your OS to using X instead of wayland.
A quick search gave me this : https://bugs.freedesktop.org/show_bug.cgi?id=100181
It’s possibly the same bug you’re experiencing, and it seems to have been fixed in wayland itself. Could you check which version of wayland you have?

Hmm, actually good point. I had already switched to X11 (I’m using wmii, a lightweight tiling WM; working on sway as a wayland replacement but it’s not good enough yet for me), so I didn’t try much on wayland.
Even when I had, I had left mpv to its default so using the opengl/x11 backend (sigh, mpv…).
I’ve just switched back to gnome-wayland and tried forcing opengl-output=wayland, but it made no difference.
It was worth a try, though!

This also tempted me to try without video output (--vo=null) to disable OpenGL usage, and it looks like there is no hang, so we can say it’s something to do with OpenGL and not with the decoding part of ffmpeg/mpv. I’ll re-run that for longer tonight to confirm.
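
For anyone who wants to repeat the comparison, here is a rough sketch of the two runs: the same looping, muted playback, once with the default video output and once with `--vo=null`. The wrapper name `run_repro` and the `MPV` override are made up; `--loop-playlist=inf`, `--mute=yes` and `--vo=null` are regular mpv options, though older mpv releases may spell the loop option differently.

```shell
#!/bin/sh
# Sketch: loop some videos muted, with or without real video output.
# MPV is a hypothetical override for the player binary.
MPV="${MPV:-mpv}"

# run_repro MODE FILE...
#   MODE "null" -> decode only, no video output (--vo=null)
#   any other   -> mpv's default video output
run_repro() {
    mode="$1"
    shift
    if [ "$mode" = "null" ]; then
        "$MPV" --vo=null --loop-playlist=inf --mute=yes "$@"
    else
        "$MPV" --loop-playlist=inf --mute=yes "$@"
    fi
}
```

Something like `run_repro gl a.mkv b.mkv` on one workspace while browsing on another corresponds to the reproducer described earlier, and `run_repro null a.mkv b.mkv` is the control run; the file names are placeholders.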

As for the traces, well, it’s hard to say. I had actually mistaken the boot warning stack (something about not enough wires for DP? will post when I get home if it’s not obvious) on the 4.12 kernel for an i915 crash, so only the old 4.4 kernel had an i915 stack, and frankly I don’t want to barge in on the list with such an old kernel report, so this weekend I’ll post what I have anyway without that one.

What I see basically looks like memory corruption. This morning’s logs are actually quite good; they start with something random:

Sep 29 08:03:58 fenrir kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000246
Sep 29 08:03:58 fenrir kernel: IP: __list_del_entry_valid+0x29/0x90
Sep 29 08:03:58 fenrir kernel: PGD 0 P4D 0
Sep 29 08:03:58 fenrir kernel: Oops: 0000 [#1] SMP
Sep 29 08:03:58 fenrir kernel: Modules linked in: ctr ccm fuse cpufreq_powersave cpufreq_userspace cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt arc4 nf_conntrack_ipv6 ath9k nf_defrag_ipv6 ath9k_common ipt_REJECT nf_reject_ipv4 ath9k_hw nf_log_ipv4 nf_log_common xt_LOG xt_recent ath xt_limit xt_tcpudp snd_soc_skl mac80211 xt_addrtype snd_soc_skl_ipc snd_hda_codec_realtek snd_hda_codec_generic snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core intel_rapl snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hwdep snd_hda_core kvm cfg80211 snd_pcm snd_timer irqbypass snd intel_cstate intel_uncore joydev intel_rapl_perf pcspkr serio_raw sg iTCO_wdt iTCO_vendor_support soundcore rfkill nf_conntrack_ipv4 nf_defrag_ipv4
Sep 29 08:03:58 fenrir kernel:  xt_conntrack shpchp intel_pch_thermal battery ac topstar_laptop sparse_keymap processor_thermal_device evdev intel_soc_dts_iosf int340x_thermal_zone ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c crc32c_generic parport_pc ppdev lp parport iptable_filter ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto btrfs xor zstd_decompress zstd_compress xxhash raid6_pq algif_skcipher af_alg dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel i915 ghash_clmulni_intel pcbc video i2c_algo_bit drm_kms_helper i2c_i801 psmouse aesni_intel prime_numbers ahci xhci_pci aes_x86_64 crypto_simd cryptd glue_helper libahci nvme xhci_hcd drm libata nvme_core usbcore scsi_mod button
Sep 29 08:03:58 fenrir kernel: CPU: 1 PID: 5781 Comm: mpv/ao Tainted: G        W       4.14.0-rc2 #14
Sep 29 08:03:58 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017
Sep 29 08:03:58 fenrir kernel: task: ffff924822b20040 task.stack: ffffa4fa83880000
Sep 29 08:03:58 fenrir kernel: RIP: 0010:__list_del_entry_valid+0x29/0x90
Sep 29 08:03:58 fenrir kernel: RSP: 0018:ffffa4fa83883cb0 EFLAGS: 00010203
Sep 29 08:03:58 fenrir kernel: RAX: 0000000000000000 RBX: ffffa4fa837fbd58 RCX: dead000000000200
Sep 29 08:03:58 fenrir kernel: RDX: 0000000000000246 RSI: ffffa4fa80d88448 RDI: ffffa4fa837fbd60
Sep 29 08:03:58 fenrir kernel: RBP: ffffa4fa83883cb0 R08: ffffa4fa837fbdb8 R09: ffffa4fa80d88448
Sep 29 08:03:58 fenrir kernel: R10: 0000000000000001 R11: 000000007fffffff R12: ffffa4fa837fbd60
Sep 29 08:03:58 fenrir kernel: R13: ffffa4fa837fbdd0 R14: ffffa4fa837fbdc0 R15: ffffa4fa80d88448
Sep 29 08:03:58 fenrir kernel: FS:  00007f54175c0700(0000) GS:ffff92483ec80000(0000) knlGS:0000000000000000
Sep 29 08:03:58 fenrir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 29 08:03:58 fenrir kernel: CR2: 0000000000000246 CR3: 000000026c196004 CR4: 00000000003606e0
Sep 29 08:03:58 fenrir kernel: Call Trace:
Sep 29 08:03:58 fenrir kernel:  plist_del+0x3b/0xc0
Sep 29 08:03:58 fenrir kernel:  __unqueue_futex+0x2f/0x40
Sep 29 08:03:58 fenrir kernel:  mark_wake_futex+0x3d/0x50
Sep 29 08:03:58 fenrir kernel:  futex_requeue+0x8a9/0xa40
Sep 29 08:03:58 fenrir kernel:  do_futex+0x2ae/0xb10
Sep 29 08:03:58 fenrir kernel:  SyS_futex+0x13b/0x180
Sep 29 08:03:58 fenrir kernel:  ? SyS_write+0x79/0xc0
Sep 29 08:03:58 fenrir kernel:  entry_SYSCALL_64_fastpath+0x1e/0xa9
Sep 29 08:03:58 fenrir kernel: RIP: 0033:0x7f5454a2d91d
Sep 29 08:03:58 fenrir kernel: RSP: 002b:00007f54175bf8e8 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
Sep 29 08:03:58 fenrir kernel: RAX: ffffffffffffffda RBX: 0000560690f967a0 RCX: 00007f5454a2d91d
Sep 29 08:03:58 fenrir kernel: RDX: 0000000000000001 RSI: 0000000000000084 RDI: 0000560690847fbc
Sep 29 08:03:58 fenrir kernel: RBP: 0000560690f96938 R08: 0000560690847f90 R09: 000000000001a394
Sep 29 08:03:58 fenrir kernel: R10: 000000007fffffff R11: 0000000000000283 R12: 0000000000000e50
Sep 29 08:03:58 fenrir kernel: R13: 0000560690f95a78 R14: 0000560690f95a70 R15: 0000560690f625c0
Sep 29 08:03:58 fenrir kernel: Code: 00 00 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 08 48 89 e5 48 39 c8 74 27 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 2c <48> 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 
Sep 29 08:03:58 fenrir kernel: RIP: __list_del_entry_valid+0x29/0x90 RSP: ffffa4fa83883cb0
Sep 29 08:03:58 fenrir kernel: CR2: 0000000000000246
Sep 29 08:03:58 fenrir kernel: ---[ end trace a2a9a3f9d58c176b ]---
Sep 29 08:03:58 fenrir kernel: note: mpv/ao[5781] exited with preempt_count 2

Followed by something i915 related:

Sep 29 08:04:09 fenrir kernel: asynchronous wait on fence i915:gnome-shell[5272]/1:2fb1 timed out
Sep 29 08:04:09 fenrir kernel: pipe A vblank wait timed out
Sep 29 08:04:09 fenrir kernel: ------------[ cut here ]------------
Sep 29 08:04:09 fenrir kernel: WARNING: CPU: 1 PID: 5899 at drivers/gpu/drm/i915/intel_display.c:12172 intel_atomic_commit_tail+0xf7c/0xf90 [i915]
Sep 29 08:04:09 fenrir kernel: Modules linked in: ctr ccm fuse cpufreq_powersave cpufreq_userspace cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt arc4 nf_conntrack_ipv6 ath9k nf_defrag_ipv6 ath9k_common ipt_REJECT nf_reject_ipv4 ath9k_hw nf_log_ipv4 nf_log_common xt_LOG xt_recent ath xt_limit xt_tcpudp snd_soc_skl mac80211 xt_addrtype snd_soc_skl_ipc snd_hda_codec_realtek snd_hda_codec_generic snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core intel_rapl snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hwdep snd_hda_core kvm cfg80211 snd_pcm snd_timer irqbypass snd intel_cstate intel_uncore joydev intel_rapl_perf pcspkr serio_raw sg iTCO_wdt iTCO_vendor_support soundcore rfkill nf_conntrack_ipv4 nf_defrag_ipv4
Sep 29 08:04:09 fenrir kernel:  xt_conntrack shpchp intel_pch_thermal battery ac topstar_laptop sparse_keymap processor_thermal_device evdev intel_soc_dts_iosf int340x_thermal_zone ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c crc32c_generic parport_pc ppdev lp parport iptable_filter ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto btrfs xor zstd_decompress zstd_compress xxhash raid6_pq algif_skcipher af_alg dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel i915 ghash_clmulni_intel pcbc video i2c_algo_bit drm_kms_helper i2c_i801 psmouse aesni_intel prime_numbers ahci xhci_pci aes_x86_64 crypto_simd cryptd glue_helper libahci nvme xhci_hcd drm libata nvme_core usbcore scsi_mod button
Sep 29 08:04:09 fenrir kernel: CPU: 1 PID: 5899 Comm: kworker/u8:4 Tainted: G      D W       4.14.0-rc2 #14                              
Sep 29 08:04:09 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017                      
Sep 29 08:04:09 fenrir kernel: Workqueue: events_unbound intel_atomic_commit_work [i915]
Sep 29 08:04:09 fenrir kernel: task: ffff9247cd54f040 task.stack: ffffa4fa831c0000
Sep 29 08:04:09 fenrir kernel: RIP: 0010:intel_atomic_commit_tail+0xf7c/0xf90 [i915]
Sep 29 08:04:09 fenrir kernel: RSP: 0018:ffffa4fa831c3da8 EFLAGS: 00010286
Sep 29 08:04:09 fenrir kernel: RAX: 000000000000001c RBX: 0000000000000000 RCX: 0000000000000000
Sep 29 08:04:09 fenrir kernel: RDX: 0000000000000000 RSI: ffff92483ec8de98 RDI: ffff92483ec8de98
Sep 29 08:04:09 fenrir kernel: RBP: ffffa4fa831c3e60 R08: 0000000000000000 R09: 00000000000002d0
Sep 29 08:04:09 fenrir kernel: R10: ffffa4fa831c3da8 R11: ffffffff8a4d7b4d R12: 000000000001c980
Sep 29 08:04:09 fenrir kernel: R13: ffff924825308000 R14: ffff924825e5e000 R15: 0000000000000001
Sep 29 08:04:09 fenrir kernel: FS:  0000000000000000(0000) GS:ffff92483ec80000(0000) knlGS:0000000000000000                              
Sep 29 08:04:09 fenrir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 29 08:04:09 fenrir kernel: CR2: 00007f8284686340 CR3: 0000000267235006 CR4: 00000000003606e0
Sep 29 08:04:09 fenrir kernel: Call Trace:
Sep 29 08:04:09 fenrir kernel:  ? finish_wait+0x80/0x80
Sep 29 08:04:09 fenrir kernel:  intel_atomic_commit_work+0x12/0x20 [i915]
Sep 29 08:04:09 fenrir kernel:  process_one_work+0x19f/0x3c0
Sep 29 08:04:09 fenrir kernel:  worker_thread+0x39/0x3c0
Sep 29 08:04:09 fenrir kernel:  kthread+0x125/0x140
Sep 29 08:04:09 fenrir kernel:  ? process_one_work+0x3c0/0x3c0
Sep 29 08:04:09 fenrir kernel:  ? kthread_create_on_node+0x70/0x70
Sep 29 08:04:09 fenrir kernel:  ? kthread_create_on_node+0x70/0x70
Sep 29 08:04:09 fenrir kernel:  ret_from_fork+0x25/0x30
Sep 29 08:04:09 fenrir kernel: Code: ff ff ff 48 83 c7 08 e8 03 af 10 c9 4c 8b 85 70 ff ff ff 4d 85 c0 0f 85 b7 fa ff ff 8d 73 41 48 c7 c7 d8 9a 63 c0 e8 15 60 12 c9 <0f> ff e9 a1 fa ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f             
Sep 29 08:04:09 fenrir kernel: ---[ end trace a2a9a3f9d58c176c ]---

Sep 29 08:04:19 fenrir kernel: NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
Sep 29 08:04:19 fenrir kernel: Modules linked in: ctr ccm fuse cpufreq_powersave cpufreq_userspace cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt arc4 nf_conntrack_ipv6 ath9k nf_defrag_ipv6 ath9k_common ipt_REJECT nf_reject_ipv4 ath9k_hw nf_log_ipv4 nf_log_common xt_LOG xt_recent ath xt_limit xt_tcpudp snd_soc_skl mac80211 xt_addrtype snd_soc_skl_ipc snd_hda_codec_realtek snd_hda_codec_generic snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core intel_rapl snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hwdep snd_hda_core kvm cfg80211 snd_pcm snd_timer irqbypass snd intel_cstate intel_uncore joydev intel_rapl_perf pcspkr serio_raw sg iTCO_wdt iTCO_vendor_support soundcore rfkill nf_conntrack_ipv4 nf_defrag_ipv4
Sep 29 08:04:19 fenrir kernel:  xt_conntrack shpchp intel_pch_thermal battery ac topstar_laptop sparse_keymap processor_thermal_device evdev intel_soc_dts_iosf int340x_thermal_zone ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c crc32c_generic parport_pc ppdev lp parport iptable_filter ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto btrfs xor zstd_decompress zstd_compress xxhash raid6_pq algif_skcipher af_alg dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel i915 ghash_clmulni_intel pcbc video i2c_algo_bit drm_kms_helper i2c_i801 psmouse aesni_intel prime_numbers ahci xhci_pci aes_x86_64 crypto_simd cryptd glue_helper libahci nvme xhci_hcd drm libata nvme_core usbcore scsi_mod button
Sep 29 08:04:19 fenrir kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D W       4.14.0-rc2 #14
Sep 29 08:04:19 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017                      
Sep 29 08:04:19 fenrir kernel: task: ffff924833e1b040 task.stack: ffffa4fa80cc4000
Sep 29 08:04:19 fenrir kernel: RIP: 0010:__remove_hrtimer+0x6/0x70
Sep 29 08:04:19 fenrir kernel: RSP: 0018:ffff92483ed03f18 EFLAGS: 00000046
Sep 29 08:04:19 fenrir kernel: RAX: 14e8a8f363b6b333 RBX: ffff92483ed14480 RCX: 0000000000000000
Sep 29 08:04:19 fenrir kernel: RDX: 0000000000000000 RSI: ffff92483ed14500 RDI: ffffa4fa837fbd10
Sep 29 08:04:19 fenrir kernel: RBP: ffff92483ed03f70 R08: 0000000000000101 R09: 0000000000000000
Sep 29 08:04:19 fenrir kernel: R10: 000000000000b6d9 R11: 0000000000000083 R12: ffffa4fa837fbd10
Sep 29 08:04:19 fenrir kernel: R13: ffff92483ed14500 R14: 0000000000000000 R15: ffff92483ed145a8
Sep 29 08:04:19 fenrir kernel: FS:  0000000000000000(0000) GS:ffff92483ed00000(0000) knlGS:0000000000000000                              
Sep 29 08:04:19 fenrir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 29 08:04:19 fenrir kernel: CR2: 00007f3f13f9c000 CR3: 0000000266a5a004 CR4: 00000000003606e0
Sep 29 08:04:19 fenrir kernel: Call Trace:
Sep 29 08:04:19 fenrir kernel:  <IRQ>
Sep 29 08:04:19 fenrir kernel:  ? __hrtimer_run_queues+0xc3/0x260
Sep 29 08:04:19 fenrir kernel:  hrtimer_interrupt+0xa0/0x1e0
Sep 29 08:04:19 fenrir kernel:  smp_apic_timer_interrupt+0x5f/0x130
Sep 29 08:04:19 fenrir kernel:  apic_timer_interrupt+0x93/0xa0
Sep 29 08:04:19 fenrir kernel:  </IRQ>
Sep 29 08:04:19 fenrir kernel: RIP: 0010:cpuidle_enter_state+0x130/0x2f0
Sep 29 08:04:19 fenrir kernel: RSP: 0018:ffffa4fa80cc7e70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
Sep 29 08:04:19 fenrir kernel: RAX: ffff92483ed1ae80 RBX: 000001c496916739 RCX: 000000000000001f
Sep 29 08:04:19 fenrir kernel: RDX: 000001c496916739 RSI: fffffffdf3fade99 RDI: 0000000000000000
Sep 29 08:04:19 fenrir kernel: RBP: ffffa4fa80cc7eb0 R08: 0000000000000ebe R09: 0000000000000018
Sep 29 08:04:19 fenrir kernel: R10: ffffa4fa80cc7e40 R11: 0000000000000e30 R12: ffffc4fa7fd089a0
Sep 29 08:04:19 fenrir kernel: R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff8a2adf98
Sep 29 08:04:19 fenrir kernel:  cpuidle_enter+0x17/0x20
Sep 29 08:04:19 fenrir kernel:  call_cpuidle+0x23/0x40
Sep 29 08:04:19 fenrir kernel:  do_idle+0x189/0x1e0
Sep 29 08:04:19 fenrir kernel:  cpu_startup_entry+0x73/0x80
Sep 29 08:04:19 fenrir kernel:  start_secondary+0x179/0x1c0
Sep 29 08:04:19 fenrir kernel:  secondary_startup_64+0xa5/0xa5
Sep 29 08:04:19 fenrir kernel: Code: 21 ff ff ff 48 89 df c6 07 00 0f 1f 40 00 65 ff 0d 80 b3 91 76 5b 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 <48> 89 e5 41 56 41 55 41 54 53 0f b6 47 38 4c 8b 36 88 57 38 a8             
Sep 29 08:04:19 fenrir kernel: INFO: rcu_sched self-detected stall on CPU
Sep 29 08:04:19 fenrir kernel:         0-...: (5248 ticks this GP) idle=dfa/140000000000001/0 softirq=60475/60475 fqs=2624               
Sep 29 08:04:19 fenrir kernel:          (t=5250 jiffies g=24728 c=24727 q=1279)
Sep 29 08:04:19 fenrir kernel: NMI backtrace for cpu 0
Sep 29 08:04:19 fenrir kernel: CPU: 0 PID: 5774 Comm: mpv/vo Tainted: G      D W       4.14.0-rc2 #14
Sep 29 08:04:19 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017                      
Sep 29 08:04:19 fenrir kernel: Call Trace:
Sep 29 08:04:19 fenrir kernel:  <IRQ>
Sep 29 08:04:19 fenrir kernel:  dump_stack+0x63/0x82
Sep 29 08:04:19 fenrir kernel:  nmi_cpu_backtrace+0xca/0xd0
Sep 29 08:04:19 fenrir kernel:  ? irq_force_complete_move+0x150/0x150
Sep 29 08:04:19 fenrir kernel:  nmi_trigger_cpumask_backtrace+0x10d/0x140
Sep 29 08:04:19 fenrir kernel:  arch_trigger_cpumask_backtrace+0x19/0x20
Sep 29 08:04:19 fenrir kernel:  rcu_dump_cpu_stacks+0xa3/0xd7
Sep 29 08:04:19 fenrir kernel:  rcu_check_callbacks+0x60a/0x840
Sep 29 08:04:19 fenrir kernel:  ? account_system_index_time+0x63/0x70
Sep 29 08:04:19 fenrir kernel:  ? tick_sched_do_timer+0x50/0x50
Sep 29 08:04:19 fenrir kernel:  update_process_times+0x2f/0x60
Sep 29 08:04:19 fenrir kernel:  tick_sched_handle+0x26/0x70
Sep 29 08:04:19 fenrir kernel:  ? tick_sched_do_timer+0x3f/0x50
Sep 29 08:04:19 fenrir kernel:  tick_sched_timer+0x39/0x80
Sep 29 08:04:19 fenrir kernel:  __hrtimer_run_queues+0xe4/0x260
Sep 29 08:04:19 fenrir kernel:  hrtimer_interrupt+0xa0/0x1e0
Sep 29 08:04:19 fenrir kernel:  smp_apic_timer_interrupt+0x5f/0x130
Sep 29 08:04:19 fenrir kernel:  apic_timer_interrupt+0x93/0xa0
Sep 29 08:04:19 fenrir kernel:  </IRQ>
Sep 29 08:04:19 fenrir kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x135/0x1a0
Sep 29 08:04:19 fenrir kernel: RSP: 0018:ffffa4fa83843c68 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Sep 29 08:04:19 fenrir kernel: RAX: 0000000000000101 RBX: 0000560690847f90 RCX: 0000000000000001
Sep 29 08:04:19 fenrir kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffa4fa80d88504
Sep 29 08:04:19 fenrir kernel: RBP: ffffa4fa83843c68 R08: 0000000000000101 R09: 0000000000000000
Sep 29 08:04:19 fenrir kernel: R10: 0000000000000002 R11: ffff9247cd0d0040 R12: ffffa4fa83843d08
Sep 29 08:04:19 fenrir kernel: R13: ffffa4fa83843d58 R14: ffffa4fa83843d90 R15: ffffa4fa80d88500
Sep 29 08:04:19 fenrir kernel:  _raw_spin_lock+0x28/0x30
Sep 29 08:04:19 fenrir kernel:  futex_wait_setup+0x82/0x130
Sep 29 08:04:19 fenrir kernel:  futex_wait+0xed/0x260
Sep 29 08:04:19 fenrir kernel:  ? ___sys_sendmsg+0xa4/0x2e0
Sep 29 08:04:19 fenrir kernel:  do_futex+0x506/0xb10
Sep 29 08:04:19 fenrir kernel:  SyS_futex+0x13b/0x180
Sep 29 08:04:19 fenrir kernel:  entry_SYSCALL_64_fastpath+0x1e/0xa9
Sep 29 08:04:19 fenrir kernel: RIP: 0033:0x7f5454a2ff5c
Sep 29 08:04:19 fenrir kernel: RSP: 002b:00007f54293878e8 EFLAGS: 00000202 ORIG_RAX: 00000000000000ca
Sep 29 08:04:19 fenrir kernel: RAX: ffffffffffffffda RBX: 00000000001c2000 RCX: 00007f5454a2ff5c
Sep 29 08:04:19 fenrir kernel: RDX: 0000000000000002 RSI: 0000000000000080 RDI: 0000560690847f90
Sep 29 08:04:19 fenrir kernel: RBP: 00007f5429387338 R08: 0000560690847f90 R09: 0000000000009e82
Sep 29 08:04:19 fenrir kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000001c2000
Sep 29 08:04:19 fenrir kernel: R13: 00007f53ecb7fb20 R14: 00007f5429387338 R15: 0000000000000002
Sep 29 08:04:19 fenrir kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:36:pipe A] flip_done timed out 
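
Since these logs mix several kinds of kernel report (oops, i915 warning, hard and soft lockups, RCU stall, timeouts), a quick tally per log file can make it easier to compare one crash with another. This is only a sketch: the function name is made up and the patterns are simply substrings copied from the messages in this post, not a complete list.

```shell
#!/bin/sh
# Sketch: count a few kinds of kernel report in a saved log file.
# summarize_kernel_reports FILE   prints "<count> <pattern>" per pattern.
summarize_kernel_reports() {
    log="$1"
    [ -r "$log" ] || { echo "cannot read: $log" >&2; return 1; }
    for pat in \
        'BUG: unable to handle' \
        'WARNING: CPU' \
        'hard LOCKUP' \
        'soft lockup' \
        'self-detected stall' \
        'timed out'
    do
        # grep -c prints 0 but exits non-zero when nothing matches.
        n=$(grep -cF -- "$pat" "$log" || true)
        printf '%s %s\n' "$n" "$pat"
    done
}
```

With a persistent journal, the input could come from something like `journalctl -k -b -1 > crash.log` followed by `summarize_kernel_reports crash.log`.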

And more soft lockups:

Sep 29 08:04:46 fenrir kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mpv/vo:5774]
Sep 29 08:04:46 fenrir kernel: Modules linked in: ctr ccm fuse cpufreq_powersave cpufreq_userspace cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt arc4 nf_conntrack_ipv6 ath9k nf_defrag_ipv6 ath9k_common ipt_REJECT nf_reject_ipv4 ath9k_hw nf_log_ipv4 nf_log_common xt_LOG xt_recent ath xt_limit xt_tcpudp snd_soc_skl mac80211 xt_addrtype snd_soc_skl_ipc snd_hda_codec_realtek snd_hda_codec_generic snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core intel_rapl snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hwdep snd_hda_core kvm cfg80211 snd_pcm snd_timer irqbypass snd intel_cstate intel_uncore joydev intel_rapl_perf pcspkr serio_raw sg iTCO_wdt iTCO_vendor_support soundcore rfkill nf_conntrack_ipv4 nf_defrag_ipv4
Sep 29 08:04:46 fenrir kernel:  xt_conntrack shpchp intel_pch_thermal battery ac topstar_laptop sparse_keymap processor_thermal_device evdev intel_soc_dts_iosf int340x_thermal_zone ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c crc32c_generic parport_pc ppdev lp parport iptable_filter ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto btrfs xor zstd_decompress zstd_compress xxhash raid6_pq algif_skcipher af_alg dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel i915 ghash_clmulni_intel pcbc video i2c_algo_bit drm_kms_helper i2c_i801 psmouse aesni_intel prime_numbers ahci xhci_pci aes_x86_64 crypto_simd cryptd glue_helper libahci nvme xhci_hcd drm libata nvme_core usbcore scsi_mod button
Sep 29 08:04:46 fenrir kernel: CPU: 0 PID: 5774 Comm: mpv/vo Tainted: G      D W       4.14.0-rc2 #14
Sep 29 08:04:46 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017                      
Sep 29 08:04:46 fenrir kernel: task: ffff9247cd0d0040 task.stack: ffffa4fa83840000
Sep 29 08:04:46 fenrir kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x135/0x1a0
Sep 29 08:04:46 fenrir kernel: RSP: 0018:ffffa4fa83843c68 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Sep 29 08:04:46 fenrir kernel: RAX: 0000000000000101 RBX: 0000560690847f90 RCX: 0000000000000001
Sep 29 08:04:46 fenrir kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffa4fa80d88504
Sep 29 08:04:46 fenrir kernel: RBP: ffffa4fa83843c68 R08: 0000000000000101 R09: 0000000000000000
Sep 29 08:04:46 fenrir kernel: R10: 0000000000000002 R11: ffff9247cd0d0040 R12: ffffa4fa83843d08
Sep 29 08:04:46 fenrir kernel: R13: ffffa4fa83843d58 R14: ffffa4fa83843d90 R15: ffffa4fa80d88500
Sep 29 08:04:46 fenrir kernel: FS:  00007f5429388700(0000) GS:ffff92483ec00000(0000) knlGS:0000000000000000                              
Sep 29 08:04:46 fenrir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 29 08:04:46 fenrir kernel: CR2: 00007f8284498510 CR3: 000000026c196004 CR4: 00000000003606f0
Sep 29 08:04:46 fenrir kernel: Call Trace:
Sep 29 08:04:46 fenrir kernel:  _raw_spin_lock+0x28/0x30
Sep 29 08:04:46 fenrir kernel:  futex_wait_setup+0x82/0x130
Sep 29 08:04:46 fenrir kernel:  futex_wait+0xed/0x260
Sep 29 08:04:46 fenrir kernel:  ? ___sys_sendmsg+0xa4/0x2e0
Sep 29 08:04:46 fenrir kernel:  do_futex+0x506/0xb10
Sep 29 08:04:46 fenrir kernel:  SyS_futex+0x13b/0x180
Sep 29 08:04:46 fenrir kernel:  entry_SYSCALL_64_fastpath+0x1e/0xa9
Sep 29 08:04:46 fenrir kernel: RIP: 0033:0x7f5454a2ff5c
Sep 29 08:04:46 fenrir kernel: RSP: 002b:00007f54293878e8 EFLAGS: 00000202 ORIG_RAX: 00000000000000ca
Sep 29 08:04:46 fenrir kernel: RAX: ffffffffffffffda RBX: 00000000001c2000 RCX: 00007f5454a2ff5c
Sep 29 08:04:46 fenrir kernel: RDX: 0000000000000002 RSI: 0000000000000080 RDI: 0000560690847f90
Sep 29 08:04:46 fenrir kernel: RBP: 00007f5429387338 R08: 0000560690847f90 R09: 0000000000009e82
Sep 29 08:04:46 fenrir kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000001c2000
Sep 29 08:04:46 fenrir kernel: R13: 00007f53ecb7fb20 R14: 00007f5429387338 R15: 0000000000000002
Sep 29 08:04:46 fenrir kernel: Code: 66 31 c0 41 39 c0 74 ea 4d 85 c9 c6 07 01 74 2d 41 c7 41 08 01 00 00 00 eb 96 83 fa 01 0f 84 f4 fe ff ff 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 5d c3 f3 90 4c 8b 09             


Sep 29 08:05:14 fenrir kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mpv/vo:5774]
Sep 29 08:05:14 fenrir kernel: Modules linked in: ctr ccm fuse cpufreq_powersave cpufreq_userspace cpufreq_conservative snd_hda_codec_hdmi ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt arc4 nf_conntrack_ipv6 ath9k nf_defrag_ipv6 ath9k_common ipt_REJECT nf_reject_ipv4 ath9k_hw nf_log_ipv4 nf_log_common xt_LOG xt_recent ath xt_limit xt_tcpudp snd_soc_skl mac80211 xt_addrtype snd_soc_skl_ipc snd_hda_codec_realtek snd_hda_codec_generic snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core intel_rapl snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hwdep snd_hda_core kvm cfg80211 snd_pcm snd_timer irqbypass snd intel_cstate intel_uncore joydev intel_rapl_perf pcspkr serio_raw sg iTCO_wdt iTCO_vendor_support soundcore rfkill nf_conntrack_ipv4 nf_defrag_ipv4
Sep 29 08:05:14 fenrir kernel:  xt_conntrack shpchp intel_pch_thermal battery ac topstar_laptop sparse_keymap processor_thermal_device evdev intel_soc_dts_iosf int340x_thermal_zone ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c crc32c_generic parport_pc ppdev lp parport iptable_filter ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto btrfs xor zstd_decompress zstd_compress xxhash raid6_pq algif_skcipher af_alg dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel i915 ghash_clmulni_intel pcbc video i2c_algo_bit drm_kms_helper i2c_i801 psmouse aesni_intel prime_numbers ahci xhci_pci aes_x86_64 crypto_simd cryptd glue_helper libahci nvme xhci_hcd drm libata nvme_core usbcore scsi_mod button
Sep 29 08:05:14 fenrir kernel: CPU: 0 PID: 5774 Comm: mpv/vo Tainted: G      D W    L  4.14.0-rc2 #14
Sep 29 08:05:14 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017                      
Sep 29 08:05:14 fenrir kernel: task: ffff9247cd0d0040 task.stack: ffffa4fa83840000
Sep 29 08:05:14 fenrir kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x135/0x1a0
Sep 29 08:05:14 fenrir kernel: RSP: 0018:ffffa4fa83843c68 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Sep 29 08:05:14 fenrir kernel: RAX: 0000000000000101 RBX: 0000560690847f90 RCX: 0000000000000001
Sep 29 08:05:14 fenrir kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffa4fa80d88504
Sep 29 08:05:14 fenrir kernel: RBP: ffffa4fa83843c68 R08: 0000000000000101 R09: 0000000000000000
Sep 29 08:05:14 fenrir kernel: R10: 0000000000000002 R11: ffff9247cd0d0040 R12: ffffa4fa83843d08
Sep 29 08:05:14 fenrir kernel: R13: ffffa4fa83843d58 R14: ffffa4fa83843d90 R15: ffffa4fa80d88500
Sep 29 08:05:14 fenrir kernel: FS:  00007f5429388700(0000) GS:ffff92483ec00000(0000) knlGS:0000000000000000                              
Sep 29 08:05:14 fenrir kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 29 08:05:14 fenrir kernel: CR2: 00007f8284498510 CR3: 000000026c196004 CR4: 00000000003606f0
Sep 29 08:05:14 fenrir kernel: Call Trace:
Sep 29 08:05:14 fenrir kernel:  _raw_spin_lock+0x28/0x30
Sep 29 08:05:14 fenrir kernel:  futex_wait_setup+0x82/0x130
Sep 29 08:05:14 fenrir kernel:  futex_wait+0xed/0x260
Sep 29 08:05:14 fenrir kernel:  ? ___sys_sendmsg+0xa4/0x2e0
Sep 29 08:05:14 fenrir kernel:  do_futex+0x506/0xb10
Sep 29 08:05:14 fenrir kernel:  SyS_futex+0x13b/0x180
Sep 29 08:05:14 fenrir kernel:  entry_SYSCALL_64_fastpath+0x1e/0xa9
Sep 29 08:05:14 fenrir kernel: RIP: 0033:0x7f5454a2ff5c
Sep 29 08:05:14 fenrir kernel: RSP: 002b:00007f54293878e8 EFLAGS: 00000202 ORIG_RAX: 00000000000000ca
Sep 29 08:05:14 fenrir kernel: RAX: ffffffffffffffda RBX: 00000000001c2000 RCX: 00007f5454a2ff5c
Sep 29 08:05:14 fenrir kernel: RDX: 0000000000000002 RSI: 0000000000000080 RDI: 0000560690847f90
Sep 29 08:05:14 fenrir kernel: RBP: 00007f5429387338 R08: 0000560690847f90 R09: 0000000000009e82
Sep 29 08:05:14 fenrir kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000001c2000
Sep 29 08:05:14 fenrir kernel: R13: 00007f53ecb7fb20 R14: 00007f5429387338 R15: 0000000000000002
Sep 29 08:05:14 fenrir kernel: Code: 66 31 c0 41 39 c0 74 ea 4d 85 c9 c6 07 01 74 2d 41 c7 41 08 01 00 00 00 eb 96 83 fa 01 0f 84 f4 fe ff ff 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 5d c3 f3 90 4c 8b 09             

Sep 29 08:05:22 fenrir kernel: INFO: rcu_sched self-detected stall on CPU
Sep 29 08:05:22 fenrir kernel:         0-...: (21001 ticks this GP) idle=dfa/140000000000001/0 softirq=60475/60475 fqs=10501             
Sep 29 08:05:22 fenrir kernel:          (t=21003 jiffies g=24728 c=24727 q=1887)
Sep 29 08:05:22 fenrir kernel: NMI backtrace for cpu 0
Sep 29 08:05:22 fenrir kernel: CPU: 0 PID: 5774 Comm: mpv/vo Tainted: G      D W    L  4.14.0-rc2 #14
Sep 29 08:05:22 fenrir kernel: Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.6-a86d1b-Purism-5 07/27/2017                      
Sep 29 08:05:22 fenrir kernel: Call Trace:
Sep 29 08:05:22 fenrir kernel:  <IRQ>
Sep 29 08:05:22 fenrir kernel:  dump_stack+0x63/0x82
Sep 29 08:05:22 fenrir kernel:  nmi_cpu_backtrace+0xca/0xd0
Sep 29 08:05:22 fenrir kernel:  ? irq_force_complete_move+0x150/0x150
Sep 29 08:05:22 fenrir kernel:  nmi_trigger_cpumask_backtrace+0x10d/0x140
Sep 29 08:05:22 fenrir kernel:  arch_trigger_cpumask_backtrace+0x19/0x20
Sep 29 08:05:22 fenrir kernel:  rcu_dump_cpu_stacks+0xa3/0xd7
Sep 29 08:05:22 fenrir kernel:  rcu_check_callbacks+0x60a/0x840
Sep 29 08:05:22 fenrir kernel:  ? account_system_index_time+0x63/0x70
Sep 29 08:05:22 fenrir kernel:  ? tick_sched_do_timer+0x50/0x50
Sep 29 08:05:22 fenrir kernel:  update_process_times+0x2f/0x60
Sep 29 08:05:22 fenrir kernel:  tick_sched_handle+0x26/0x70
Sep 29 08:05:22 fenrir kernel:  ? tick_sched_do_timer+0x3f/0x50
Sep 29 08:05:22 fenrir kernel:  tick_sched_timer+0x39/0x80
Sep 29 08:05:22 fenrir kernel:  __hrtimer_run_queues+0xe4/0x260
Sep 29 08:05:22 fenrir kernel:  hrtimer_interrupt+0xa0/0x1e0
Sep 29 08:05:22 fenrir kernel:  smp_apic_timer_interrupt+0x5f/0x130
Sep 29 08:05:22 fenrir kernel:  apic_timer_interrupt+0x93/0xa0
Sep 29 08:05:22 fenrir kernel:  </IRQ>
Sep 29 08:05:22 fenrir kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x137/0x1a0
Sep 29 08:05:22 fenrir kernel: RSP: 0018:ffffa4fa83843c68 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff10
Sep 29 08:05:22 fenrir kernel: RAX: 0000000000000101 RBX: 0000560690847f90 RCX: 0000000000000001
Sep 29 08:05:22 fenrir kernel: RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffa4fa80d88504
Sep 29 08:05:22 fenrir kernel: RBP: ffffa4fa83843c68 R08: 0000000000000101 R09: 0000000000000000
Sep 29 08:05:22 fenrir kernel: R10: 0000000000000002 R11: ffff9247cd0d0040 R12: ffffa4fa83843d08
Sep 29 08:05:22 fenrir kernel: R13: ffffa4fa83843d58 R14: ffffa4fa83843d90 R15: ffffa4fa80d88500
Sep 29 08:05:22 fenrir kernel:  _raw_spin_lock+0x28/0x30
Sep 29 08:05:22 fenrir kernel:  futex_wait_setup+0x82/0x130
Sep 29 08:05:22 fenrir kernel:  futex_wait+0xed/0x260
Sep 29 08:05:22 fenrir kernel:  ? ___sys_sendmsg+0xa4/0x2e0
Sep 29 08:05:22 fenrir kernel:  do_futex+0x506/0xb10
Sep 29 08:05:22 fenrir kernel:  SyS_futex+0x13b/0x180
Sep 29 08:05:22 fenrir kernel:  entry_SYSCALL_64_fastpath+0x1e/0xa9
Sep 29 08:05:22 fenrir kernel: RIP: 0033:0x7f5454a2ff5c
Sep 29 08:05:22 fenrir kernel: RSP: 002b:00007f54293878e8 EFLAGS: 00000202 ORIG_RAX: 00000000000000ca
Sep 29 08:05:22 fenrir kernel: RAX: ffffffffffffffda RBX: 00000000001c2000 RCX: 00007f5454a2ff5c
Sep 29 08:05:22 fenrir kernel: RDX: 0000000000000002 RSI: 0000000000000080 RDI: 0000560690847f90
Sep 29 08:05:22 fenrir kernel: RBP: 00007f5429387338 R08: 0000560690847f90 R09: 0000000000009e82
Sep 29 08:05:22 fenrir kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000001c2000
Sep 29 08:05:22 fenrir kernel: R13: 00007f53ecb7fb20 R14: 00007f5429387338 R15: 0000000000000002

etc etc.
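
In case it helps anyone comparing logs, here is a rough sketch for counting how often each CPU/task pair shows up in soft lockup reports. The function name `summarize_lockups` is made up for illustration, and it assumes the log lines look like the ones pasted above.

```shell
# Summarise soft-lockup reports from kernel log text read on stdin.
# Prints one line per (CPU, task) pair, prefixed by the number of reports.
# Assumes the "BUG: soft lockup - CPU#N stuck for Ns! [task:pid]" format shown above.
summarize_lockups() {
    grep 'BUG: soft lockup' |
        sed -E 's/.*soft lockup - (CPU#[0-9]+) stuck for [0-9]+s! \[([^]]+)\].*/\1 \2/' |
        sort | uniq -c
}

# Typical use, assuming a persistent systemd journal (-k: kernel messages, -b -1: previous boot):
#   journalctl -k -b -1 | summarize_lockups
```

If the same task is stuck on the same CPU in every report, as in the traces above, that suggests a single stuck thread being reported repeatedly rather than several independent lockups.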

FWIW, I’ve also tried X11 with Option “DRI” “False” set for the card (so it shouldn’t be using any acceleration features?!), and I’m really surprised that didn’t help.

Hopefully the intel-graphics list will have an idea what this is about…

EDIT: Also found out about /sys/class/drm/card0/error thanks to the bug report you pointed to. Next time I’ll try to have a look at it before ssh stops responding. It’s tricky to get traces on this stuff…
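
Since the window before ssh dies is short, something like the following sketch could copy the error state to disk as soon as there is one. The function name `save_gpu_error` is hypothetical, and the check for a "No error state collected" placeholder is an assumption about what the file contains when nothing has been captured.

```shell
# Copy the GPU error state to a timestamped file and print the path of the copy.
# $1: error file to read (default: /sys/class/drm/card0/error)
# $2: directory to store the copy in (default: current directory)
save_gpu_error() {
    src=${1:-/sys/class/drm/card0/error}
    dest_dir=${2:-.}
    [ -r "$src" ] || return 1
    # Assumption: the driver reports this placeholder text when no error was recorded.
    if grep -q '^No error state collected' "$src"; then
        return 0
    fi
    out="$dest_dir/gpu-error-$(date +%Y%m%d-%H%M%S).txt"
    # Flush to disk right away, since the machine may freeze shortly afterwards.
    cat "$src" > "$out" && sync
    echo "$out"
}

# Possible use as root, polling until something has been captured:
#   until f=$(save_gpu_error) && [ -n "$f" ]; do sleep 5; done; echo "saved $f"
```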

1 Like

Uh, mpv just crashed with -vo null as well… So we’re actually looking at some weird CPU microcode issue? Go figure what ffmpeg does to decode h264 :confused:

No big crash this time, just mpv segfaults - but it doesn’t segfault playing the very same video with the very same software if I only have one mpv instance running, or with cpufreq governor set to performance. I’ll retry that one more time tomorrow to double check.
I guess I can give up on intel-graphics though, let’s try the Intel community “processor” forum instead… There were other threads about freezes there a while back, although prime95 with these settings didn’t crash at all here, so that may have been a different problem…

EDIT: Created a post over there, let’s see: https://communities.intel.com/thread/118352

1 Like

I experienced a major freeze today. I had Firefox open with 2 tabs running live streams, as well as Chrome open with another live stream. I am going to try to reproduce the crash and report back here.

I’ve had three freezes today alone!
The first one unlocked itself after waiting ~10 minutes but shortly thereafter it froze again and after 20 minutes I did a hard reboot.

My suspicion is that it is graphics related (wayland?) but I don’t know how to confirm or deny.

Yeah this is annoying :confused:

I was also thinking graphics at first, but I was able to get crashes without any graphics (removed all graphics drivers, rebooted in single-user mode and just ran mpv, decoding and throwing the result away, and got it to crash just as it does with wayland/gnome running).

I’m fairly sure it has to do with frequency changes - if you want a quick workaround you can run either of these commands:

# take 3 of the 4 CPUs offline and set the remaining one to the powersave governor
sudo sh -c 'for f in /sys/devices/system/cpu/cpu[123]/online; do echo 0 > "$f"; done; echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'

# or -- re-enable/leave all 4 CPUs online, but set them to the performance governor
sudo sh -c 'for cpu in /sys/devices/system/cpu/cpu[123]; do echo 1 > "$cpu/online"; echo performance > "$cpu/cpufreq/scaling_governor"; done; echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'

Both of these are working rather well for me atm. I usually stick to one CPU and it honestly is good enough for most of what I’m doing, and it’s simple enough to switch whenever I need to.
I would have liked to try something else (e.g. limiting the maximum frequency), but the scaling_max_freq tunable does not seem to have any effect…
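
To check that either workaround actually took effect, a small sketch like this prints the online state and governor of each CPU. The function name `cpu_status` is made up, and it only reads the same sysfs files the commands above write to.

```shell
# Print "cpuN online=<0|1|?> governor=<name|->" for every CPU directory.
# $1: sysfs CPU root (default: /sys/devices/system/cpu)
cpu_status() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*; do
        [ -d "$dir" ] || continue
        online='?'    # shown when there is no readable "online" file (often the case for cpu0)
        [ -r "$dir/online" ] && online=$(cat "$dir/online")
        governor='-'  # shown when no governor file is readable (may happen for offline CPUs)
        [ -r "$dir/cpufreq/scaling_governor" ] && governor=$(cat "$dir/cpufreq/scaling_governor")
        echo "${dir##*/} online=$online governor=$governor"
    done
}

# Use: cpu_status
```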

My current hunch is that the problem is power-related and triggered by frequent frequency changes. Graphics and video decoding (vector operations) are power hungry, and I noticed I only get crashes if the CPUs are “half used” (looking at the stats, I would see the frequency swing between 1 and 3GHz all the time).
If there is a sudden drop in the CPU’s input voltage, there is no shortage of bad behavior we could observe, but I’m not too sure how to confirm that.
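
For anyone who wants to see whether their machine shows the same frequency swing, here is a minimal sketch that samples cpu0's reported frequency and prints the range seen. The function name `freq_swing` is hypothetical, the defaults are arbitrary, and a fractional sleep assumes a `sleep` that accepts it (GNU coreutils does).

```shell
# Sample a CPU's current frequency and print the observed minimum and maximum.
# $1: number of samples (default 50)
# $2: pause between samples in seconds (default 0.1)
# $3: file to read (default: cpu0's scaling_cur_freq, which reports kHz)
freq_swing() {
    samples=${1:-50}
    pause=${2:-0.1}
    file=${3:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq}
    min=''
    max=''
    i=0
    while [ "$i" -lt "$samples" ]; do
        khz=$(cat "$file") || return 1
        if [ -z "$min" ] || [ "$khz" -lt "$min" ]; then min=$khz; fi
        if [ -z "$max" ] || [ "$khz" -gt "$max" ]; then max=$khz; fi
        i=$((i + 1))
        sleep "$pause"
    done
    echo "min=$min max=$max kHz"
}

# Use while a video is playing: freq_swing 100 0.05
```

A wide gap between min and max under a moderate load would match the “half used” situation described above. It would not prove the voltage theory, only show that the swing is happening.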
I’m talking with @mladen and the support team by email about how to help investigate…

4 Likes