Swap file, or swap partition?

DHS · October 15, 2020, 10:06pm

I’m about to do a PureOS installation (onto a laptop with SSD). Do I need a swap partition, or would a swap file work just as well?
As far as I know, swap is useful for:

1. Avoiding problems when a program requests more memory than is available. Both swapfiles and swap partitions work for this.
2. Hibernate (suspend to disk). This requires a swap partition; hibernating to a swapfile is experimental only.
The second use of swap is mostly moot in PureOS 9 and 10, since there isn’t a simple way to enable hibernation; see https://tracker.pureos.net/T753.

The main advantage of a swapfile is flexibility: it’s relatively easy to change its size.
The main advantages of a swap partition are faster performance (though this applies only to rotational media, not SSDs, and the difference has been small since Linux 2.6) and greater reliability (also less relevant on modern systems).
Any important issues I missed?

There’s a little bit of discussion and instructions here, for Ubuntu:
https://askubuntu.com/questions/904372/swap-partition-vs-swap-file

kieran · October 16, 2020, 1:20am

That’s a fair summary.

Honestly, if you don’t need hibernate and your computer has plenty of RAM, you might not need swap at all.

As you say, a swap partition is inflexible, particularly if it is on the boot disk. (I have a couple of computers where the boot disk is an SSD and there is a second, rotating disk from which I have carved out a partition for swap. The swap is never used, so the second disk spins down, and the setup is a bit more flexible if I want to change things around.)

One other consideration is encryption. If you are encrypting your swap (and you should be if you have it at all) then make sure that your chosen option (file or partition) supports encryption.

lperkins2 · October 16, 2020, 2:57am

That is a fair summary of partition vs file, but there are a few other considerations.

Minor Notes

First is filesystem support. If you want a swapfile on Btrfs before the 5.0 kernel, you must use a loopback device. With 5.0+, you can skip the loopback, but you must mark the file no-COW (copy-on-write disabled).
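
For the 5.0+ case, the order of operations matters: the no-COW attribute has to be set while the file is still empty. Here is a rough Python sketch that only builds the command list and runs nothing. The function name, path and size are placeholders of mine, and it assumes a single-device Btrfs filesystem and the usual truncate / chattr +C / allocate / mkswap / swapon sequence rather than anything PureOS-specific:

```python
def btrfs_swapfile_commands(path="/swapfile", size_mib=2048):
    """Return the commands (as argv lists) for a Btrfs swap file.

    Assumes kernel 5.0+ and a single-device Btrfs filesystem; 'path' and
    'size_mib' are illustrative defaults, not recommendations.
    """
    return [
        ["truncate", "-s", "0", path],   # start from an empty file
        ["chattr", "+C", path],          # no-COW must be set while empty
        ["dd", "if=/dev/zero", f"of={path}", "bs=1M", f"count={size_mib}"],
        ["chmod", "600", path],          # swap should not be world-readable
        ["mkswap", path],
        ["swapon", path],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in btrfs_swapfile_commands():
        print(" ".join(argv))
```

Printing rather than executing is deliberate: every one of these needs root, and you probably want to read them before running anything.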

Second, as a small note, the Linux kernel allows (and defaults to) memory overcommit, which means that swap is not needed just because a program requests more memory than is available. On most modern systems (Linux included), a check like if (some_malloced_ptr == NULL) is almost never true, as malloc (and similar) don’t actually allocate any memory. They just ask the kernel for virtual pages, effectively a reserved range of address space to put stuff in. It’s only when a program actually writes to a page returned from malloc that the kernel commits physical memory to the process (one page at a time). If it doesn’t have a free page to hand out, it dumps disk cache, pushes idle pages to swap, or triggers the out-of-memory killer (OOMK).

This means that you can be totally “proper” with your programming, verify that malloc doesn’t return null, and still get your program killed by the OOMK (there’s a way to control that, but that’s a little far off topic). Note that there’s also a way to force your program to actually get the memory it requests, typically via mlock.
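
If you want to check which overcommit policy your own machine is using, the setting lives in /proc/sys/vm/overcommit_memory. A minimal Python sketch that reads and interprets it; the function name is mine, and the descriptions are my paraphrase of the three kernel modes:

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the default): only obviously excessive requests are refused",
    1: "always overcommit: allocation requests are not checked",
    2: "strict accounting: requests beyond the commit limit are refused",
}


def describe_overcommit(raw):
    """Interpret the contents of /proc/sys/vm/overcommit_memory."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


if __name__ == "__main__":
    try:
        with open("/proc/sys/vm/overcommit_memory") as f:
            print(describe_overcommit(f.read()))
    except OSError:
        # Not Linux, or /proc is not mounted.
        print("overcommit setting not available on this system")
```

In modes 0 and 1, the malloc null check described above almost never fires. Mode 2 is the one where a failed allocation is actually reported to the program.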

ZRam

Anyway, after all of that, the “Swap file vs swap partition” question is almost always best answered by “neither”. Because modern Linux uses memory overcommit, it’s quite common for a memory pressure issue to manifest itself suddenly and severely. Also, even modern SSDs are about 2 orders of magnitude slower than memory access (NVMe drives are about 1 order of magnitude slower). Given that everything outside the kernel, including init and your window system, will be joining the fight for memory, performance will quickly grind to a halt if you have a decently provisioned system and end up needing swap (even just getting low enough on memory that you have to dump most of your disk cache is painful).

This is where zram comes in. It provides virtual compressed block devices, which can be used for swap (or for tmpfs). Its speed depends on how much CPU power you have available, but it is usually about on par with NVMe storage for desktops and about 1/4 to 1/2 that speed for old laptops. It generally averages about a 3:1 compression ratio. It should be obvious that it does not work for hibernation, since its contents live in RAM.
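
To get a feel for what that ratio buys you, here is a back-of-the-envelope Python sketch. It assumes the whole zram share fills with pages that compress at the stated ratio, which real workloads won’t match exactly; the function and parameter names are just for illustration:

```python
def effective_memory_gib(ram_gib, zram_ram_gib, ratio=3.0):
    """Estimate how much data fits in RAM when part of it backs zram.

    ram_gib      -- physical RAM
    zram_ram_gib -- physical RAM allowed to fill with compressed pages
    ratio        -- assumed compression ratio (3.0 means 3:1)
    """
    if not 0 <= zram_ram_gib <= ram_gib:
        raise ValueError("zram_ram_gib must be between 0 and ram_gib")
    uncompressed = ram_gib - zram_ram_gib       # RAM left for ordinary pages
    compressed_capacity = zram_ram_gib * ratio  # data held in the zram share
    return uncompressed + compressed_capacity


if __name__ == "__main__":
    # 16 GiB of RAM, letting compressed pages occupy up to 4 GiB of it.
    print(effective_memory_gib(16, 4))  # 24.0
```

So giving up 4 GiB of a 16 GiB machine to compressed pages behaves roughly like having 24 GiB, at the cost of CPU time on every swapped page.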

The zram-init package makes using zram for a basic configuration easy. If you want to get fancier with it, you can actually give zram a backing device (a partition or a loop device, no backing files) and it will write compressed pages to the backing device. (The only catch is that you have to tell it when to do so, as there isn’t idle page support.)

CGroups

In any case, whether you use swap or zram, if you have any kind of swap enabled, you need to set up a memory cgroup. Whatever is likely to cause a low memory condition should be isolated inside a cgroup. If you are lazy, just put everything that isn’t bash on tty1 into a memory cgroup. Then limit the memory cgroup to, say, 100MB less than your physical memory. That way, if you accidentally memory-bind your computer, you can switch to tty1, log in and clean up whatever the issue is.

If you know what is likely to cause issues (in my case, large computational fluid dynamics simulations), you can stick just those in a cgroup, and the machine can go to swapping hard without degrading your user experience.
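
The “100MB less than physical memory” figure is easy to compute from /proc/meminfo. A minimal Python sketch, assuming you are on cgroup v2, where the value would go into the group’s memory.max file; the function name is mine and the script only prints the number, it doesn’t write anything:

```python
import re

HEADROOM_BYTES = 100 * 1024 * 1024  # the "100MB less" from the text


def cgroup_limit_bytes(meminfo_text, headroom=HEADROOM_BYTES):
    """Compute a memory limit from the contents of /proc/meminfo.

    MemTotal is reported in kB, which here means units of 1024 bytes.
    """
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found")
    total_bytes = int(match.group(1)) * 1024
    return total_bytes - headroom


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            # On cgroup v2 this number would be written to the group's
            # memory.max file (the group name is whatever you created).
            print(cgroup_limit_bytes(f.read()))
    except OSError:
        print("/proc/meminfo not available on this system")
```

If you only fence off the known troublemakers instead of everything, the same calculation works with a bigger headroom.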

A final aside

If you want to see overcommit in action, check out this cool trick: https://vector-of-bool.github.io/2018/11/06/dumbest-allocator.html

2disbetter · October 16, 2020, 6:53am

Thank you all for these comments. I was recently talking about some of this on the Matrix channels, and I find all of this very useful and helpful information.

tracy · October 16, 2020, 1:22pm

Ah, the good old days, when memory was so short that even a 256K swap on a single cylinder of a 20MB disk pack was worth it.