That is a fair summary of partition vs file, but there are a few other considerations.
Minor Notes
First is filesystem support. If you want a swapfile on Btrfs before the 5.0 kernel, you must use a loopback device. With 5.0+, you can skip the loopback, but the file must be marked NOCOW (for example with chattr +C while it is still empty) and must not be compressed.
Second, as a small note, the Linux kernel allows (and defaults to) memory overcommit, which means that swap is not needed just because a program requests more memory than is available. On most modern systems (Linux included), the if (some_malloced_ptr == NULL) check is almost never true, as malloc (and similar) doesn’t actually allocate any physical memory. It just asks the kernel for virtual pages, effectively a reservation of address space to put stuff in. It’s only when a program actually writes to a page returned from malloc that the kernel commits memory to the process (one page at a time). If there isn’t a page already allocated for the process, the kernel allocates one. If it doesn’t have one to allocate, then it dumps disk cache, pushes idle pages to swap, or triggers the out-of-memory killer (OOMK). This means that you can be totally “proper” with your programming, verify that malloc doesn’t return NULL, and still get your program killed by the OOMK (there’s a way to control that, but that’s a little far off topic). Note that there’s also a way to force your program to actually get the memory it requests, typically via mlock.
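To make the lazy commit described above a bit more concrete, here is a minimal C sketch. It assumes Linux (it reads /proc/self/statm), and the helper names resident_bytes and resident_growth_after_touch are made up for illustration. The idea is to reserve a block with malloc, optionally write to some of its pages, and see how much the resident set actually grew.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Resident set size of this process in bytes, read from /proc/self/statm
 * (Linux-specific). Returns -1 if the file cannot be read or parsed. */
long resident_bytes(void)
{
    long total_pages = 0, resident_pages = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    int fields = fscanf(f, "%ld %ld", &total_pages, &resident_pages);
    fclose(f);
    if (fields != 2)
        return -1;
    return resident_pages * sysconf(_SC_PAGESIZE);
}

/* Hypothetical helper: malloc `reserve` bytes, write one byte into every page
 * of the first `touch` bytes, and return how much the resident set grew while
 * the block was still allocated. Returns -1 on error. */
long resident_growth_after_touch(size_t reserve, size_t touch)
{
    long page = sysconf(_SC_PAGESIZE);
    long before = resident_bytes();
    if (before < 0 || page <= 0 || touch > reserve)
        return -1;

    char *block = malloc(reserve);
    if (block == NULL)          /* the "proper" check; rarely fires under overcommit */
        return -1;

    /* volatile keeps the compiler from optimizing the writes away */
    volatile char *writer = block;
    for (size_t off = 0; off < touch; off += (size_t)page)
        writer[off] = 1;

    long after = resident_bytes();
    free(block);
    return after < 0 ? -1 : after - before;
}
```

On a typical Linux box I would expect resident_growth_after_touch(64UL << 20, 0) to report next to nothing, while touching the first 32 MiB of the same reservation should report growth of roughly that size. The exact numbers depend on the allocator and on transparent huge pages, so treat them as ballpark.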
ZRam
Anyway, after all of that, the “Swap file vs swap partition” question is almost always best answered by “neither”. Because modern Linux uses memory overcommit, it’s quite common for a memory pressure issue to manifest itself suddenly and severely. Also, even modern SSDs are about two orders of magnitude slower than memory access (NVMe drives are about one order of magnitude slower). Given that everything outside the kernel, including init and your window system, will be entering the memory contention fray, performance will quickly grind to a halt if you have a decently provisioned system and end up needing swap (even just getting low enough on memory that you have to dump most of your disk cache is painful).
This is where zram comes in. It provides virtual compressed block devices that live in RAM, which can be used for swap (or formatted and mounted where you would otherwise use a tmpfs). In terms of speed, it depends on how much CPU power you have available, but it is usually about on par with NVMe storage for desktops and about 1/4 to 1/2 that speed for old laptops. It generally averages a 3:1 compression ratio. It should be obvious that it does not work for hibernation.
The zram-init package makes using zram for a basic configuration easy. If you want to get fancier with it, you can actually give zram a backing device (a partition or a loop device, no backing files) and it will write compressed pages to the backing device. The only catch is that you have to tell it when to do so; it will not write idle pages back on its own.
CGroups
In any case, whether you use disk swap or zram, if you have any swap enabled you should set up a memory cgroup. Whatever is likely to cause a low memory condition should be isolated inside a cgroup. If you are lazy, just put everything that isn’t bash on tty1 into a memory cgroup. Then limit the memory cgroup to, say, 100MB less than your physical memory. That way, if you accidentally memory-bind your computer, you can switch to vt1, log in and clean up whatever the issue is.
If you know what is likely to cause issues (in my case, large computational fluid dynamics simulations), you can stick just those in a cgroup, and the machine can swap hard without your user experience degrading.
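For the “100MB less than your physical memory” limit, here is a rough C sketch of the arithmetic. It assumes glibc on Linux (_SC_PHYS_PAGES is an extension), treats 100MB as 100 MiB, and assumes a cgroup v2 hierarchy where the limit goes into the group's memory.max file. The helper names and HEADROOM_BYTES are made up for illustration, and the path you pass to write_memory_max depends on where you created your group.

```c
#include <stdio.h>
#include <unistd.h>

/* Assumed headroom to leave for the rescue shell: 100 MiB. */
#define HEADROOM_BYTES (100LL * 1024 * 1024)

/* Physical RAM in bytes, or -1 if sysconf cannot report it. */
long long physical_memory_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages < 0 || page_size < 0)
        return -1;
    return (long long)pages * page_size;
}

/* Limit for the cgroup: physical memory minus the headroom.
 * Returns -1 if the machine has no more memory than the headroom itself. */
long long cgroup_limit_bytes(long long physical, long long headroom)
{
    if (physical <= headroom)
        return -1;
    return physical - headroom;
}

/* Write the limit to a cgroup v2 memory.max file. The caller supplies the
 * path; writing there normally needs root. Returns 0 on success, -1 on error. */
int write_memory_max(const char *memory_max_path, long long limit)
{
    FILE *f = fopen(memory_max_path, "w");
    if (f == NULL)
        return -1;
    int ok = fprintf(f, "%lld\n", limit) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

You would call cgroup_limit_bytes(physical_memory_bytes(), HEADROOM_BYTES) and hand the result to write_memory_max along with the memory.max path of the group that holds everything except your rescue shell. On cgroup v1 the file name is different (memory.limit_in_bytes), so adjust accordingly.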
A final aside
If you want to see the overcommit in action, for a cool trick, check out https://vector-of-bool.github.io/2018/11/06/dumbest-allocator.html