My server has 8Gigs of RAM and 8Gigs configured for swap file. I have memory intensive apps running. These apps have peak loads during which we find swap usage increase. Approximately 1 GIG of swap is used.
I have another server with 4Gigs of RAM and 8 Gigs of swap and similar memory intensive apps running on it. But here swap usage is very negligible. Around 100 MB.
I was wondering what are the exact conditions or a rough formula based on which Linux will do a swapout of a process memory in RAM to the swap file. I know its based on swapiness factor. What else is it based on? Swap file size? Any pointers to Linux kernel documentation/source code explaining this will be great.
Swap space in Linux is used when the amount of physical memory (RAM) is full. If the system needs more memory resources and the RAM is full, inactive pages in memory are moved to the swap space. While swap space can help machines with a small amount of RAM, it should not be considered a replacement for more RAM.
Memory swapping is a memory reclamation method wherein memory contents not currently in use are swapped to a disk to make the memory available for other applications or processes. The exact state or "page" of memory is copied to the disk to make the data contiguous and easy to restore later.
Swappiness refers to the kernel parameter responsible for how much and how often that the system moves data from RAM to swap memory. The default value for swappiness is 60; however, you can manually set it anywhere between 0-100.
There are two options when it comes to creating a swap space. You can create a swap partition or a swap file. Most Linux installations come preallocated with a swap partition. This is a dedicated block of memory on the hard disk utilized when the physical RAM is full.
I've seen a lot of people posting subjective explanations of what this does. Here is hopefully a more full answer.
In the split LRU on post 2.6.28 Linux swappiness is a multiplier used to arbitrarily modify the fraction that is calculated determining the pressure built up in both LRUs.
So, for example on a system with no free memory left - the value of the existing memory you have is measured based off of the rate of how much memory is listed as 'Active' and the rate of how often pages are promoted to active after falling into the inactive list.
An LRU with many promotions/demotions of pages between active and inactive is in a lot of use.
Typically file backed storage is cheaper and safer to evict when your running out of memory and automatically is given a modifier of 200 (this makes file backed memory 200 times more worthless than swap backed memory (Which has a value of 0) when it multiplies this fraction.
What swappiness does is modify this value by deducting the swappiness number you gave (default 60) to file memory and adding the swappiness value you gave as a multiplier to anon memory. Thus the default swappiness leaves you with anonymous memory being 80 times more valuable than file memory (200-60 for file, 0+60 for anon). Thus, on a typical linux system that has used up all its memory, page cache would have to be 80 TIMES more active than anonymous memory for anonymous memory to be swapped out in favour of page cache.
If you set swappiness to 100 this gives anon a modifier of 100 and file memory a modifier of 100 (200 - 100) leaving both LRUs equally weighted. Thus on a file heavy system that wants page cache providing the anon memory is not as active as page cache then anon memory will be swapped to disk to make space for extra page cache.
Linux (or any other OS) divides memory up into pages (typically 4Kb). Each of these pages represent a chunk of memory. Usage information for these pages is maintained, which basically contains info about whether the page is free or is in use (part of some process), whether it has been accessed recently, what kind of data it contains (process data, executable code etc.), owner of the page, etc. These pages can also be broadly divided into two categories - filesystem pages or the page cache (in which all data read/written to your filesystem resides) and pages belonging to processes.
When the system is running low on memory, the kernel starts swapping out pages based on their usage. Using a list of pages sorted w.r.t recency of access is common for determining which pages can be swapped out (linux kernel has such a list too).
During swapping, Linux kernel needs to decide what to trade-off when nuking pages in memory and sending them to swap. If it swaps filesystem pages too aggressively, more reads are required from the filesystem to read those pages back when they are needed. However, if it swaps out process pages more aggressively it can hurt interactivity, because when the user tries to use the swapped out processes, they will have to be read back from the disk. See a nice discussion here on this.
By setting swappiness = 0, you are telling the linux kernel not to swap out pages belonging to processes. When setting swappiness = 100 instead, you tell the kernel to swap out pages belonging to processes more aggressively. To tune your system, try changing the swappiness parameter in steps of 10, monitoring performance and pages being swapped in/out at each setting using the "vmstat" command. Keep the setting that gives you the best results. Remember to do this testing during peak usage hours. :)
For database applications, swappiness = 0 is generally recommended. (Even then, test different settings on your systems to arrive to a good value).
References:
http://www.linuxvox.com/2009/10/what-is-the-linux-kernel-parameter-vm-swappiness/ 
http://www.pythian.com/news/1913/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With