A while back, we benchmarked our Proxmox infrastructure in various ways, mainly to experiment and to get the most out of the hardware we had. We wrote two articles on the subject to share our results and conclusions, but we did not take the time to share all of them. To continue the series, this article shares our findings regarding a ZFS tuning parameter that can have a significant impact on your Proxmox infrastructure. The parameter in question is the primarycache option. It is not available in the Proxmox GUI; you must use the CLI to change its value, and you can configure it per ZFS dataset.
Here is what the ZFS manual has to say about this option:
primarycache=all | none | metadata
Controls what is cached in the primary cache (ARC). If this property is set
to all, then both user data and metadata is cached. If this property is set
to none, then neither user data nor metadata is cached. If this property is
set to metadata, then only metadata is cached. The default value is all.
From this description, one would think caching is always better and we should enable it. Wrong. In a virtual machine, if you give it enough memory, the guest OS is already caching the file system data. The guest OS can also make better decisions regarding what needs to be cached, since it is closer to the application. Enabling the ZFS primarycache for virtual machines is therefore not useful, because it creates two caches: one in the guest OS and another in the host OS. With this setup, it is very likely that the same data is stored twice in memory. People may argue that the ARC (adaptive replacement cache) has better algorithms for caching, but it is still a waste, because the guest OS does not have direct access to the ARC.
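You can see this duplication for yourself. A minimal sketch, assuming a Linux guest running on a zvol with primarycache=all: while the guest reads a large file, both caches grow with the same data.
# inside the guest: the buff/cache column grows as the guest caches the reads
$ free -h
# on the Proxmox host: the ARC size (in bytes) grows as well
$ awk '$1 == "size" {print $3}' /proc/spl/kstat/zfs/arcstats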
As for LXC, it is a bit different: LXC containers do have direct access to the ARC. The performance boost provided by the primarycache highly depends on your workload. One would think primarycache=all for LXC should be beneficial, but with our benchmarks we observed different results. To check whether primarycache=all provides a benefit for your workload, the best approach is to test it and use the various ARC statistics to verify whether the ARC is actually being used. Have a look at /usr/sbin/arcstat.py and /usr/sbin/arc_summary.py.
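For example, to watch ARC activity while your benchmark runs (a sketch; on newer ZFS-on-Linux releases these scripts ship as arcstat and arc_summary, without the .py suffix):
# print ARC hits, misses and size every second
$ /usr/sbin/arcstat.py 1
# one-shot summary of ARC size and hit ratios
$ /usr/sbin/arc_summary.py | less
If the hit counters barely move during your test, the ARC is not helping that workload and primarycache=metadata is the safer choice.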
To change this option, you must first identify the right dataset to update:
$ sudo zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 111G 369G 140K /rpool
rpool/ROOT 30.9G 369G 140K /rpool/ROOT
rpool/ROOT/pve-1 30.9G 369G 30.9G /
rpool/data 70.4G 369G 6.98G /rpool/data
rpool/data/subvol-116-disk-1 2.14G 5.86G 2.14G /rpool/data/subvol-116-disk-1
rpool/data/subvol-117-disk-1 2.15G 5.85G 2.15G /rpool/data/subvol-117-disk-1
rpool/data/subvol-120-disk-1 2.14G 5.86G 2.14G /rpool/data/subvol-120-disk-1
rpool/data/subvol-125-disk-1 3.37G 28.6G 3.37G /rpool/data/subvol-125-disk-1
rpool/data/vm-112-disk-1 21.5G 369G 21.0G -
rpool/data/vm-114-disk-1 8.02G 369G 8.02G -
rpool/data/vm-119-disk-1 16.9G 369G 16.4G -
rpool/data/vm-121-disk-1 1.96G 369G 1.96G -
rpool/data/vm-121-disk-2 5.57M 369G 5.57M -
rpool/data/vm-122-disk-1 1.45G 369G 1.45G -
rpool/data/vm-123-disk-1 1.46G 369G 1.46G -
rpool/data/vm-124-disk-1 1.46G 369G 1.46G -
rpool/subvol-108-disk-1 1.03G 7.13G 893M /rpool/subvol-108-disk-1
rpool/swap 8.50G 375G 2.77G -
In our environment, rpool/data is the storage for Proxmox virtual machines and LXC containers. If you want to change this option for your whole environment, you can set it on rpool/data and let the children inherit it. Otherwise, you can change the option for a specific VM or container by setting the value on its individual dataset.
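For instance, to apply the setting to the whole environment (a sketch using our pool layout; children inherit the value unless they have a local one of their own):
$ sudo zfs set primarycache=metadata rpool/data
In the zfs get output below, the datasets covered by this command report "inherited from rpool/data" as their source, while the ones we tuned individually report "local".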
$ sudo zfs get primarycache
NAME PROPERTY VALUE SOURCE
rpool primarycache all default
rpool/ROOT primarycache all default
rpool/ROOT/pve-1 primarycache all default
rpool/data primarycache metadata local
rpool/data/subvol-116-disk-1 primarycache metadata inherited from rpool/data
rpool/data/subvol-117-disk-1 primarycache metadata inherited from rpool/data
rpool/data/subvol-120-disk-1 primarycache metadata inherited from rpool/data
rpool/data/subvol-125-disk-1 primarycache all local
rpool/data/vm-112-disk-1 primarycache metadata inherited from rpool/data
rpool/data/vm-114-disk-1 primarycache metadata inherited from rpool/data
rpool/data/vm-119-disk-1 primarycache metadata inherited from rpool/data
rpool/data/vm-121-disk-1 primarycache all local
rpool/data/vm-121-disk-2 primarycache metadata inherited from rpool/data
rpool/data/vm-122-disk-1 primarycache metadata inherited from rpool/data
rpool/data/vm-123-disk-1 primarycache metadata inherited from rpool/data
rpool/data/vm-124-disk-1 primarycache metadata inherited from rpool/data
rpool/subvol-108-disk-1 primarycache all default
rpool/swap primarycache metadata local
$ sudo zfs set primarycache=metadata rpool/data/vm-112-disk-1
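The change applies to new reads right away; no reboot is needed. You can verify the value, or revert to inheriting from the parent dataset:
$ sudo zfs get primarycache rpool/data/vm-112-disk-1
$ sudo zfs inherit primarycache rpool/data/vm-112-disk-1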
FS-Mark - 1000 Files, 1MB Size (More is better)
With this test we don't see a big difference between the two options. Still, it's enough to showcase the benefit of using primarycache=metadata for LXC.
Threaded I/O Tester - 64MB Random Read - 32 Threads (More is better)
With this test we clearly see how LXC can benefit from setting primarycache to metadata. With KVM, on the other hand, we see little to no benefit.
As you can see in the results, the primarycache option does have an impact on performance, but not for every workload. In some tests we don't see any difference, while in other tests it provides a boost of more than 200%.
With all this information, you might be unsure whether enabling the primarycache is a good idea and which option is better for you. Here is our rule of thumb: set all VMs and LXC containers to primarycache=metadata, and only for very, very specific workloads set them to primarycache=all.
With this setting, your system is not wasting memory by keeping a second copy of your data in the ARC, and that memory can be used for something else, like giving more memory to your VMs.