This series covers performance only. Recommendations about configuration, availability, or anything else unrelated to performance are out of scope.
Today we start a series of posts about different performance aspects of vSAN. We will cover hardware, network, different configurations, IO behavior, etc.
If you have any suggestions or questions about these posts, please reach me in the comments and I'll try to cover them as well!
The first post is about what is probably the main factor in the performance of a vSAN cluster: the hardware. We will cover each of the components in a vSAN host/cluster and the choices we have for each of them.
vSAN Hardware Components
Disks are connected to and controlled by storage controllers. They are as important as the disks themselves, and we shouldn't choose them without a proper assessment.
It is important to keep a few things in mind about controllers:
- NVMe and 3D XPoint devices have their own embedded storage controller. These disks are connected directly over PCIe lanes. In terms of performance they are the best choice.
- It is recommended to have at least two disk groups and, if possible, a dedicated storage controller for each one. If we have two disk groups and a single controller, even if it's big enough to handle the load, it will be a single point of failure.
- The most important characteristic of a controller is its “queue depth”. The minimum supported is 256, but a higher queue depth will give us better performance.
- A low queue depth can lengthen rebuild/resync times.
- A disk group contains 1 cache device and 1 to 7 capacity devices, and a host can have up to 5 disk groups. Be sure to choose a storage controller capable of handling the resulting load.
- As we are going to disable most of the controller's features (we will configure it in JBOD or passthrough mode, disable its cache, etc.), we shouldn't choose a controller for its functionality. A cheaper controller with a higher queue depth is probably a better choice.
- It is not a good idea to mix vSAN and non-vSAN workloads on the same storage controller. (More info about this in the following KB article.)
Best practices when using vSAN and non-vSAN disks with the same storage controller (2129050)
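To get a feeling for how the disk group limits and the controller queue depth interact, here is a minimal sketch. The limits (1 cache plus 1 to 7 capacity devices per disk group, up to 5 disk groups, minimum queue depth of 256) come from the list above. The function names, the example queue depths, and the idea of splitting the queue evenly between devices are my own simplifications for illustration, not how a real controller schedules IO.

```python
# Limits taken from the list above; everything else is illustrative.
MAX_DISK_GROUPS_PER_HOST = 5
MAX_CAPACITY_DEVICES_PER_GROUP = 7
MIN_CONTROLLER_QUEUE_DEPTH = 256


def devices_behind_controller(disk_groups: int, capacity_per_group: int) -> int:
    """Count the devices (1 cache + N capacity per disk group) one controller serves."""
    if not 1 <= disk_groups <= MAX_DISK_GROUPS_PER_HOST:
        raise ValueError("a host supports 1 to 5 disk groups")
    if not 1 <= capacity_per_group <= MAX_CAPACITY_DEVICES_PER_GROUP:
        raise ValueError("a disk group holds 1 to 7 capacity devices")
    return disk_groups * (1 + capacity_per_group)


def queue_slots_per_device(controller_queue_depth: int, devices: int) -> float:
    """Average queue slots per device, assuming (simplistically) an even split."""
    if controller_queue_depth < MIN_CONTROLLER_QUEUE_DEPTH:
        raise ValueError("queue depth below the supported minimum of 256")
    return controller_queue_depth / devices


if __name__ == "__main__":
    devices = devices_behind_controller(disk_groups=2, capacity_per_group=7)
    for depth in (256, 1024):  # hypothetical controller queue depths
        slots = queue_slots_per_device(depth, devices)
        print(f"queue depth {depth}: {devices} devices, ~{slots:.0f} slots each")
```

With two full disk groups behind a single controller, a queue depth of 256 leaves roughly 16 slots per device on average, which is one way to see why a deeper queue (or one controller per disk group) helps during rebuilds and resyncs.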
There are two possible vSAN configurations: hybrid and all-flash. In a hybrid configuration flash disks are used as cache and magnetic disks are used for capacity. In an all-flash configuration flash devices are used for both cache and capacity.
Depending on the vSAN design (hybrid or all-flash), cache disks have different roles:
- In hybrid deployments the cache disks have two functions. The first is to serve as a read cache, which improves IO performance with SATA/SAS disks by reducing the number of reads from the slow magnetic disks. The second uses the remaining space as a “Write Cache”.
- In all-flash deployments the cache disks are used entirely as a write cache.
- Capacity disks store all data after it has been written to the cache layer. The process of moving data from cache to capacity is called “destaging”.
- In hybrid deployments we first try to read from the cache layer, and if the data is not available there we read directly from the capacity disk.
- In all-flash deployments we always read directly from the capacity disk.
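The read path described above can be sketched as a tiny decision function. This is a simplified model of the description in this post only. The function name and the string labels are hypothetical, and it says nothing about how vSAN implements the lookup internally.

```python
def read_tier(deployment: str, in_read_cache: bool = False) -> str:
    """Return the tier ("cache" or "capacity") that serves a read in this model."""
    kind = deployment.strip().lower()
    if kind == "hybrid":
        # Try the read cache first, fall back to the magnetic capacity disk.
        return "cache" if in_read_cache else "capacity"
    if kind == "all-flash":
        # The cache device is a write cache only, so reads go to capacity.
        return "capacity"
    raise ValueError(f"unknown deployment type: {deployment!r}")


if __name__ == "__main__":
    for kind, hit in (("hybrid", True), ("hybrid", False), ("all-flash", False)):
        print(f"{kind:9} cache hit={hit!s:5} -> read from {read_tier(kind, hit)}")
```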
The destaging process will be explained in detail in a future post of this series.
SSDs and magnetic HDDs are classified into different tiers depending on performance or endurance. It is important to take this into consideration not only for cache disks but also for capacity disks: low-tier capacity disks can heavily impact the performance of the datastore.
SSDs are classified into endurance tiers depending on how many writes they can sustain during their life. This is an estimate based on the vendor's drive warranty, but it's a good way to gauge how long the disk will last before it wears out.
It is important to use disks with a high endurance tier for the cache layer, as it is write-intensive.
TBW = Terabytes Written
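Vendors often express endurance as DWPD (drive writes per day) over the warranty period rather than as TBW. DWPD is not used elsewhere in this post, so treat it as an assumption on my side. A rough conversion, using the commonly quoted relation TBW = DWPD x capacity x 365 x warranty years, could look like this. The drive figures in the example are hypothetical.

```python
DAYS_PER_YEAR = 365


def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """Terabytes written over the warranty, given drive writes per day."""
    return dwpd * capacity_tb * DAYS_PER_YEAR * warranty_years


def years_until_worn(tbw: float, written_tb_per_day: float) -> float:
    """Estimated years before the rated TBW is reached at a steady write rate."""
    if written_tb_per_day <= 0:
        raise ValueError("written_tb_per_day must be positive")
    return tbw / (written_tb_per_day * DAYS_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical cache device: 1.6 TB, rated 3 DWPD, 5 year warranty.
    tbw = tbw_from_dwpd(dwpd=3, capacity_tb=1.6, warranty_years=5)
    print(f"rated endurance: {tbw:.0f} TBW")
    # Hypothetical workload writing 2 TB per day to that device.
    print(f"estimated life: {years_until_worn(tbw, 2.0):.1f} years")
```

The estimate is only as good as the write rate you feed it, so it is worth measuring the real write volume of the cache tier before picking an endurance tier.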
SSD performance tiers are easy to understand. The classification is based simply on the number of writes per second. The tier is usually selected based on the application's minimum requirements and on cost, as the higher performance tiers are more expensive.
Traditional HDDs are cheaper, have more capacity than SSDs, and are also more reliable, but they are drastically slower.
Magnetic disks are classified by performance according to their “Revolutions per minute” (RPM).
Personally, given the drop in NVMe and SSD prices over the last few years, I would only recommend a hybrid vSAN with magnetic disks for archiving or non-production environments.
Disk Bandwidth and device Queue
The technology of the disk and the interface used to connect it are also very important: SATA, SAS, and NVMe don't offer the same bandwidth or queue depth.
With SATA and SAS, a common storage controller manages several physical disks. We need to take into consideration the controller's queue depth as well as the device queue depth of the disks and the bandwidth. While AHCI and SATA can handle a queue depth of 32 (1 queue with 32 commands) and SAS can handle 256 (1 or 2 queues), NVMe can handle queue depths of up to 65K.
SATA disks have a different bandwidth depending on the SATA version (1.5 Gb/s, 3 Gb/s, or 6 Gb/s), and SAS can reach up to 22.5 Gb/s. NVMe disks work differently: as they are connected directly over PCIe lanes (usually PCIe x4), they have significantly higher bandwidth.
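To compare these interfaces in more familiar units, the sketch below converts a raw line rate in Gb/s into a theoretical payload ceiling in MB/s. The line encodings (8b/10b for SATA, 128b/150b for 22.5 Gb/s SAS, 128b/130b for PCIe 3.0) and the choice of PCIe 3.0 at 8 GT/s per lane are assumptions I am adding here, not figures from this post. Real-world throughput is lower because of protocol overhead.

```python
def usable_mb_per_s(line_rate_gbps: float, payload_bits: int,
                    total_bits: int, lanes: int = 1) -> float:
    """Theoretical payload bandwidth in MB/s after line-encoding overhead."""
    payload_gbps = line_rate_gbps * lanes * payload_bits / total_bits
    return payload_gbps * 1000 / 8  # Gb/s -> MB/s (decimal units)


# (line rate per lane in Gb/s, payload bits, total bits, lanes)
# The encodings and the PCIe generation are assumptions for illustration.
INTERFACES = {
    "SATA 1.5 Gb/s": (1.5, 8, 10, 1),
    "SATA 3 Gb/s": (3.0, 8, 10, 1),
    "SATA 6 Gb/s": (6.0, 8, 10, 1),
    "SAS 22.5 Gb/s": (22.5, 128, 150, 1),
    "NVMe on PCIe 3.0 x4": (8.0, 128, 130, 4),
}

if __name__ == "__main__":
    for name, (rate, payload, total, lanes) in INTERFACES.items():
        print(f"{name:20} ~{usable_mb_per_s(rate, payload, total, lanes):.0f} MB/s")
```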
There is a huge difference between the observed latency of SATA/SAS disks (which can be up to 3-4 ms) and that of NVMe, which has very low latency (0.5-0.7 ms). This must be taken into consideration depending on the application consuming vSAN and on whether we are using the disk as capacity or cache.
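Queue depth and latency together put a ceiling on the IOPS a device can deliver. A quick way to reason about it is Little's Law (outstanding IOs = IOPS x latency). The sketch below applies it to the figures mentioned in this section. It assumes the queue is always full and latency stays constant under load, which real devices don't do, so read the result as an upper bound rather than a prediction.

```python
def max_iops(queue_depth: int, latency_ms: float) -> float:
    """Upper bound on IOPS with `queue_depth` IOs in flight (Little's Law)."""
    if queue_depth < 1 or latency_ms <= 0:
        raise ValueError("queue_depth must be >= 1 and latency_ms > 0")
    return queue_depth * 1000.0 / latency_ms


def required_queue_depth(target_iops: float, latency_ms: float) -> float:
    """Outstanding IOs needed to sustain `target_iops` at a given latency."""
    if target_iops < 0 or latency_ms <= 0:
        raise ValueError("target_iops must be >= 0 and latency_ms > 0")
    return target_iops * latency_ms / 1000.0


if __name__ == "__main__":
    # Same queue depth (32), latencies taken from the paragraph above.
    print(f"3 ms   device: {max_iops(32, 3.0):,.0f} IOPS at most")
    print(f"0.5 ms device: {max_iops(32, 0.5):,.0f} IOPS at most")
    # Hypothetical target of 100,000 IOPS at 0.5 ms latency.
    print(f"IOs in flight needed: {required_queue_depth(100_000, 0.5):.0f}")
```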
3D XPoint (Intel Optane, QuantX) is an NVM disk technology with ultra-low latency and higher speed than regular NVMe. These disks are quite expensive, but they are really useful for critical applications or highly intensive workloads.
We will learn about the IO behavior in a future post dedicated to this topic.
The minimum requirements for vSAN are:
- Dedicated 1Gbps for hybrid configurations.
- Dedicated or shared 10Gbps for all-flash configurations
We need to treat these numbers only as a minimum. We also need to think about how we are going to design the virtual/physical network: it has to be capable of handling the load we require in each case.
Imagine we configure a 10Gbps network shared with other VMs or services and then spend a lot of money on many hosts full of expensive disks. This will probably create a bottleneck in the network that impacts not only vSAN performance but also the performance of the VM network and other services.
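A back-of-the-envelope check makes the bottleneck scenario easier to see. The sketch below compares the raw bandwidth of a link with the combined throughput of a host's disks. The per-disk throughput figures and the fraction of the link left for vSAN are hypothetical. The calculation also ignores protocol overhead and the fact that not every IO crosses the network, so it only shows the order of magnitude.

```python
from typing import Sequence


def link_mb_per_s(link_gbps: float, vsan_fraction: float = 1.0) -> float:
    """Raw link bandwidth in MB/s available to vSAN (decimal units)."""
    if not 0 < vsan_fraction <= 1:
        raise ValueError("vsan_fraction must be in (0, 1]")
    return link_gbps * 1000 / 8 * vsan_fraction


def network_is_bottleneck(link_gbps: float, vsan_fraction: float,
                          device_mb_per_s: Sequence[float]) -> bool:
    """True if the disks could push more data than the link can carry."""
    return sum(device_mb_per_s) > link_mb_per_s(link_gbps, vsan_fraction)


if __name__ == "__main__":
    disks = [2000.0] * 4  # hypothetical: four devices at ~2000 MB/s each
    share = 0.5           # hypothetical: half of a shared 10Gbps link for vSAN
    print(f"link for vSAN: {link_mb_per_s(10, share):.0f} MB/s")
    print(f"disks combined: {sum(disks):.0f} MB/s")
    print(f"network is the bottleneck: {network_is_bottleneck(10, share, disks)}")
```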
If the uplinks are shared with other traffic, it is recommended to configure them in an active-passive configuration where:
- vSAN uses one uplink and has the second uplink as “Standby”.
- The rest of the services use the second uplink and have the first uplink as “Standby”.
- Network IO Control (NIOC) is configured as well. This way, if a failure in one of the uplinks puts all services on the same uplink, NIOC will prioritize vSAN over the others.
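To illustrate what the prioritization means when every service lands on the same uplink, here is a minimal sketch of share-based allocation: under contention, each traffic type gets bandwidth in proportion to its shares. The traffic names and share values are examples I picked, and the function is a plain proportional split, not a model of the actual NIOC scheduler. Shares only matter when the uplink is saturated.

```python
from typing import Dict


def allocate_by_shares(link_gbps: float, shares: Dict[str, int]) -> Dict[str, float]:
    """Split a saturated link between traffic types in proportion to their shares."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one traffic type needs a positive share")
    return {name: link_gbps * value / total for name, value in shares.items()}


if __name__ == "__main__":
    # Hypothetical share values after a failover puts everything on one 10Gbps uplink.
    shares = {"vsan": 100, "vm": 50, "vmotion": 50}
    for name, gbps in allocate_by_shares(10, shares).items():
        print(f"{name:8} {gbps:.1f} Gbps")
```

With these example values vSAN keeps half of the surviving uplink, while VM and vMotion traffic split the rest.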
Example vSAN Hardware Design
Based on my personal experience, we can create a table as an example of what the hardware design could be for different kinds of clusters, depending on the requirements of the applications running on them.
As we can see in the table, I think it is a good idea to separate applications based on their performance needs. For example, it's not a good idea to mix critical applications with non-production ones, because the hardware that critical applications require is expensive and using it for non-production workloads wouldn't be cost-effective.
vSAN Ready Nodes
As you can see, choosing all the components for vSAN is not an easy task. We have many options, and we need to be sure we don't create bottlenecks in the different layers. That's one of the reasons why buying vSAN ReadyNodes is a good idea.
vSAN ReadyNodes are tested with specific hardware configurations to verify that the combination of hardware works well with the number of disk groups, the number of disks of each class/tier, the model/specifications of the controllers, etc.
For more information about the different vSAN ReadyNodes:
vSAN Hardware Quick Reference Guide
vSAN ReadyNode Configurator