Storage Caching and VDI

I recently had the chance to do some testing with PernixData’s vSphere solution and it was very educational.  It worked great, doing exactly what it was supposed to do… and it also got me thinking about the nature of storage caching.

For just about forever, caching has been the technique that allows storage devices to provide the blazing performance that we all demand. Monolithic SANs always have some amount of cache on their controllers. What exactly does that cache do? As the name implies, it caches data. On those SANs, we would typically assign some amount of read cache and some amount of write cache (I’ve typically biased my vSphere storage devices heavily towards write cache, but opinions vary). When a write request comes in, it is very quickly written to that cache and the acknowledgement is sent to the device that is performing the write. The SAN’s job is then to destage the data from the cache onto the disks for long-term storage, which can only happen as fast as those disks can write data.

The write cache is like an IOPS loan from the disk system. You can perform 10,000 write operations in one second, even though the disks are only capable of 2,500 write operations per second. That means that the SAN is going to spend 4 seconds in total destaging that data from the cache. You’ve effectively borrowed 3 additional seconds’ worth of writes from the disk device; they must still occur, and the device will still be busily writing data for 3 seconds after the write operation has seemingly completed.
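The loan arithmetic can be sketched in a few lines of Python. This is only an illustration of the numbers above, not a model of any real SAN: the function name and its parameters are made up for the example, and it assumes the disks start destaging the moment the burst begins and sustain a constant write rate.

```python
def destage_seconds(burst_writes, disk_write_iops, burst_seconds=1.0):
    """Return (total_seconds, borrowed_seconds) for one burst of cached writes.

    Hypothetical helper for illustration only. Assumes destaging starts
    when the burst starts and the disks write at a constant rate.
    """
    # Time the disks need to write everything that landed in the cache.
    total = burst_writes / disk_write_iops
    # Time the disks keep working after the burst has been acknowledged.
    borrowed = max(0.0, total - burst_seconds)
    return total, borrowed


total, borrowed = destage_seconds(burst_writes=10_000, disk_write_iops=2_500)
print(f"Destaging takes {total:.0f} s, {borrowed:.0f} s of it borrowed")
```

With the figures from the paragraph above, this gives 4 seconds of destaging, 3 of which happen after the writes were acknowledged.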

This is great! Storage access is generally spiky; some system needs to read or write a bunch of data and it wants those operations completed ASAP. The cache allows this to happen, ‘completing’ that storage operation ASAP so that the system can then go about its business analyzing that data or doing whatever it needs to do. Once that operation is completed, there’s probably going to be some amount of “idle” time… time that the SAN can use to get caught up and destage the data that’s sitting in its cache. Cache solutions don’t create IOPS out of thin air; they allow you to use future IOPS now.

When we’re planning a VDI solution, we have awesome tools that give us tons of information about the resource requirements of our target desktops. One of the vital statistics is IOPS. If we find that a typical desktop uses, on average, 20 IOPS while users are logged in, we can extrapolate and say that 1,000 such users will need a storage system that supports at least 20,000 IOPS. At this point, anyone who has planned a VDI solution is shaking their head, as planning for the average is a recipe for disaster.

If storage use in general is spiky, VDI storage use is like a mountain range. A volcanic one. Things like powering on desktops, refreshing desktops, and even processing logins all consume very large numbers of IOPS… and then they complete their task and basically sit idle. I think that this is where cache solutions can have an immense impact on a VDI solution. By having a pool of very low latency cache available to the hypervisor, the environment can deal with those IOPS volcanoes, effectively smoothing the IOPS spikes out over time rather than requiring that an immense volume of storage device IOPS be available and sit unused except during those large spike events.
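To make the smoothing idea concrete, here is a rough Python sketch that tracks how much written data is sitting in a cache, second by second, while slower backend storage drains it. Everything in it is an assumption for illustration: the function name, the 50,000 IOPS login storm, and the 25,000 IOPS backend are invented numbers, the cache is treated as unlimited in size, and only writes are modeled (read caching behaves differently).

```python
def simulate_write_cache(demand_per_second, backend_iops):
    """Return (peak_backlog, backlog_history) for a per-second write workload.

    Hypothetical model: every write lands in the cache immediately, and the
    backend destages at most backend_iops operations each second.
    """
    backlog = 0
    peak_backlog = 0
    history = []
    for demand in demand_per_second:
        backlog += demand                      # writes acknowledged by the cache
        backlog -= min(backlog, backend_iops)  # what the disks manage to destage
        peak_backlog = max(peak_backlog, backlog)
        history.append(backlog)
    return peak_backlog, history


# Made-up workload: a 3-second storm at 50,000 IOPS, then 15 seconds at 20,000.
demand = [50_000] * 3 + [20_000] * 15
peak, history = simulate_write_cache(demand, backend_iops=25_000)
print(f"Peak cached writes waiting to destage: {peak}")
print(f"Backlog at the end of the run: {history[-1]}")
```

In this made-up run the backend is sized at the workload’s overall average (25,000 IOPS) rather than its 50,000 IOPS peak. The cache has to hold up to 75,000 pending writes at the worst point, and it is only fully drained once the quieter period has gone on long enough for the disks to catch up.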

