Moving Clustered VMs with Shared Physical Mode RDMs

This is probably one of those articles that's only going to apply to a tiny percentage of people within an already minuscule niche subset of a small population... but I'm proud of my work and so am going to post it here anyway.  One of my customers needs to move a bunch of VMs off of one SAN and onto another.  Storage vMotion for the win, end of story, right?  Yes*

* 99% of the problem is absolutely solved with Storage vMotion, but I'm not in the business of leaving 1% unfinished.  In this case, that 1% was a bunch of older SQL Servers, set up in two-node pairs using Microsoft Cluster Service (MSCS) with shared Physical Mode RDMs.  Yikes.

In theory, this process isn't too bad.  Just record the vital statistics about the RDMs, detach them from the VMs, move the VMs (during an outage window, obviously), recreate the RDMs using the recorded data, and power everything back up.  This process depends on the New-HardDisk cmdlet... but given that I'm writing about it here, you've probably figured out that it wasn't quite so simple.  It turns out that New-HardDisk has a very niche flaw: you can use it to create and attach a brand new physical mode RDM, and you can use it to attach a pre-existing hard disk, but you cannot use it to attach a pre-existing RDM pointer to a VM.
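For context, these are the two things that New-HardDisk will happily do; the VM name, datastore path, and LUN identifier below are placeholders rather than values from my environment:

    # Create and attach a brand new physical mode RDM, backed by a raw LUN
    New-HardDisk -VM (Get-VM 'sqlB') -DiskType RawPhysical `
        -DeviceName '/vmfs/devices/disks/naa.6000xxxxxxxxxxxxxxxx'

    # Attach a pre-existing (non-RDM) VMDK
    New-HardDisk -VM (Get-VM 'sqlB') -DiskPath '[NewSAN-DS01] sqlB/sqlB_1.vmdk'

    # Pointing -DiskPath at an existing RDM pointer VMDK is the combination
    # that falls over, which is what sent me down the path below.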

It just bombs out, and it looks like I'm not the first person to run into this issue.  Fortunately for me, LucD posted a great work-around that uses a VirtualMachineConfigSpec to reconfigure the VM and attach the required VMDK.  I took his work and, as so often happens while scripting, expanded on it to fit my particular use case.
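For anyone curious, the core of that work-around looks roughly like the following.  This is my own rough reconstruction rather than LucD's code verbatim, and the pointer path, controller key, and unit number are placeholders:

    $spec    = New-Object VMware.Vim.VirtualMachineConfigSpec
    $devSpec = New-Object VMware.Vim.VirtualDeviceConfigSpec
    $devSpec.Operation = 'add'

    $disk = New-Object VMware.Vim.VirtualDisk
    $disk.Key           = -1      # temporary key; vSphere assigns the real one on commit
    $disk.ControllerKey = 1001    # key of an existing (shared) SCSI controller on the VM
    $disk.UnitNumber    = 0       # a free slot on that controller

    $backing = New-Object VMware.Vim.VirtualDiskRawDiskMappingVer1BackingInfo
    $backing.FileName          = '[NewSAN-DS01] sqlA/sqlA_1.vmdk'   # the existing RDM pointer VMDK
    $backing.CompatibilityMode = 'physicalMode'
    $backing.DiskMode          = 'independent_persistent'
    $disk.Backing = $backing

    $devSpec.Device = $disk
    $spec.DeviceChange = @($devSpec)

    (Get-VM 'sqlB').ExtensionData.ReconfigVM($spec)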

We changed our procedure to the following:
1) Cold Storage Migrate the "A" side of the SQL servers onto the new datastores
2) Remove all RDMs from the "B" side SQL servers and cold migrate them onto the new datastores (a rough PowerCLI sketch of steps 1 and 2 follows this list)
3) Use a script to clone the RDM configuration from the "A" side server to the corresponding "B" side server.
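For what it's worth, steps 1 and 2 are plain PowerCLI.  The datastore and VM names here are placeholders, and this assumes the VMs are already powered off for the outage window:

    # Step 1: cold migrate the "A" side VM onto the new storage
    $newDatastore = Get-Datastore -Name 'NewSAN-DS01'
    Move-VM -VM (Get-VM 'sqlA') -Datastore $newDatastore

    # Step 2: detach the shared RDMs from the "B" side VM, then cold migrate it.
    # Without -DeletePermanently, Remove-HardDisk just detaches the mapping and
    # leaves the pointer VMDK (and the underlying LUN) alone.
    $sqlB = Get-VM 'sqlB'
    Get-HardDisk -VM $sqlB -DiskType RawPhysical | Remove-HardDisk -Confirm:$false
    Move-VM -VM $sqlB -Datastore $newDatastore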

And step 3 is what I'm writing about here.  I ended up writing several RDM functions, but at the end of the day, there were only two that I thought were really valuable: Get-RDMData and Copy-RDMConfig.

Get-RDMData takes an array of VMs as input and generates a big old table that lists (hopefully) all of the information that you could ever need to know about a given RDM.  I used it like Get-RDMData -vms (Get-VM) to report on all of the RDMs in the whole environment, in order to ensure that I wouldn't run up against any surprises (even after determining that my actual migration method wouldn't require this data as inputs).
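I'm not reproducing my exact function here, but a stripped-down sketch of the same idea, built entirely from stock PowerCLI cmdlets, looks something like this:

    function Get-RDMData {
        # Takes an array of VMs and emits one row per RDM found on them
        param([Parameter(Mandatory)] $vms)

        foreach ($vm in $vms) {
            foreach ($disk in (Get-HardDisk -VM $vm -DiskType RawPhysical, RawVirtual)) {
                $controller = Get-ScsiController -HardDisk $disk
                [PSCustomObject]@{
                    VM             = $vm.Name
                    Disk           = $disk.Name
                    DiskType       = $disk.DiskType
                    Filename       = $disk.Filename            # the RDM pointer VMDK
                    LUN            = $disk.ScsiCanonicalName   # the backing device (naa.*)
                    CapacityGB     = $disk.CapacityGB
                    Controller     = $controller.Name
                    BusSharingMode = $controller.BusSharingMode
                }
            }
        }
    }

    # Report on every RDM in the environment
    Get-RDMData -vms (Get-VM) | Format-Table -AutoSize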

Copy-RDMConfig is the script that I put together to do step 3.  It expects VM objects for its two parameters: -sourceVM and -destVM.  The script simply grabs all of the shared SCSI adapters on -sourceVM and then creates new SCSI adapters with those same VMDKs on -destVM.  I used it like Copy-RDMConfig -sourceVM (Get-VM sqlA) -destVM (Get-VM sqlB).

The copy script pretty much just uses LucD's VirtualMachineConfigSpec method; I've just wrapped some loops around it to create an arbitrary number of hard disks on an arbitrary number of SCSI controllers.  None of my systems had more than one shared SCSI controller, but I think this will work regardless of the number (as it commits each ConfigSpec per controller, which causes the .key values to be assigned valid values for the VM before moving on to the next controller).
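To give you an idea of the shape of it, here's a hedged sketch of that loop structure.  It's not the exact script I ran, and it assumes LSI Logic SAS controllers and that the only disks on the shared controllers are the RDMs themselves:

    function Copy-RDMConfig {
        param(
            [Parameter(Mandatory)] $sourceVM,
            [Parameter(Mandatory)] $destVM
        )

        # Every SCSI controller on the source that's set to physical bus sharing
        $sharedControllers = Get-ScsiController -VM $sourceVM |
            Where-Object { $_.BusSharingMode -eq 'Physical' }

        foreach ($srcController in $sharedControllers) {
            $spec = New-Object VMware.Vim.VirtualMachineConfigSpec

            # Add a matching shared controller on the destination.  -100 is a
            # temporary key that the disks below reference; vSphere swaps in the
            # real key when the spec is committed.
            $ctrlSpec = New-Object VMware.Vim.VirtualDeviceConfigSpec
            $ctrlSpec.Operation = 'add'
            $ctrl = New-Object VMware.Vim.VirtualLsiLogicSASController   # assumption: LSI Logic SAS
            $ctrl.Key       = -100
            $ctrl.BusNumber = $srcController.ExtensionData.BusNumber
            $ctrl.SharedBus = 'physicalSharing'
            $ctrlSpec.Device = $ctrl
            $spec.DeviceChange += $ctrlSpec

            # Add each of this controller's RDMs, pointing at the same pointer VMDKs
            $i = 0
            $rdms = Get-HardDisk -VM $sourceVM -DiskType RawPhysical |
                Where-Object { $_.ExtensionData.ControllerKey -eq $srcController.ExtensionData.Key }
            foreach ($disk in $rdms) {
                $diskSpec = New-Object VMware.Vim.VirtualDeviceConfigSpec
                $diskSpec.Operation = 'add'

                $vDisk = New-Object VMware.Vim.VirtualDisk
                $vDisk.Key           = -200 - $i                        # unique temporary key
                $vDisk.ControllerKey = -100                             # ties the disk to the new controller
                $vDisk.UnitNumber    = $disk.ExtensionData.UnitNumber   # keep the source's SCSI slot

                $backing = New-Object VMware.Vim.VirtualDiskRawDiskMappingVer1BackingInfo
                $backing.FileName          = $disk.Filename             # the existing RDM pointer VMDK
                $backing.CompatibilityMode = 'physicalMode'
                $backing.DiskMode          = 'independent_persistent'
                $vDisk.Backing = $backing

                $diskSpec.Device = $vDisk
                $spec.DeviceChange += $diskSpec
                $i++
            }

            # Commit this controller's spec before moving on to the next one
            $destVM.ExtensionData.ReconfigVM($spec)
        }
    }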

As always, this script is provided as is with no guarantees; just because it worked for me, that does not mean that it'll work for you.  Test thoroughly, and make sure that you fully understand what any script does before you execute it.
