Speeding Up your PowerCLI Scripts
So, I post a lot of scripts here, and I'm sure that you can see the progress that I've made as I've learnt more and more about PowerCLI, PowerShell and scripting in general. One of the things that I've recently been considering is how to make my scripts run faster. I've got some scripts that are designed to make lots of changes, like changing the Port Group assignment of every VM in an environment. And they take a long time to run. As in, depending on the size of the environment, several hours. But, they don't necessarily have to take so long to run... I just wasn't clever enough when I wrote them.
There are two major techniques that I'm trying to learn that would seriously speed up those scripts. The first one is parallel execution of ForEach loops... I'm still learning about that one, so will write more about it once I've learnt something worth sharing. The other, though, is much easier, and can generally be worked into any script with just a little bit of planning. It's reducing the number of get-* statements that you use, which can often be accomplished by storing and re-using results from earlier gets. An individual get statement doesn't take a ton of time, maybe a few seconds if you're returning a lot of objects, but if you're doing those over and over and over, it quickly adds up. Let's look at an example.
If I'm working on a script that needs to move all of my VMs from one set of Port Groups to another, there are a few ways to go about it. At a high level, I need to create a translation table that relates the current Port Group assignments to the newly created Port Groups, then go through each NIC on each VM and move them. How you implement that process can have a huge impact on the speed with which it executes. Previously, I would have attacked it like this (warning, pseudo-code ahead):
# Section 1: build the translation table (old Port Group name -> new Port Group name)
$TranslationTable = @()
foreach ($thisPG in (get-vdswitch Example-VDSwitch | get-vdportgroup)){
    $pgObj = "" | select oldPG,newPG
    $pgObj.oldPG = $thisPG.name
    $pgObj.newPG = "$($thisPG.name)-new"
    $TranslationTable += $pgObj
}

# Section 2: move every NIC on every VM to its new Port Group
Foreach ($thisVM in (get-vm)){
    foreach ($thisNIC in ($thisVM | get-networkadapter)){
        # Slow: both of these gets query vCenter again for every single NIC
        $portGroup = get-vdswitch Example-New-VDSwitch | get-vdportgroup -name ($TranslationTable | ? {$_.oldPG -eq $thisNIC.NetworkName}).newPG
        $thisNIC | set-networkAdapter -portgroup $portGroup
    }
}
This (barring any bugs introduced by me just typing this up from memory...) works in two sections. The first section builds the translation table, which has two columns: oldPG and newPG. Those are both strings which specify the names of the old Port Group and the new Port Group.
The second section goes through each NIC on each VM, then finds the new Port Group name based on the current NIC configuration. Once it has the new Port Group name, it gets that Port Group and sets the NIC to use it. This will all work, but look at how many get-* cmdlets it involves! Every one of those needs to query vCenter for the appropriate object, which means waiting for vCenter's response. Worse, a pair of those get-* statements (get-vdswitch and get-vdportgroup) are inside that nested ForEach, meaning that it will have to perform those lookups once for every NIC on every VM in the environment. Slow, slow, slow!
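To put a number on "once for every NIC on every VM", here's a minimal sketch that just counts queries. It doesn't talk to vCenter at all: Get-FakePortGroup is a made-up stand-in for a real get-* cmdlet, and the inventory of 3 VMs with 2 NICs each is invented for illustration.

```powershell
# Stand-in for a vCenter query. Get-FakePortGroup is hypothetical, not a PowerCLI cmdlet.
$script:queryCount = 0
function Get-FakePortGroup {
    param([string]$Name)
    $script:queryCount++          # each call represents one round trip to vCenter
    [pscustomobject]@{ Name = $Name }
}

# Made-up inventory: 3 VMs with 2 NICs each, all on the same old Port Group.
$vms = 1..3 | ForEach-Object {
    [pscustomobject]@{
        Name = "vm$_"
        Nics = @('PG-A', 'PG-A')
    }
}

# Lookup inside the nested loop: one query per NIC per VM.
$script:queryCount = 0
foreach ($vm in $vms) {
    foreach ($nic in $vm.Nics) {
        $null = Get-FakePortGroup -Name "$nic-new"
    }
}
$insideLoop = $script:queryCount

# Lookup done once before the loop, with the result re-used.
$script:queryCount = 0
$cached = Get-FakePortGroup -Name 'PG-A-new'
foreach ($vm in $vms) {
    foreach ($nic in $vm.Nics) {
        $null = $cached
    }
}
$hoisted = $script:queryCount

"Queries inside the loop: $insideLoop; hoisted: $hoisted"
# Output: Queries inside the loop: 6; hoisted: 1
```

With 3 VMs that's only 6 queries versus 1, but the first number grows with every VM and NIC in the environment, and the second one doesn't.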
So, how do we fix that? Well, a little bit of planning can go a long way in this regard. It begins with the translation table. Instead of just storing a bunch of strings, how about we store the actual destination Port Group objects? That's easily accomplished, like this:
$TranslationTable = @()
$newVDSwitchPGs = get-vdswitch Example-New-VDSwitch | get-vdPortGroup
foreach ($thisPG in (get-vdswitch Example-VDSwitch | get-vdportgroup)){
    $pgObj = "" | select oldPG,newPG
    $pgObj.oldPG = $thisPG.name
    $pgObj.newPG = $newVDSwitchPGs | ? {$_.name -eq "$($thisPG.name)-new"}
    $TranslationTable += $pgObj
}
Now, I've performed a total of two get-VDSwitch cmdlet executions, one to get the source VDSwitch and one to get the new VDSwitch. I've performed one get-vdPortGroup cmdlet on each of those VDSwitches, storing all of the new Port Groups in an array. When I go to create my table, I just pull the appropriate object from the array based on its name, rather than doing a whole new lookup per Port Group. Also, I'm storing the actual object, not just its name, which will be very useful in the next section.
In the second section, I get to cut out all of the get-* cmdlets that are not directly related to getting the VMs and their NICs, because now I'm working with the actual objects that I collected earlier. Instead of needing to grab those objects over and over again (once for each adapter on each VM), I've grabbed them once and am using those existing objects for each adapter on each VM:
Foreach ($thisVM in (get-vm)){
    foreach ($thisNIC in ($thisVM | get-networkadapter)){
        $portGroup = ($TranslationTable | ? {$_.oldPG -eq $thisNIC.NetworkName}).newPG
        $thisNIC | set-networkAdapter -portgroup $portGroup
    }
}
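One further tweak, which is not what the script above does: the ? filter still walks the whole translation table once per NIC. That happens in memory, so it's cheap next to a vCenter query, but a hashtable keyed on the old Port Group name turns it into a single key lookup. Here's a rough sketch of the idea. The Port Group names and the [pscustomobject] stand-ins are made up so that it runs without a vCenter connection; in a real script those objects would come from get-vdportgroup.

```powershell
# Stand-ins for the objects get-vdportgroup would return; the names are made up.
$oldPGs = 'Web', 'App', 'DB' | ForEach-Object { [pscustomobject]@{ Name = $_ } }
$newPGs = 'Web-new', 'App-new', 'DB-new' | ForEach-Object { [pscustomobject]@{ Name = $_ } }

# Index the new Port Groups by name, once.
$newByName = @{}
foreach ($pg in $newPGs) { $newByName[$pg.Name] = $pg }

# Translation table: old Port Group name -> new Port Group object.
$translation = @{}
foreach ($pg in $oldPGs) {
    $target = $newByName["$($pg.Name)-new"]
    if ($null -eq $target) {
        # Surface a missing destination here rather than failing later in the NIC loop.
        Write-Warning "No replacement found for Port Group '$($pg.Name)'"
        continue
    }
    $translation[$pg.Name] = $target
}

# The per-NIC lookup is now a single key access instead of a filter over the table.
$nicNetworkName = 'App'           # would be $thisNIC.NetworkName in the real loop
$portGroup = $translation[$nicNetworkName]
$portGroup.Name
# Output: App-new
```

PowerShell hashtables created with @{} compare string keys case-insensitively, which matches how -eq behaves in the original filter.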
By removing those redundant get-* commands from the earlier implementation, I can dramatically reduce the run time of the script. As a rule of thumb, when I've worked on a script recently, I've been trying to ask myself one question whenever I'm requesting data from vCenter: "have I already requested this object and, if so, do I really need to be requesting it again?"
It could result in something as simple as moving a get-vdswitch cmdlet to the outside of a foreach loop, or it could be more complicated, like storing actual objects and then using those stored references. Either way, it can dramatically improve the execution time of the script while reducing the load on the vCenter server itself, and so is a good habit to get into wherever possible.
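If you want to check whether moving a get out of a loop is actually paying off, Measure-Command will time a script block for you. The sketch below is only an illustration: Get-SlowThing is a hypothetical stand-in for a get-* cmdlet, and the 20 ms delay is an invented figure rather than a measurement of vCenter.

```powershell
# Hypothetical stand-in for a slow get-* call; the 20 ms delay is made up.
function Get-SlowThing {
    Start-Sleep -Milliseconds 20
    'Example-New-VDSwitch'
}

$items = 1..10

# The call is repeated on every iteration of the loop.
$repeated = Measure-Command {
    foreach ($i in $items) {
        $vds = Get-SlowThing
    }
}

# The call is made once, before the loop, and the result is re-used.
$hoisted = Measure-Command {
    $vds = Get-SlowThing
    foreach ($i in $items) {
        $null = $vds
    }
}

'Repeated: {0:N0} ms, hoisted: {1:N0} ms' -f $repeated.TotalMilliseconds, $hoisted.TotalMilliseconds
```

Wrapping the old and new versions of a real loop the same way (against a test environment!) gives you actual numbers instead of a gut feeling.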