Using Parallel Operations in PowerShell to Write a Port Scanner

Recently, I've written several scripts that need to perform relatively simple operations on a large set of objects (such as moving a bunch of VMs onto a given Port Group or reconfiguring NTP for a bunch of ESXi hosts).  In general, I approach these challenges by generating a list of all of the objects that I want to manipulate, and then I ForEach my way through that list until I've finished all of my work.

This approach obviously works just fine; it's the way that we've written scripts for ages.  Just as you might expect from something that's been done the same way for a long time (particularly something IT related...), it's not really the best way to do it any more.  With PowerShell version 3, Microsoft introduced the concept of parallel operations.  Starting with PowerCLI 6, VMware changed PowerCLI to make it much easier to use with PowerShell parallel operations.

So, what is a parallel operation?  Well, a simple (and very practical!) example is that ForEach loop.  If I need to manipulate a bunch of VMs from a list, I can ForEach my way through that list and perform my manipulation on each VM, sequentially.  Of course, there's no inherent need for a particular sequence (at that level), but that's just the way that a ForEach works.  It does everything inside the loop on the first object in the list, then does it again on the second object, until it's done it for everything.
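For reference, a sequential loop of that kind might look like the sketch below.  The VM names are made up, and the string that gets built is only a placeholder for whatever change you'd really make to each VM.

```powershell
# Hypothetical list of VM names; in a real script this would come from your inventory.
$vmNames = 'VM1', 'VM2', 'VM3'

# A plain ForEach handles one item at a time, in list order.
$processed = foreach ($vmName in $vmNames) {
    # Placeholder for the real work (moving a VM, changing a setting, ...).
    "Reconfigured $vmName"
}

$processed
```

Nothing in the loop body for VM2 needs anything from VM1, which is exactly the kind of loop that is a candidate for running in parallel.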

Well, PowerShell v3 gives us a new switch on the ForEach loop: -Parallel.  A ForEach -Parallel instructs PowerShell to execute all iterations of that ForEach loop simultaneously (as limited by resources).  So, instead of waiting for VM1 to be reconfigured and then moving on to VM2, a ForEach -Parallel will reconfigure VM1 and VM2 at the same time.

Obviously, this can save an incredible amount of time, as many of those operations do not depend on the completion of previous iterations of those same commands on different objects.  Since this is such a useful technique, I decided that I'd go ahead and just rewrite all of my scripts to use this methodology!  Easy, right?

Yeah, right.  In order to leverage this parallelism, your scripts need to be written in a very specific way.  Firstly, you can't just use a ForEach -Parallel loop in a normal script; it has to be in a Workflow.  I'd never heard of a Workflow before learning about this, but it's basically a specialized Function that has the limitations required to enable parallelism (as well as some other cool features).
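Here is a minimal sketch of what that looks like.  The workflow name (Test-ParallelDemo) and its -names parameter are invented for illustration, and this assumes Windows PowerShell 3.0 through 5.1, since the workflow keyword isn't available in PowerShell 6 and later.

```powershell
# Sketch only: requires Windows PowerShell 3.0-5.1, where the workflow keyword exists.
# Test-ParallelDemo and -names are hypothetical names used for this example.
workflow Test-ParallelDemo {
    param([string[]]$names)

    foreach -parallel ($name in $names) {
        # Iterations may run at the same time as one another,
        # so the order of the output is not guaranteed.
        Write-Output -InputObject "Processed $name"
    }
}

$demoOutput = Test-ParallelDemo -names 'VM1', 'VM2', 'VM3'
$demoOutput
```

You call a Workflow just like a Function, but the same three lines can come back in a different order from one run to the next.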

So, to use parallelism, you need to define a Workflow.  Microsoft has a really good article about Workflows and Parallelism that I highly recommend reading.  It has a great description of how it all works together as well as some easy-to-use examples.

As I learnt more about Workflows and parallelism, I realized that I needed a nice, simple script to mess around with.  So, to that end, I decided to write a PowerShell-based Port Scanner.  I figured that this would be an awesome way to demonstrate parallel operations, as testing 1000+ ports on each of 20 IP addresses sequentially is a terrible situation to imagine.  Performing that giant scan in parallel, on the other hand, is actually feasible (although still not as fast as I'd like...).

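workflow port-scan {
  param([int[]]$ports, [string]$subnet, [int[]]$hosts)
  # Trim stray dots so "192.168.1." becomes "192.168.1"; method calls go in an InlineScript inside a workflow
  $subnet = InlineScript { ($using:subnet).Trim(".") }
  foreach -parallel ($thisHost in $hosts) {
    $remoteHost = "$subnet." + "$thisHost"
    foreach -parallel ($thisPort in $ports) {
      Test-NetConnection -ComputerName $remoteHost -Port $thisPort
    }
  }
}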

The port-scan workflow takes 3 parameters: -ports, -subnet, and -hosts.  -Ports must be an array of port numbers.  -Subnet must be the first three octets of the class C subnet that the host(s) are on (192.168.1, for example).  -Hosts must be an array of the final octets of the IP addresses of the hosts that you wish to scan.
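To make the way those three parameters combine a bit more concrete, here is a small dry run in plain (non-Workflow) PowerShell.  It only builds the list of host/port pairs that a scan would touch and never opens a connection; the sample values and the $targets variable are mine, purely for illustration.

```powershell
# Example values matching the three parameters described above.
$subnet = '192.168.1.'      # a trailing dot is tolerated because it gets trimmed
$hosts  = 100..102
$ports  = 902, 903

$subnet = $subnet.Trim('.')

# Build every host/port pair sequentially, without touching the network.
$targets = foreach ($thisHost in $hosts) {
    foreach ($thisPort in $ports) {
        [pscustomobject]@{ RemoteHost = "$subnet.$thisHost"; Port = $thisPort }
    }
}

$targets
```

Three hosts and two ports give six pairs; the number of connection tests is always the host count multiplied by the port count, which is why a sequential scan gets slow so quickly.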

So, if you want to test connectivity on ports 900-999 (maybe you can't remember exactly which port an ESXi host uses but need to test connectivity...) on a handful of ESXi hosts (here, 192.168.1.100 through 192.168.1.110), you could do it like this:

$results = port-scan -ports (900..999) -subnet 192.168.1 -hosts (100..110)

At that point the Workflow fires off.  After a bit of string manipulation to ensure that the subnet is in the expected format, it launches into the parallel loops.  First, it generates a cloud of instances for all of the specified hosts (in this case, 100 through 110).  Within each of those instances, it generates a child cloud of instances, each running the test-netconnection cmdlet on a single port for that host.  When this is executing, you'll notice the yellow host output from the test-netconnection cmdlets coming back in a seemingly random order; that's the nature of parallelism.  That's also why it's handy to store the output in a variable ($results in this case), as you'll probably need to do some sorting to make the output easier for humans to consume.
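As a rough sketch of that sorting step, the snippet below uses a few hand-made objects in place of real scan output so that it runs without a network.  The property names (ComputerName, RemotePort, TcpTestSucceeded) are the ones Test-NetConnection results carry; the values and the $sorted and $openPorts variables are invented for the example.

```powershell
# Stand-in for $results: hand-made objects in scrambled order, using the
# property names that Test-NetConnection results carry.
$results = @(
    [pscustomobject]@{ ComputerName = '192.168.1.101'; RemotePort = 902; TcpTestSucceeded = $true }
    [pscustomobject]@{ ComputerName = '192.168.1.100'; RemotePort = 903; TcpTestSucceeded = $false }
    [pscustomobject]@{ ComputerName = '192.168.1.100'; RemotePort = 902; TcpTestSucceeded = $true }
)

# Sort by address (cast to [version] so the octets compare numerically), then by port.
$sorted = $results | Sort-Object -Property @{ Expression = { [version]$_.ComputerName } }, RemotePort

# Keep only the ports that answered.
$openPorts = $sorted | Where-Object { $_.TcpTestSucceeded } | Select-Object ComputerName, RemotePort
$openPorts
```

The [version] cast is just a convenient trick for comparing dotted addresses numerically, so that .99 sorts ahead of .100; sorting on the plain string would put them the other way around.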

So, how do you do this with PowerCLI?  Well, that's the next step on my todo list!  But, to that end, LucD has an excellent article about exactly how to use parallelism with PowerCLI that I will certainly be reading as I learn this!

